Patents by Inventor Roger L. Allen

Roger L. Allen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9030480
    Abstract: One embodiment of the present invention sets forth a method for analyzing the performance of a graphics processing pipeline. A first workload and a second workload are combined together in a pipeline to generate a combined workload. The first workload is associated with a first instance and the second workload is associated with a second instance. A first and second initial event are generated for the combined workload, indicating that the first and second workloads have begun processing at a first position in the graphics processing pipeline. A first and second final event are generated, indicating that the first and second workloads have finished processing at a second position in the graphics processing pipeline.
    Type: Grant
    Filed: December 18, 2012
    Date of Patent: May 12, 2015
    Assignee: NVIDIA Corporation
    Inventors: Roger L. Allen, Ziyad S. Hakura, Thomas Melvin Ogletree
  • Patent number: 8928676
    Abstract: In a raster stage of a graphics processor, a method for parallel fine rasterization. The method includes receiving a graphics primitive for rasterization in a raster stage of a graphics processor. The graphics primitive is rasterized at a first level to generate a plurality of tiles of pixels. The tiles are subsequently rasterized at a second level by allocating the tiles to an array of parallel second-level rasterization units to generate covered pixels. The covered pixels are then output for rendering operations in a subsequent stage of the graphics processor.
    Type: Grant
    Filed: June 23, 2006
    Date of Patent: January 6, 2015
    Assignee: Nvidia Corporation
    Inventors: Walter R. Steiner, Franklin C. Crow, Craig M. Wittenbrink, Roger L. Allen, Douglas A. Voorhies
  • Publication number: 20140168231
    Abstract: One embodiment of the present invention sets forth a method for analyzing the performance of a graphics processing pipeline. A first workload and a second workload are combined together in a pipeline to generate a combined workload. The first workload is associated with a first instance and the second workload is associated with a second instance. A first and second initial event are generated for the combined workload, indicating that the first and second workloads have begun processing at a first position in the graphics processing pipeline. A first and second final event are generated, indicating that the first and second workloads have finished processing at a second position in the graphics processing pipeline.
    Type: Application
    Filed: December 18, 2012
    Publication date: June 19, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: Roger L. ALLEN, Ziyad S. HAKURA, Thomas Melvin OGLETREE
  • Patent number: 8379033
    Abstract: A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method for managing a plurality of independently processed texture streams in a parallel rendering system that includes the steps of maintaining a time stamp for a group of tiles of work that are associated with each of the plurality of the texture streams and are associated with a specified area in screen space, and utilizing the time stamps to counter divergences in the independent processing of the plurality of texture streams.
    Type: Grant
    Filed: February 17, 2012
    Date of Patent: February 19, 2013
    Assignee: NVIDIA Corporation
    Inventors: Steven E. Molnar, Cass W. Everitt, Roger L. Allen, Gary M. Tarolli, John M. Danskin
  • Patent number: 8253748
    Abstract: One embodiment of a system for collecting performance data for a multithreaded processing unit includes a plurality of independent performance registers, each configured to count hardware-based and/or software-based events. Functional blocks within the multithreaded processing unit are configured to generate various event signals, and subsets of the events are selected and used to generate one or more functions, each of which increments one of the performance registers. By accessing the contents of the performance registers, a user may observe and characterize the behavior of the different functional blocks within the multithreaded processing unit when one or more threads are executed within the processing unit. The contents of the performance registers may also be used to modify the behavior of the program running on the multithreaded processing unit, to modify a global performance register or to trigger an interrupt.
    Type: Grant
    Filed: November 29, 2005
    Date of Patent: August 28, 2012
    Assignee: NVIDIA Corporation
    Inventors: Roger L. Allen, Brett W. Coon
  • Patent number: 8212824
    Abstract: A graphics processing unit includes a first processing controller controlling a first set of multi-threaded processors. A second processing controller controls a second set of multi-threaded processors. A serial bus connects the first processing controller to the second processing controller. The first processing controller gathers first state information from the first set of multi-threaded processors in response to a context switch token and then passes the context switch token over the serial bus to the second processing controller. The second processing controller gathers second state information from the second set of multi-threaded processors in response to the context switch token.
    Type: Grant
    Filed: December 19, 2005
    Date of Patent: July 3, 2012
    Assignee: Nvidia Corporation
    Inventors: Roger L. Allen, Nitij Mangal
  • Publication number: 20120147027
    Abstract: A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method for managing a plurality of independently processed texture streams in a parallel rendering system that includes the steps of maintaining a time stamp for a group of tiles of work that are associated with each of the plurality of the texture streams and are associated with a specified area in screen space, and utilizing the time stamps to counter divergences in the independent processing of the plurality of texture streams.
    Type: Application
    Filed: February 17, 2012
    Publication date: June 14, 2012
    Inventors: Steven E. Molnar, Cass W. Everitt, Roger L. Allen, Gary M. Tarolli, John M. Danskin
  • Patent number: 8139069
    Abstract: A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method for managing a plurality of independently processed texture streams in a parallel rendering system that includes the steps of maintaining a time stamp for a group of tiles of work that are associated with each of the plurality of the texture streams and are associated with a specified area in screen space, and utilizing the time stamps to counter divergences in the independent processing of the plurality of texture streams.
    Type: Grant
    Filed: November 3, 2006
    Date of Patent: March 20, 2012
    Assignee: NVIDIA Corporation
    Inventors: Steven E. Molnar, Cass W. Everitt, Roger L. Allen, Gary M. Tarolli, John M. Danskin
  • Patent number: 8134570
    Abstract: A system, method and computer program product are provided for packing graphics attributes. In use, a plurality of graphics attributes is identified. Such graphics attributes are packed, such that the packed graphics attributes are capable of being processed utilizing a pixel shader.
    Type: Grant
    Filed: September 18, 2006
    Date of Patent: March 13, 2012
    Assignee: NVIDIA Corporation
    Inventors: Jerome F. Duluk, Jr., Andrew J. Tao, Roger L. Allen, Svetoslav D. Tzvetkov, Yan Yan Tang, Elena M. Ing
  • Patent number: 8094158
    Abstract: Systems and methods for using multiple versions of programmable constants within a multi-threaded processor allow a programmable constant to be changed before a program using the constants has completed execution. Processing performance may be improved since programs using different values for a programmable constant may execute simultaneously. The programmable constants are stored in a constant buffer and an entry of a constant buffer table is bound to the constant buffer. When a programmable constant is changed it is copied to an entry in a page pool and address translation for the page pool is updated to correspond to the old version (copy) of the programmable constant. An advantage is that the constant buffer stores the newest version of the programmable constant.
    Type: Grant
    Filed: January 31, 2006
    Date of Patent: January 10, 2012
    Assignee: NVIDIA Corporation
    Inventors: Roger L. Allen, Cass W. Everitt, Henry P. Moreton, Thomas H. Kong
  • Patent number: 8085272
    Abstract: A method and system for improving data coherency in a parallel rendering system is disclosed. Specifically, one embodiment of the present invention sets forth a method, which includes the steps of receiving a common input stream, tracking a periodic event associated with the common input stream, generating a plurality of fragment streams from the common input stream, inserting a marker based on an occurrence of the periodic event in a first fragment stream in the multiple fragment streams, and utilizing the marker to influence the processing of the first fragment stream so that a plurality of raster operation (ROP) request streams maintains substantially the same coherence as the common input stream. Each fragment stream is independently processed and corresponds to one of the ROP request streams.
    Type: Grant
    Filed: November 3, 2006
    Date of Patent: December 27, 2011
    Assignee: NVIDIA Corporation
    Inventors: Steven E. Molnar, Cass W. Everitt, Roger L. Allen, Gary M. Tarolli, John M. Danskin, Adam Clark Weitkemper, Mark J. French
  • Patent number: 7911471
    Abstract: A method and apparatus for executing loop and branch program instructions in a programmable graphics shader. The programmable graphics shader converts a sequence of instructions comprising a portion of a shader program and selects a first set of fragments to be processed. Subsequent sequences of instructions are converted until all of the instructions comprising the shader program have been executed on the first set of fragments. Each remaining set of fragments is processed by the shader program until all of the fragments are processed in the same manner. Furthermore, the instructions can contain one or more loop or branch program instructions that are conditionally executed. Additionally, when instructions within a loop as defined by a loop instruction are being executed a current loop count is pipelined through the programmable graphics shader and used as an index to access graphics memory.
    Type: Grant
    Filed: October 8, 2004
    Date of Patent: March 22, 2011
    Assignee: NVIDIA Corporation
    Inventors: Roger L. Allen, Harold Robert Feldman Zatz
  • Patent number: 7877565
    Abstract: Systems and methods for using multiple versions of programmable constants within a multi-threaded processor allow a programmable constant to be changed before a program using the constants has completed execution. Processing performance may be improved since programs using different values for a programmable constant may execute simultaneously. The programmable constants are stored in a constant buffer and an entry of a constant buffer table is bound to the constant buffer. When a programmable constant is changed it is copied to an entry in a page pool and address translation for the page pool is updated to correspond to the old version (copy) of the programmable constant. An advantage is that the constant buffer stores the newest version of the programmable constant.
    Type: Grant
    Filed: January 31, 2006
    Date of Patent: January 25, 2011
    Assignee: NVIDIA Corporation
    Inventors: Roger L. Allen, Cass W. Everitt, Henry Packard Moreton, Thomas H. Kong, Simon S. Moy
  • Patent number: 7852340
    Abstract: A scalable shader architecture is disclosed. In accord with that architecture, a shader includes multiple shader pipelines, each of which can perform processing operations on rasterized pixel data. Shader pipelines can be functionally removed as required, thus preventing a defective shader pipeline from causing a chip rejection. The shader includes a shader distributor that processes rasterized pixel data and then selectively distributes the processed rasterized pixel data to the various shader pipelines, beneficially in a manner that balances workloads. A shader collector formats the outputs of the various shader pipelines into proper order to form shaded pixel data. A shader instruction processor (scheduler) programs the individual shader pipelines to perform their intended tasks.
    Type: Grant
    Filed: December 14, 2007
    Date of Patent: December 14, 2010
    Assignee: NVIDIA Corporation
    Inventors: Rui M. Bastos, Karim M. Abdalla, Christian Rouet, Michael J.M. Toksvig, Johnny S. Rhoades, Roger L. Allen, John Douglas Tynefield, Jr., Emmett M. Kilgariff, Gary M. Tarolli, Brian Cabral, Craig Michael Wittenbrink, Sean J. Treichler
  • Patent number: 7809928
    Abstract: One embodiment of an instruction decoder includes an instruction parser configured to process a first non-operative instruction and to generate a first event signal corresponding to the first non-operative instruction, and a first event multiplexer configured to receive the first event signal from the instruction parser, to select the first event signal from one or more event signals and to transmit the first event signal to an event logic block. The instruction decoder may be implemented in a multithreaded processing unit, such as a shader unit, and the occurrences of the first event signal may be tracked when one or more threads are executed within the processing unit. The resulting event signal count may provide a designer with a better understanding of the behavior of a program, such as a shader program, executed within the processing unit, thereby facilitating overall processing unit and program design.
    Type: Grant
    Filed: December 20, 2005
    Date of Patent: October 5, 2010
    Assignee: NVIDIA Corporation
    Inventors: Roger L. Allen, Brett W. Coon, Ian A. Buck, John R. Nickolls
  • Patent number: 7711990
    Abstract: A system includes a graphics processing unit with a processor responsive to a debug instruction that initiates the storage of execution state information. A memory stores the execution state information. A central processing unit executes a debugging program to analyze the execution state information.
    Type: Grant
    Filed: December 13, 2005
    Date of Patent: May 4, 2010
    Assignee: Nvidia Corporation
    Inventors: John R. Nickolls, Roger L. Allen, Brian K. Cabral, Brett W. Coon, Robert C. Keller
  • Patent number: 7663621
    Abstract: Circuits, methods, and apparatus that perform cylindrical wrapping in software without the need for a dedicated hardware circuit. One example performs cylindrical wrapping in software running on shader hardware. In one specific example, the shader hardware is a unified shader that alternately processes geometry, vertex, and fragment information. This unified shader is formed using a number of single-instruction, multiple-data units. Another example provides a method of performing a cylindrical wrap that ensures that a correct texture portion is used for a triangle that is divided by a “seam” of the wrap. To achieve this, primitive vertices are sorted such that results are vertex order invariant. One vertex is selected as a reference. For the other vertices, a difference is found for each coordinate and a corresponding coordinate of the reference vertex. If the coordinates are near, no change is made. If the coordinates are distant, the coordinate is adjusted.
    Type: Grant
    Filed: November 3, 2006
    Date of Patent: February 16, 2010
    Assignee: NVIDIA Corporation
    Inventors: Roger L. Allen, Harold Robert Feldman Zatz, Robert Ohannessian, Jr.
  • Patent number: 7600155
    Abstract: A system has a graphics processing unit with a processor to monitor selected criteria and circuitry to initiate the storage of execution state information when the selected criteria reaches a specified state. A memory stores execution state information. A central processing unit executes a debugging program to analyze the execution state information.
    Type: Grant
    Filed: December 13, 2005
    Date of Patent: October 6, 2009
    Assignee: NVIDIA Corporation
    Inventors: John R. Nickolls, Roger L. Allen, Brian K. Cabral, Brett W. Coon, Robert C. Keller
  • Patent number: 7542042
    Abstract: A new method of operating a fragment shader to produce complex video content comprised of a video image or images, such as from a DVD player, that overlays a fragment shader-processed background. Pixels are fragment shader-processed during one loop or set of loops through texture processing stations to produce a fragment shader-processed background. Then, at least some of those pixels are merged with the video or images to produce complex video content. The resulting complex image is then made available for further processing.
    Type: Grant
    Filed: November 10, 2004
    Date of Patent: June 2, 2009
    Assignee: NVIDIA Corporation
    Inventors: Roger L. Allen, Rui M. Bastos, Karim M. Abdalla, Justin S. Legakis
  • Patent number: 7439979
    Abstract: A shader having a cache memory for storing program instructions is described. The cache memory beneficially stores both current programming instructions for a fragment program being run and “look-ahead” programming instructions. The cache memory supports a scheduler that forms program commands that control programmable processing stations. The cache memory can store multiple programming instructions for a plurality of shaders. If the cache memory does not include the desired programming instructions, a miss is asserted and a scheduler (instruction processor) recovers the programming instructions to be run. Beneficially, the scheduler recovers additional programming instructions to support the look-ahead programming. The cache memory stores program instructions by cachelines, where each cacheline comprises a plurality of programming instructions. The cache memory can also store program identifiers.
    Type: Grant
    Filed: November 10, 2004
    Date of Patent: October 21, 2008
    Assignee: NVIDIA Corporation
    Inventors: Roger L. Allen, Chad D. Walker, Rui M. Bastos
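
The abstract of patent 7663621 in the listing above describes a seam-handling step for cylindrical wrapping: sort the vertices, pick one as a reference, compare each other vertex's coordinate with the reference, and adjust only the coordinates that are distant. A minimal sketch of that idea for a single wrapped texture coordinate might look like the following. The function name, the choice of the smallest coordinate as the reference, the wrap period of 1.0, and the half-period threshold for "distant" are all assumptions made for illustration; the abstract does not specify them, and this is not the patented implementation.

```python
def unwrap_cylindrical(coords, period=1.0):
    """Adjust one wrapped texture coordinate per vertex of a primitive.

    Hypothetical illustration: the reference choice and the threshold
    (half the wrap period) are assumptions, not taken from the patent.
    """
    if not coords:
        return []
    # Visit vertices in sorted order so the result does not depend on
    # the order in which the vertices were submitted.
    order = sorted(range(len(coords)), key=lambda i: coords[i])
    ref = coords[order[0]]  # assumed reference: the smallest coordinate
    half = period / 2.0
    out = list(coords)
    for i in order[1:]:
        diff = coords[i] - ref  # never negative, since ref is the minimum
        if diff > half:
            # "Distant": shift by one period so the primitive takes the
            # short way across the seam.
            out[i] = coords[i] - period
        # "Near": leave the coordinate unchanged.
    return out


if __name__ == "__main__":
    # A triangle straddling the seam: 0.9 and 0.95 lie just before the
    # wrap point, 0.1 just after it.
    print(unwrap_cylindrical([0.9, 0.1, 0.95]))
```

With these assumptions, the two coordinates near 0.9 are shifted down by one period, so the triangle spans a narrow range around the seam instead of stretching across almost the whole texture.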