Patents by Inventor Stephen D. Lew

Stephen D. Lew has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7965895
    Abstract: Methods, circuits, and apparatus for reducing memory bandwidth used by a graphics processor. Uncompressed tiles are read from a display buffer portion of a graphics memory and received by an encoder. The uncompressed tiles are compressed and written back to the graphics memory. When a tile is needed again before it has been modified, the compressed version is read from memory, uncompressed, and displayed. To reduce the number of unnecessary writes of compressed tiles to memory, a tile is only written to memory if it has remained static for some number of refresh cycles. Also, to prevent a large number of compressed tiles being written to the display buffer in one refresh cycle, the encoder can be throttled after a number of tiles have been written. Validity information can be stored for use by a CRTC. If a tile is updated, the validity information is updated such that invalid compressed data is not read from memory and displayed.
    Type: Grant
    Filed: August 10, 2007
    Date of Patent: June 21, 2011
    Assignee: NVIDIA Corporation
    Inventors: John M. Danskin, Ziyad S. Hakura, Edward L. Riegelsberger, Jason M. Musicer, Stephen D. Lew
  • Patent number: 7937567
    Abstract: Parallelism in a parallel processing subsystem is exploited in a scalable manner. A problem to be solved can be hierarchically decomposed into at least two levels of sub-problems. Individual threads of program execution are defined to solve the lowest-level sub-problems. The threads are grouped into one or more thread arrays, each of which solves a higher-level sub-problem. The thread arrays are executable by processing cores, each of which can execute at least one thread array at a time. Thread arrays can be grouped into grids of independent thread arrays, which solve still higher-level sub-problems or an entire problem. Thread arrays within a grid, or entire grids, can be distributed across all of the available processing cores as available in a particular system implementation.
    Type: Grant
    Filed: November 1, 2006
    Date of Patent: May 3, 2011
    Assignee: Nvidia Corporation
    Inventors: John R. Nickolls, Stephen D. Lew
  • Publication number: 20110087860
    Abstract: Parallel data processing systems and methods use cooperative thread arrays (CTAs), i.e., groups of multiple threads that concurrently execute the same program on an input data set to produce an output data set. Each thread in a CTA has a unique identifier (thread ID) that can be assigned at thread launch time. The thread ID controls various aspects of the thread's processing behavior such as the portion of the input data set to be processed by each thread, the portion of an output data set to be produced by each thread, and/or sharing of intermediate results among threads. Mechanisms for loading and launching CTAs in a representative processing core and for synchronizing threads within a CTA are also described.
    Type: Application
    Filed: December 17, 2010
    Publication date: April 14, 2011
    Applicant: NVIDIA Corporation
    Inventors: John R. Nickolls, Stephen D. Lew
  • Patent number: 7916153
    Abstract: Embodiments of the present invention generally provide m Methods and apparatus for reducing power consumption of backlit displays are described. Power consumption is reduced by dimming backlighting by a first scale factor and boosting pixel values by a second scale factor to compensate for the dimming. The scale factors may be constant values. Alternately, one or both of the scale factors may be determined based on pixel values for one or more frames to be displayed and/or one or more frames that have been displayed. For example, scale factors may be calculated based on an average linear amplitude of one or more frames of pixel values or from a maximum pixel value of one or more frames of pixel values. A graphical processing system is described including an integrated circuit capable of transforming a pixel value from a gamma-compensated space to a linear space.
    Type: Grant
    Filed: December 12, 2007
    Date of Patent: March 29, 2011
    Assignee: NVIDIA Corporation
    Inventors: Stephen D. Lew, Michael A. Ogrinc
  • Patent number: 7898545
    Abstract: An integrated circuit includes at least two different types of processors. At least one operation is supported by both types of processors, which permits a commonly supported operation to be scheduled on either processor.
    Type: Grant
    Filed: December 14, 2004
    Date of Patent: March 1, 2011
    Assignee: Nvidia Corporation
    Inventors: Jonah M. Alben, Stephen D. Lew, Paolo E. Sabella
  • Patent number: 7876378
    Abstract: Video filtering using a programmable graphics processor is described. The programmable graphics processor may be programmed to complete a plurality of video filtering operations in a single pass through a fragment-processing pipeline within the programmable graphics processor. Video filtering functions such as deinterlacing, chroma up-sampling, scaling, and deblocking may be performed by the fragment-processing pipeline. The fragment-processing pipeline may be programmed to perform motion adaptive deinterlacing, wherein a spatially variant filter determines, on a pixel basis, whether a “bob”, a “blend”, or a “weave” operation should be used to process an interlaced image.
    Type: Grant
    Filed: December 14, 2007
    Date of Patent: January 25, 2011
    Assignee: NVIDIA Corporation
    Inventors: Stephen D. Lew, Garry W. Amann, Hassane S. Azar
  • Patent number: 7861060
    Abstract: Parallel data processing systems and methods use cooperative thread arrays (CTAs), i.e., groups of multiple threads that concurrently execute the same program on an input data set to produce an output data set. Each thread in a CTA has a unique identifier (thread ID) that can be assigned at thread launch time. The thread ID controls various aspects of the thread's processing behavior such as the portion of the input data set to be processed by each thread, the portion of an output data set to be produced by each thread, and/or sharing of intermediate results among threads. Mechanisms for loading and launching CTAs in a representative processing core and for synchronizing threads within a CTA are also described.
    Type: Grant
    Filed: December 15, 2005
    Date of Patent: December 28, 2010
    Assignee: NVIDIA Corporation
    Inventors: John R. Nickolls, Stephen D. Lew
  • Patent number: 7852412
    Abstract: Circuits, methods, and apparatus for measuring a video signal's noise level. The determination can be made based on pixel values for a single video image frame, for example, by comparing pixel color values, luminance, or other parameter for a first and second group of pixels in the frame. Each group of pixels may be part of a line in the frame, and several such measurements may be made along each line of the frame. These measurements can then be further refined depending on the measure noise level. Once a video noise level is determined, a decision on how to further process the video signal can be made. For example, the picture can be filtered or sharpened. The amount of noise filtering can be made dependent on the amount of noise measured.
    Type: Grant
    Filed: February 27, 2006
    Date of Patent: December 14, 2010
    Assignee: NVIDIA Corporation
    Inventors: Miguel A. Guerrero, Stephen D. Lew, Gerrit A. Slavenburg
  • Patent number: 7788468
    Abstract: A “cooperative thread array,” or “CTA,” is a group of multiple threads that concurrently execute the same program on an input data set to produce an output data set. Each thread in a CTA has a unique thread identifier assigned at thread launch time that controls various aspects of the thread's processing behavior such as the portion of the input data set to be processed by each thread, the portion of an output data set to be produced by each thread, and/or sharing of intermediate results among threads. Different threads of the CTA are advantageously synchronized at appropriate points during CTA execution using a barrier synchronization technique in which barrier instructions in the CTA program are detected and used to suspend execution of some threads until a specified number of other threads also reaches the barrier point.
    Type: Grant
    Filed: December 15, 2005
    Date of Patent: August 31, 2010
    Assignee: NVIDIA Corporation
    Inventors: John R. Nickolls, Stephen D. Lew, Brett W. Coon, Peter C. Mills
  • Patent number: 7779191
    Abstract: A system and method for transitions a computing system between operating modes that have different power consumption characteristics. When a system management unit (SMU) determines that the computing system is in a low activity state, the SMU transitions the central processing unit (CPU) into a low power operating mode after the CPU stores critical operating state of the CPU in a memory. The SMU then intercepts and processes interrupts intended for the CPU, modifying a copy of the critical operating state. This effectively extends the time during which the CPU stays in lower power mode. When the SMU determines that the computing system exits a low activity state, the copy of the critical operating state is stored in the memory and the SMU transitions the CPU into a high power operating mode using the modified critical operating state.
    Type: Grant
    Filed: July 29, 2008
    Date of Patent: August 17, 2010
    Assignee: NVIDIA Corporation
    Inventors: Chien-Ping Lu, Stephen D. Lew, Robert William Chapman
  • Patent number: 7733419
    Abstract: Video filtering using a programmable graphics processor is described. The programmable graphics processor may be programmed to complete a plurality of video filtering operations in a single pass through a fragment-processing pipeline within the programmable graphics processor. Video filtering functions such as deinterlacing, chroma up-sampling, scaling, and deblocking may be performed by the fragment-processing pipeline. The fragment-processing pipeline may be programmed to perform motion adaptive deinterlacing, wherein a spatially variant filter determines, on a pixel basis, whether a “bob”, a “blend”, or a “weave” operation should be used to process an interlaced image.
    Type: Grant
    Filed: December 14, 2007
    Date of Patent: June 8, 2010
    Assignee: Nvidia Corporation
    Inventors: Stephen D. Lew, Garry W. Amann, Hassane S. Azar
  • Patent number: 7705915
    Abstract: Video filtering using a programmable graphics processor is described. The programmable graphics processor may be programmed to complete a plurality of video filtering operations in a single pass through a fragment-processing pipeline within the programmable graphics processor. Video filtering functions such as deinterlacing, chroma up-sampling, scaling, and deblocking may be performed by the fragment-processing pipeline. The fragment-processing pipeline may be programmed to perform motion adaptive deinterlacing, wherein a spatially variant filter determines, on a pixel basis, whether a “bob”, a “blend”, or a “weave” operation should be used to process an interlaced image.
    Type: Grant
    Filed: December 14, 2007
    Date of Patent: April 27, 2010
    Assignee: NVIDIA Corporation
    Inventors: Stephen D. Lew, Garry W. Amann, Hassane S. Azar
  • Publication number: 20100031071
    Abstract: A system and method for transitions a computing system between operating modes that have different power consumption characteristics. When a system management unit (SMU) determines that the computing system is in a low activity state, the SMU transitions the central processing unit (CPU) into a low power operating mode after the CPU stores critical operating state of the CPU in a memory. The SMU then intercepts and processes interrupts intended for the CPU, modifying a copy of the critical operating state. This effectively extends the time during which the CPU stays in lower power mode. When the SMU determines that the computing system exits a low activity state, the copy of the critical operating state is stored in the memory and the SMU transitions the CPU into a high power operating mode using the modified critical operating state.
    Type: Application
    Filed: July 29, 2008
    Publication date: February 4, 2010
    Inventors: Chien-Ping LU, Stephen D. Lew, Robert William Chapman
  • Patent number: 7624224
    Abstract: A system, method, and computer program product are provided for directly executing code in block-based memory, which resides in communication with a processor and a controller. Utilizing the controller, a request is received from the processor for a subset of a block of data in the block-based memory, and at least a portion of the block is retrieved from the block-based memory. After the retrieval, at least a portion of the block is stored in a cache. The subset of the block is then transmitted to the processor, utilizing the controller. To this end, code in the block-based memory is directly executed.
    Type: Grant
    Filed: December 13, 2005
    Date of Patent: November 24, 2009
    Assignee: NVIDIA Corporation
    Inventors: Shang-Tse Chuang, Stephen D. Lew, Gerrit A. Slavenburg
  • Patent number: 7619687
    Abstract: Video filtering using a programmable graphics processor is described. The programmable graphics processor may be programmed to complete a plurality of video filtering operations in a single pass through a fragment-processing pipeline within the programmable graphics processor. Video filtering functions such as deinterlacing, chroma up-sampling, scaling, and deblocking may be performed by the fragment-processing pipeline. The fragment-processing pipeline may be programmed to perform motion adaptive deinterlacing, wherein a spatially variant filter determines, on a pixel basis, whether a “bob”, a “blend”, or a “weave” operation should be used to process an interlaced image.
    Type: Grant
    Filed: December 14, 2007
    Date of Patent: November 17, 2009
    Assignee: NVIDIA Corporation
    Inventors: Stephen D. Lew, Garry W. Amann, Hassane S. Azar
  • Patent number: 7577762
    Abstract: A system and method schedules command streams for processing by a variety of consumers. A single command stream is parsed and commands included in the command stream are output to one of the variety of consumers at a time. A pre-emptive scheduling mechanism is used so that a first consumer may yield to a second consumer when the first consumer has received a sufficient amount of commands. The pre-emptive scheduling enables several of the consumers to process commands concurrently. The pre-emptive scheduling mechanism may be implemented by a device driver inserting yield commands into the command stream or by a unit parsing the command stream.
    Type: Grant
    Filed: February 1, 2005
    Date of Patent: August 18, 2009
    Assignee: NVIDIA Corporation
    Inventors: Lincoln G. Garlick, Scott R. Whitman, Stephen D. Lew
  • Patent number: 7526634
    Abstract: Systems and methods for synchronizing processing work performed by threads, cooperative thread arrays (CTAs), or “sets” of CTAs. A central processing unit can load launch commands for a first set of CTAs and a second set of CTAs in a pushbuffer, and specify a dependency of the second set upon completion of execution of the first set. A parallel or graphics processor (GPU) can autonomously execute the first set of CTAs and delay execution of the second set of CTAs until the first set of CTAs is complete. In some embodiments the GPU may determine that a third set of CTAs is not dependent upon the first set, and may launch the third set of CTAs while the second set of CTAs is delayed. In this manner, the GPU may execute launch commands out of order with respect to the order of the launch commands in the pushbuffer.
    Type: Grant
    Filed: September 27, 2006
    Date of Patent: April 28, 2009
    Assignee: Nvidia Corporation
    Inventors: Jerome F. Duluk, Jr., Stephen D. Lew, John R. Nickolls
  • Patent number: 7508448
    Abstract: Video filtering using a programmable graphics processor is described. The programmable graphics processor may be programmed to complete a plurality of video filtering operations in a single pass through a fragment-processing pipeline within the programmable graphics processor. Video filtering functions such as deinterlacing, chroma up-sampling, scaling, and deblocking may be performed by the fragment-processing pipeline. The fragment-processing pipeline may be programmed to perform motion adaptive deinterlacing, wherein a spatially variant filter determines, on a pixel basis, whether a “bob”, a “blend”, or a “weave” operation should be used to process an interlaced image.
    Type: Grant
    Filed: May 29, 2003
    Date of Patent: March 24, 2009
    Assignee: NVIDIA Corporation
    Inventors: Stephen D. Lew, Garry W. Amann, Hassane S. Azar
  • Patent number: 7492368
    Abstract: A multiprocessor system executes parallel threads. A controller receives memory requests from the parallel threads and coalesces the memory requests to improve memory transfer efficiency.
    Type: Grant
    Filed: January 24, 2006
    Date of Patent: February 17, 2009
    Assignee: Nvidia Corporation
    Inventors: Bryon S. Nordquist, Stephen D. Lew
  • Patent number: 7466316
    Abstract: An integrated circuit includes at least two different types of processors, such as a graphics processor and a video processor. At least one operation is commonly by supported by two different types of processors. For each commonly supported operation that is scheduled, a decision is made to determine which type of processor will be selected to implement the operation.
    Type: Grant
    Filed: December 14, 2004
    Date of Patent: December 16, 2008
    Assignee: NVIDIA Corporation
    Inventors: Jonah M. Alben, Stephen D. Lew, Paolo E. Sabella