Patents by Inventor Stephen D. Lew
Stephen D. Lew has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7965895
Abstract: Methods, circuits, and apparatus for reducing memory bandwidth used by a graphics processor. Uncompressed tiles are read from a display buffer portion of a graphics memory and received by an encoder. The uncompressed tiles are compressed and written back to the graphics memory. When a tile is needed again before it has been modified, the compressed version is read from memory, uncompressed, and displayed. To reduce the number of unnecessary writes of compressed tiles to memory, a tile is only written to memory if it has remained static for some number of refresh cycles. Also, to prevent a large number of compressed tiles being written to the display buffer in one refresh cycle, the encoder can be throttled after a number of tiles have been written. Validity information can be stored for use by a CRTC. If a tile is updated, the validity information is updated such that invalid compressed data is not read from memory and displayed.
Type: Grant
Filed: August 10, 2007
Date of Patent: June 21, 2011
Assignee: NVIDIA Corporation
Inventors: John M. Danskin, Ziyad S. Hakura, Edward L. Riegelsberger, Jason M. Musicer, Stephen D. Lew
-
Patent number: 7937567
Abstract: Parallelism in a parallel processing subsystem is exploited in a scalable manner. A problem to be solved can be hierarchically decomposed into at least two levels of sub-problems. Individual threads of program execution are defined to solve the lowest-level sub-problems. The threads are grouped into one or more thread arrays, each of which solves a higher-level sub-problem. The thread arrays are executable by processing cores, each of which can execute at least one thread array at a time. Thread arrays can be grouped into grids of independent thread arrays, which solve still higher-level sub-problems or an entire problem. Thread arrays within a grid, or entire grids, can be distributed across all of the available processing cores as available in a particular system implementation.
Type: Grant
Filed: November 1, 2006
Date of Patent: May 3, 2011
Assignee: Nvidia Corporation
Inventors: John R. Nickolls, Stephen D. Lew
-
Publication number: 20110087860
Abstract: Parallel data processing systems and methods use cooperative thread arrays (CTAs), i.e., groups of multiple threads that concurrently execute the same program on an input data set to produce an output data set. Each thread in a CTA has a unique identifier (thread ID) that can be assigned at thread launch time. The thread ID controls various aspects of the thread's processing behavior such as the portion of the input data set to be processed by each thread, the portion of an output data set to be produced by each thread, and/or sharing of intermediate results among threads. Mechanisms for loading and launching CTAs in a representative processing core and for synchronizing threads within a CTA are also described.
Type: Application
Filed: December 17, 2010
Publication date: April 14, 2011
Applicant: NVIDIA Corporation
Inventors: John R. Nickolls, Stephen D. Lew
-
Patent number: 7916153
Abstract: Methods and apparatus for reducing power consumption of backlit displays are described. Power consumption is reduced by dimming backlighting by a first scale factor and boosting pixel values by a second scale factor to compensate for the dimming. The scale factors may be constant values. Alternately, one or both of the scale factors may be determined based on pixel values for one or more frames to be displayed and/or one or more frames that have been displayed. For example, scale factors may be calculated based on an average linear amplitude of one or more frames of pixel values or from a maximum pixel value of one or more frames of pixel values. A graphical processing system is described including an integrated circuit capable of transforming a pixel value from a gamma-compensated space to a linear space.
Type: Grant
Filed: December 12, 2007
Date of Patent: March 29, 2011
Assignee: NVIDIA Corporation
Inventors: Stephen D. Lew, Michael A. Ogrinc
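The dimming-and-boosting idea in the abstract above can be illustrated with a short sketch. This is not the patented method: it assumes a display gamma of 2.2, 8-bit pixel codes, and scale factors derived from the frame's maximum pixel value (one of the options the abstract mentions). The function names are hypothetical.

```python
GAMMA = 2.2  # assumed display gamma; the abstract does not specify one


def to_linear(value, max_code=255):
    """Convert a gamma-compensated pixel code to linear light in [0, 1]."""
    return (value / max_code) ** GAMMA


def to_gamma(linear, max_code=255):
    """Convert linear light in [0, 1] back to a gamma-compensated code."""
    return round(max_code * linear ** (1.0 / GAMMA))


def compensate_frame(pixels):
    """Return (backlight_scale, boosted_pixels) for one frame.

    The backlight is dimmed to the brightest linear value in the frame,
    and every pixel is boosted by the reciprocal, so the product
    backlight * pixel is roughly unchanged in linear space.
    """
    peak = max(to_linear(p) for p in pixels)
    if peak == 0.0:
        return 0.0, list(pixels)  # all-black frame: nothing to boost
    backlight_scale = peak        # first scale factor (dimming)
    pixel_scale = 1.0 / peak      # second scale factor (boost)
    boosted = [to_gamma(min(1.0, to_linear(p) * pixel_scale)) for p in pixels]
    return backlight_scale, boosted
```

For a frame whose brightest pixel is well below full scale, the backlight scale falls below 1 while the brightest pixel is boosted to the maximum code, so displayed light stays approximately the same up to rounding.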
-
Patent number: 7898545
Abstract: An integrated circuit includes at least two different types of processors. At least one operation is supported by both types of processors, which permits a commonly supported operation to be scheduled on either processor.
Type: Grant
Filed: December 14, 2004
Date of Patent: March 1, 2011
Assignee: Nvidia Corporation
Inventors: Jonah M. Alben, Stephen D. Lew, Paolo E. Sabella
-
Patent number: 7876378
Abstract: Video filtering using a programmable graphics processor is described. The programmable graphics processor may be programmed to complete a plurality of video filtering operations in a single pass through a fragment-processing pipeline within the programmable graphics processor. Video filtering functions such as deinterlacing, chroma up-sampling, scaling, and deblocking may be performed by the fragment-processing pipeline. The fragment-processing pipeline may be programmed to perform motion adaptive deinterlacing, wherein a spatially variant filter determines, on a pixel basis, whether a “bob”, a “blend”, or a “weave” operation should be used to process an interlaced image.
Type: Grant
Filed: December 14, 2007
Date of Patent: January 25, 2011
Assignee: NVIDIA Corporation
Inventors: Stephen D. Lew, Garry W. Amann, Hassane S. Azar
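The per-pixel choice among "bob", "blend", and "weave" described above can be sketched as follows. This is a simplified illustration, not the patented filter: the motion estimate, the two thresholds, and the definition of "blend" as a linear mix of the bob and weave results are all assumptions, and the function name is hypothetical.

```python
def deinterlace_pixel(above, below, previous, motion, low=8, high=32):
    """Reconstruct one missing pixel of an interlaced frame.

    above, below: neighbouring pixels from the current field.
    previous:     pixel at the same position in the previous field.
    motion:       assumed per-pixel motion estimate (e.g. an absolute
                  difference between fields); larger means more motion.
    low, high:    assumed thresholds separating the three operations.
    """
    bob = (above + below) / 2   # spatial interpolation within one field
    weave = previous            # temporal copy from the other field
    if motion <= low:
        return weave            # static area: keep full vertical detail
    if motion >= high:
        return bob              # moving area: avoid combing artifacts
    weight = (motion - low) / (high - low)
    return weight * bob + (1 - weight) * weave  # blend of the two
```

For example, with neighbours 100 and 120 and a previous-field value of 50, a motion estimate of 0 selects weave, 40 selects bob, and 20 gives an even mix of the two.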
-
Patent number: 7861060
Abstract: Parallel data processing systems and methods use cooperative thread arrays (CTAs), i.e., groups of multiple threads that concurrently execute the same program on an input data set to produce an output data set. Each thread in a CTA has a unique identifier (thread ID) that can be assigned at thread launch time. The thread ID controls various aspects of the thread's processing behavior such as the portion of the input data set to be processed by each thread, the portion of an output data set to be produced by each thread, and/or sharing of intermediate results among threads. Mechanisms for loading and launching CTAs in a representative processing core and for synchronizing threads within a CTA are also described.
Type: Grant
Filed: December 15, 2005
Date of Patent: December 28, 2010
Assignee: NVIDIA Corporation
Inventors: John R. Nickolls, Stephen D. Lew
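How a thread ID can select the portion of the input and output data sets handled by each thread is sketched below. This is an illustrative model under stated assumptions, not the mechanism claimed in the patent: threads run sequentially rather than concurrently, the data is split into contiguous chunks, and the names `portion_for_thread` and `run_cta` are hypothetical.

```python
def portion_for_thread(thread_id, num_threads, data_len):
    """Return the (start, stop) range of the data owned by one thread."""
    chunk = -(-data_len // num_threads)       # ceiling division
    start = min(thread_id * chunk, data_len)
    stop = min(start + chunk, data_len)
    return start, stop


def run_cta(program, data, num_threads):
    """Run the same `program` once per thread ID and assemble the output.

    Threads are simulated one after another here; in a real CTA they
    would execute concurrently on a processing core.
    """
    output = [None] * len(data)
    for thread_id in range(num_threads):
        start, stop = portion_for_thread(thread_id, num_threads, len(data))
        # Each thread reads its own input portion and writes the
        # matching portion of the output data set.
        output[start:stop] = program(data[start:stop])
    return output
```

Calling `run_cta(lambda xs: [x * x for x in xs], list(range(10)), 4)` squares ten values using four simulated threads, with the last thread receiving the single leftover element.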
-
Patent number: 7852412
Abstract: Circuits, methods, and apparatus for measuring a video signal's noise level. The determination can be made based on pixel values for a single video image frame, for example, by comparing pixel color values, luminance, or another parameter for a first and second group of pixels in the frame. Each group of pixels may be part of a line in the frame, and several such measurements may be made along each line of the frame. These measurements can then be further refined depending on the measured noise level. Once a video noise level is determined, a decision on how to further process the video signal can be made. For example, the picture can be filtered or sharpened. The amount of noise filtering can be made dependent on the amount of noise measured.
Type: Grant
Filed: February 27, 2006
Date of Patent: December 14, 2010
Assignee: NVIDIA Corporation
Inventors: Miguel A. Guerrero, Stephen D. Lew, Gerrit A. Slavenburg
-
Patent number: 7788468
Abstract: A “cooperative thread array,” or “CTA,” is a group of multiple threads that concurrently execute the same program on an input data set to produce an output data set. Each thread in a CTA has a unique thread identifier assigned at thread launch time that controls various aspects of the thread's processing behavior such as the portion of the input data set to be processed by each thread, the portion of an output data set to be produced by each thread, and/or sharing of intermediate results among threads. Different threads of the CTA are advantageously synchronized at appropriate points during CTA execution using a barrier synchronization technique in which barrier instructions in the CTA program are detected and used to suspend execution of some threads until a specified number of other threads also reaches the barrier point.
Type: Grant
Filed: December 15, 2005
Date of Patent: August 31, 2010
Assignee: NVIDIA Corporation
Inventors: John R. Nickolls, Stephen D. Lew, Brett W. Coon, Peter C. Mills
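Barrier synchronization of the kind described above can be demonstrated with Python's standard `threading.Barrier`, which suspends each caller of `wait()` until a specified number of threads have arrived. The sketch below is an analogy on CPU threads, not the hardware mechanism in the patent; the function name `run_with_barrier` and the strided work split are assumptions made for illustration.

```python
import threading


def run_with_barrier(data, num_threads):
    """Each thread sums its share, waits at a barrier, then combines
    every thread's intermediate result into the same grand total."""
    partial = [0] * num_threads       # shared intermediate results
    totals = [None] * num_threads     # per-thread final outputs
    barrier = threading.Barrier(num_threads)

    def worker(thread_id):
        # The thread ID selects a strided portion of the input.
        partial[thread_id] = sum(data[thread_id::num_threads])
        barrier.wait()                # suspend until all threads arrive
        # Past the barrier, every partial result has been written.
        totals[thread_id] = sum(partial)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals
```

Without the barrier, a fast thread could read `partial` before slower threads had stored their sums; with it, every thread computes the same total.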
-
Patent number: 7779191
Abstract: A system and method for transitioning a computing system between operating modes that have different power consumption characteristics. When a system management unit (SMU) determines that the computing system is in a low activity state, the SMU transitions the central processing unit (CPU) into a low power operating mode after the CPU stores critical operating state of the CPU in a memory. The SMU then intercepts and processes interrupts intended for the CPU, modifying a copy of the critical operating state. This effectively extends the time during which the CPU stays in lower power mode. When the SMU determines that the computing system exits a low activity state, the copy of the critical operating state is stored in the memory and the SMU transitions the CPU into a high power operating mode using the modified critical operating state.
Type: Grant
Filed: July 29, 2008
Date of Patent: August 17, 2010
Assignee: NVIDIA Corporation
Inventors: Chien-Ping Lu, Stephen D. Lew, Robert William Chapman
-
Patent number: 7733419
Abstract: Video filtering using a programmable graphics processor is described. The programmable graphics processor may be programmed to complete a plurality of video filtering operations in a single pass through a fragment-processing pipeline within the programmable graphics processor. Video filtering functions such as deinterlacing, chroma up-sampling, scaling, and deblocking may be performed by the fragment-processing pipeline. The fragment-processing pipeline may be programmed to perform motion adaptive deinterlacing, wherein a spatially variant filter determines, on a pixel basis, whether a “bob”, a “blend”, or a “weave” operation should be used to process an interlaced image.
Type: Grant
Filed: December 14, 2007
Date of Patent: June 8, 2010
Assignee: Nvidia Corporation
Inventors: Stephen D. Lew, Garry W. Amann, Hassane S. Azar
-
Patent number: 7705915
Abstract: Video filtering using a programmable graphics processor is described. The programmable graphics processor may be programmed to complete a plurality of video filtering operations in a single pass through a fragment-processing pipeline within the programmable graphics processor. Video filtering functions such as deinterlacing, chroma up-sampling, scaling, and deblocking may be performed by the fragment-processing pipeline. The fragment-processing pipeline may be programmed to perform motion adaptive deinterlacing, wherein a spatially variant filter determines, on a pixel basis, whether a “bob”, a “blend”, or a “weave” operation should be used to process an interlaced image.
Type: Grant
Filed: December 14, 2007
Date of Patent: April 27, 2010
Assignee: NVIDIA Corporation
Inventors: Stephen D. Lew, Garry W. Amann, Hassane S. Azar
-
Publication number: 20100031071
Abstract: A system and method for transitioning a computing system between operating modes that have different power consumption characteristics. When a system management unit (SMU) determines that the computing system is in a low activity state, the SMU transitions the central processing unit (CPU) into a low power operating mode after the CPU stores critical operating state of the CPU in a memory. The SMU then intercepts and processes interrupts intended for the CPU, modifying a copy of the critical operating state. This effectively extends the time during which the CPU stays in lower power mode. When the SMU determines that the computing system exits a low activity state, the copy of the critical operating state is stored in the memory and the SMU transitions the CPU into a high power operating mode using the modified critical operating state.
Type: Application
Filed: July 29, 2008
Publication date: February 4, 2010
Inventors: Chien-Ping Lu, Stephen D. Lew, Robert William Chapman
-
Patent number: 7624224
Abstract: A system, method, and computer program product are provided for directly executing code in block-based memory, which resides in communication with a processor and a controller. Utilizing the controller, a request is received from the processor for a subset of a block of data in the block-based memory, and at least a portion of the block is retrieved from the block-based memory. After the retrieval, at least a portion of the block is stored in a cache. The subset of the block is then transmitted to the processor, utilizing the controller. To this end, code in the block-based memory is directly executed.
Type: Grant
Filed: December 13, 2005
Date of Patent: November 24, 2009
Assignee: NVIDIA Corporation
Inventors: Shang-Tse Chuang, Stephen D. Lew, Gerrit A. Slavenburg
-
Patent number: 7619687
Abstract: Video filtering using a programmable graphics processor is described. The programmable graphics processor may be programmed to complete a plurality of video filtering operations in a single pass through a fragment-processing pipeline within the programmable graphics processor. Video filtering functions such as deinterlacing, chroma up-sampling, scaling, and deblocking may be performed by the fragment-processing pipeline. The fragment-processing pipeline may be programmed to perform motion adaptive deinterlacing, wherein a spatially variant filter determines, on a pixel basis, whether a “bob”, a “blend”, or a “weave” operation should be used to process an interlaced image.
Type: Grant
Filed: December 14, 2007
Date of Patent: November 17, 2009
Assignee: NVIDIA Corporation
Inventors: Stephen D. Lew, Garry W. Amann, Hassane S. Azar
-
Patent number: 7577762
Abstract: A system and method schedules command streams for processing by a variety of consumers. A single command stream is parsed and commands included in the command stream are output to one of the variety of consumers at a time. A pre-emptive scheduling mechanism is used so that a first consumer may yield to a second consumer when the first consumer has received a sufficient amount of commands. The pre-emptive scheduling enables several of the consumers to process commands concurrently. The pre-emptive scheduling mechanism may be implemented by a device driver inserting yield commands into the command stream or by a unit parsing the command stream.
Type: Grant
Filed: February 1, 2005
Date of Patent: August 18, 2009
Assignee: NVIDIA Corporation
Inventors: Lincoln G. Garlick, Scott R. Whitman, Stephen D. Lew
-
Patent number: 7526634
Abstract: Systems and methods for synchronizing processing work performed by threads, cooperative thread arrays (CTAs), or “sets” of CTAs. A central processing unit can load launch commands for a first set of CTAs and a second set of CTAs in a pushbuffer, and specify a dependency of the second set upon completion of execution of the first set. A parallel or graphics processor (GPU) can autonomously execute the first set of CTAs and delay execution of the second set of CTAs until the first set of CTAs is complete. In some embodiments the GPU may determine that a third set of CTAs is not dependent upon the first set, and may launch the third set of CTAs while the second set of CTAs is delayed. In this manner, the GPU may execute launch commands out of order with respect to the order of the launch commands in the pushbuffer.
Type: Grant
Filed: September 27, 2006
Date of Patent: April 28, 2009
Assignee: Nvidia Corporation
Inventors: Jerome F. Duluk, Jr., Stephen D. Lew, John R. Nickolls
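The out-of-order launch behaviour described above can be modelled in a few lines. This is a deliberately simplified sketch, not the GPU's actual scheduling logic: it assumes each launch command names at most one dependency, and that all launched sets finish before any delayed set is reconsidered. The function name `launch_order` and the tuple layout are hypothetical.

```python
def launch_order(pushbuffer):
    """Return the order in which CTA sets launch.

    pushbuffer: list of (name, depends_on) tuples in submission order,
                where depends_on is another set's name or None.
    """
    launched, delayed = [], []
    # First pass: launch every independent set, delay the dependent ones.
    for name, depends_on in pushbuffer:
        if depends_on is None:
            launched.append(name)
        else:
            delayed.append((name, depends_on))
    # Assumed model: sets launched so far complete, releasing dependents.
    while delayed:
        completed = set(launched)
        ready = [name for name, dep in delayed if dep in completed]
        if not ready:
            raise ValueError("dependency can never be satisfied")
        launched.extend(ready)
        delayed = [(n, d) for n, d in delayed if d not in completed]
    return launched
```

With a pushbuffer of `[("first", None), ("second", "first"), ("third", None)]`, the third set launches ahead of the second, which waits for the first to complete.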
-
Patent number: 7508448
Abstract: Video filtering using a programmable graphics processor is described. The programmable graphics processor may be programmed to complete a plurality of video filtering operations in a single pass through a fragment-processing pipeline within the programmable graphics processor. Video filtering functions such as deinterlacing, chroma up-sampling, scaling, and deblocking may be performed by the fragment-processing pipeline. The fragment-processing pipeline may be programmed to perform motion adaptive deinterlacing, wherein a spatially variant filter determines, on a pixel basis, whether a “bob”, a “blend”, or a “weave” operation should be used to process an interlaced image.
Type: Grant
Filed: May 29, 2003
Date of Patent: March 24, 2009
Assignee: NVIDIA Corporation
Inventors: Stephen D. Lew, Garry W. Amann, Hassane S. Azar
-
Patent number: 7492368
Abstract: A multiprocessor system executes parallel threads. A controller receives memory requests from the parallel threads and coalesces the memory requests to improve memory transfer efficiency.
Type: Grant
Filed: January 24, 2006
Date of Patent: February 17, 2009
Assignee: Nvidia Corporation
Inventors: Bryon S. Nordquist, Stephen D. Lew
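One common way to coalesce memory requests is to group the addresses requested by parallel threads into aligned segments, so that each segment needs only one transfer. The sketch below illustrates that general idea; it is not taken from the patent, and the 64-byte segment size and the function name `coalesce` are assumptions.

```python
def coalesce(addresses, segment_bytes=64):
    """Group per-thread byte addresses into aligned segment transfers.

    addresses: list where index i is the address requested by thread i.
    Returns a dict mapping each segment base address to the list of
    thread IDs served by that single transfer.
    """
    segments = {}
    for thread_id, addr in enumerate(addresses):
        base = addr - addr % segment_bytes   # align down to segment start
        segments.setdefault(base, []).append(thread_id)
    return segments
```

Four threads reading consecutive 4-byte words starting at address 0 collapse into a single transfer, whereas scattered addresses need one transfer per touched segment.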
-
Patent number: 7466316
Abstract: An integrated circuit includes at least two different types of processors, such as a graphics processor and a video processor. At least one operation is commonly supported by two different types of processors. For each commonly supported operation that is scheduled, a decision is made to determine which type of processor will be selected to implement the operation.
Type: Grant
Filed: December 14, 2004
Date of Patent: December 16, 2008
Assignee: NVIDIA Corporation
Inventors: Jonah M. Alben, Stephen D. Lew, Paolo E. Sabella