Patents by Inventor David C. Tannenbaum
David C. Tannenbaum has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11798218
Abstract: A method of packing coverage in a graphics processing unit (GPU) may include receiving an indication for a portion of an image, determining, based on the indication, a packing technique for the portion of the image, and packing coverage for the portion of the image based on the packing technique. The indication may include one or more of: an importance, a quality, a level of interest, a level of detail, or a variable-rate shading (VRS) level. The indication may be received from an application. The packing technique may include array merging. The array merging may include quad merging. The packing technique may include pixel piling. The packing technique may be a first packing technique, and the method may further include determining, based on the indication, a second packing technique for the portion of the image, and packing coverage for the portion of the image based on the second packing technique.
Type: Grant
Filed: October 15, 2021
Date of Patent: October 24, 2023
Inventors: Keshavan Varadarajan, Veynu Narasiman, David C. Tannenbaum
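To make the idea concrete, here is a minimal, hypothetical Python sketch of selecting a packing technique from a VRS-style indication and then merging quad coverage. The technique names, threshold, and 4-bit coverage encoding are illustrative assumptions, not the patented implementation.

```python
# Illustrative sketch: choose a packing technique per image region from a
# VRS-style indication, then greedily merge quads with disjoint coverage.
from typing import List

def select_packing_technique(vrs_level: int) -> str:
    """Map a coarse indication to a packing technique (assumed threshold)."""
    return "quad_merging" if vrs_level <= 1 else "pixel_piling"

def quad_merge(coverage_masks: List[int]) -> List[int]:
    """Greedily merge 4-bit quad coverage masks that do not overlap."""
    merged: List[int] = []
    for mask in coverage_masks:
        for i, existing in enumerate(merged):
            if existing & mask == 0:      # disjoint coverage: can share a quad
                merged[i] = existing | mask
                break
        else:
            merged.append(mask)
    return merged

if __name__ == "__main__":
    # Three partially covered quads (one bit per pixel of a 2x2 quad).
    quads = [0b0001, 0b0110, 0b1110]
    print(select_packing_technique(vrs_level=1))   # quad_merging
    print([bin(m) for m in quad_merge(quads)])     # two packed quads from three
```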
-
Patent number: 11763521
Abstract: A system and a method are disclosed for varying a pixel-rate functionality of a GPU as an optional feature without an explicit implementation from within an application. User interface (UI) content may be detected in a draw call of an application and a variable-rate shader lookup map may be generated based on the detected UI content. A pixel rate of 3D content may be increased using the variable-rate shader lookup map. Additionally or alternatively, other conditions may be detected for increasing the pixel rate, such as using information in an application profile, detecting high or low luminance values, detecting motion and/or detecting temporal anti-aliasing.
Type: Grant
Filed: October 6, 2021
Date of Patent: September 19, 2023
Inventors: Gabriel T. Dagani, Gregory Bergschneider, David C. Tannenbaum
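A minimal sketch of the lookup-map idea follows: tiles covered by detected UI content keep full shading rate while other tiles get a coarser rate. The tile granularity and the (x, y) rate encoding are assumptions for illustration only.

```python
# Illustrative sketch: build a per-tile variable-rate shading (VRS) lookup map
# from detected UI coverage. UI tiles keep full rate; 3D-only tiles are coarser.

def build_vrs_lookup_map(ui_tiles, width_tiles, height_tiles,
                         ui_rate=(1, 1), content_rate=(2, 2)):
    """Return a 2D list of (x_rate, y_rate) entries, one per screen tile."""
    return [[ui_rate if (x, y) in ui_tiles else content_rate
             for x in range(width_tiles)]
            for y in range(height_tiles)]

if __name__ == "__main__":
    ui = {(0, 0), (1, 0)}   # e.g., a HUD detected in the top-left tiles of a draw call
    for row in build_vrs_lookup_map(ui, width_tiles=4, height_tiles=2):
        print(row)
```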
-
Patent number: 11748933
Abstract: A GPU includes shader cores and a shader warp packer unit. The shader warp packer unit may receive a first primitive associated with a first partially covered quad, and a second primitive associated with a second partially covered quad. The shader warp packer unit may determine that the first partially covered quad and the second partially covered quad have non-overlapping coverage. The shader warp packer unit may pack the first partially covered quad and the second partially covered quad into a packed quad. The shader warp packer unit may send the packed quad to the shader cores. The first partially covered quad and the second partially covered quad may be spatially disjoint from each other. The shader cores may receive and process the packed quad with no loss of information relative to the shader cores individually processing the first partially covered quad and the second partially covered quad.
Type: Grant
Filed: February 4, 2021
Date of Patent: September 5, 2023
Inventors: Keshavan Varadarajan, David C. Tannenbaum, F N U Gurupad
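The packing test described here can be sketched in a few lines: two quads are combined only when their 2x2 coverage masks are disjoint, and per-pixel ownership is recorded so no information is lost. The data layout is an illustrative assumption, not the hardware format.

```python
# Illustrative sketch: pack two partially covered quads when their coverage
# masks do not overlap, keeping track of which primitive owns each pixel.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Quad:
    primitive_id: int
    coverage: int          # 4-bit mask, one bit per pixel of the 2x2 quad

def try_pack(a: Quad, b: Quad) -> Optional[dict]:
    """Return a packed quad if coverage is spatially disjoint, else None."""
    if a.coverage & b.coverage:
        return None                        # overlapping coverage: cannot pack
    owners = [a.primitive_id if a.coverage >> i & 1 else
              b.primitive_id if b.coverage >> i & 1 else None
              for i in range(4)]
    return {"coverage": a.coverage | b.coverage, "owners": owners}

if __name__ == "__main__":
    packed = try_pack(Quad(7, 0b0011), Quad(9, 0b1100))
    print(packed)   # full coverage; pixels 0-1 owned by primitive 7, pixels 2-3 by 9
```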
-
Patent number: 11715252
Abstract: A GPU includes shader cores and a shader warp packer unit. The shader warp packer unit may receive a first primitive associated with a first partially covered quad, and a second primitive associated with a second partially covered quad. The shader warp packer unit may determine that the first partially covered quad and the second partially covered quad have non-overlapping coverage. The shader warp packer unit may pack the first partially covered quad and the second partially covered quad into a packed quad. The shader warp packer unit may send the packed quad to the shader cores. The first partially covered quad and the second partially covered quad may be spatially disjoint from each other. The shader cores may receive and process the packed quad with no loss of information relative to the shader cores individually processing the first partially covered quad and the second partially covered quad.
Type: Grant
Filed: February 4, 2021
Date of Patent: August 1, 2023
Inventors: Keshavan Varadarajan, David C. Tannenbaum, F N U Gurupad
-
Patent number: 11620222
Abstract: A method for performing an atomic memory operation may include receiving an atomic input, receiving an address for an atomic memory location, and performing an atomic operation on the atomic memory location based on the atomic input, wherein performing the atomic operation may include performing a first operation on a first portion of the atomic input, and performing a second operation, which may be different from the first operation, on a second portion of the atomic input. The method may further include storing a result of the first operation in a first portion of the atomic memory location, and storing a result of the second operation in a second portion of the atomic memory location. The method may further include returning an original content of the first portion of the atomic memory location concatenated with an original content of the second portion of the atomic memory location.
Type: Grant
Filed: October 30, 2020
Date of Patent: April 4, 2023
Inventors: David C. Tannenbaum, Raun M. Krisch, Christopher P. Frascati
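A minimal software model of the split-atomic idea is shown below: one 64-bit word is treated as two 32-bit halves, a different operation is applied to each half in a single atomic step, and the original packed contents are returned. The choice of MIN on the low half and ADD on the high half, and the use of a lock to stand in for hardware atomicity, are illustrative assumptions.

```python
# Illustrative sketch: an atomic that applies MIN to the low half and ADD to
# the high half of one memory word, returning the original concatenated value.
import threading

MASK32 = (1 << 32) - 1

class SplitAtomicWord:
    def __init__(self, value: int = 0):
        self._value = value
        self._lock = threading.Lock()   # stands in for hardware atomicity

    def atomic_min_add(self, atomic_input: int) -> int:
        lo_in, hi_in = atomic_input & MASK32, atomic_input >> 32
        with self._lock:
            original = self._value
            lo, hi = original & MASK32, original >> 32
            lo = min(lo, lo_in)              # first operation on the first portion
            hi = (hi + hi_in) & MASK32       # second operation on the second portion
            self._value = (hi << 32) | lo
            return original                  # original low || high contents

if __name__ == "__main__":
    word = SplitAtomicWord((5 << 32) | 100)
    old = word.atomic_min_add((1 << 32) | 42)   # MIN(100, 42) and ADD(5, 1)
    print(hex(old), hex(word._value))           # 0x500000064 0x60000002a
```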
-
Patent number: 11610281
Abstract: A method of processing a workload in a graphics processing unit (GPU) may include detecting a work item of the workload in the GPU, determining a cache policy for the work item, and operating at least a portion of a cache memory hierarchy in the GPU for at least a portion of the work item based on the cache policy. The work item may be detected based on information received from an application and/or monitoring one or more performance counters by a driver and/or hardware detection logic. The method may further include monitoring one or more performance counters, wherein the cache policy for the work item may be determined and/or changed based on the one or more performance counters. The cache policy for the work item may be selected based on a runtime learning model.
Type: Grant
Filed: January 11, 2021
Date of Patent: March 21, 2023
Inventors: Sushant Kondguli, Arun Radhakrishnan, Zachary D. Neyland, David C. Tannenbaum
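As a rough sketch of counter-driven policy selection, the snippet below picks a cache policy for a work item from simple hit-rate heuristics. The counter names, thresholds, and policy labels are assumptions for illustration; the patent also allows a runtime learning model, which is not modeled here.

```python
# Illustrative sketch: choose a cache policy for a detected work item from
# performance counters (hypothetical counter names and thresholds).

def choose_cache_policy(counters: dict) -> str:
    hits, misses = counters["l2_hits"], counters["l2_misses"]
    hit_rate = hits / max(1, hits + misses)
    if counters.get("streaming", False) or hit_rate < 0.2:
        return "bypass"          # streaming-like data would pollute the cache
    if hit_rate > 0.8:
        return "write_back"      # high reuse: keep lines resident
    return "write_through"

if __name__ == "__main__":
    print(choose_cache_policy({"l2_hits": 900, "l2_misses": 100}))   # write_back
    print(choose_cache_policy({"l2_hits": 50, "l2_misses": 950}))    # bypass
```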
-
Publication number: 20230052075
Abstract: A system and a method are disclosed for varying a pixel-rate functionality of a GPU as an optional feature without an explicit implementation from within an application. User interface (UI) content may be detected in a draw call of an application and a variable-rate shader lookup map may be generated based on the detected UI content. A pixel rate of 3D content may be increased using the variable-rate shader lookup map. Additionally or alternatively, other conditions may be detected for increasing the pixel rate, such as using information in an application profile, detecting high or low luminance values, detecting motion and/or detecting temporal anti-aliasing.
Type: Application
Filed: October 6, 2021
Publication date: February 16, 2023
Inventors: Gabriel T. DAGANI, Gregory BERGSCHNEIDER, David C. TANNENBAUM
-
Publication number: 20220301233
Abstract: A hybrid ray tracing system includes: a processor; and memory including instructions that, when executed by the processor, cause the processor to: identify a subset of pixels of an image to be ray-traced based on variable rate shading (VRS) screenspace image data; set, based on the VRS screenspace image data, one or more material properties of at least one object corresponding to the subset of pixels; and perform ray-tracing for the subset of pixels to generate a ray-traced image. The ray-tracing includes performing a limited ray casting process based on the set one or more material properties.
Type: Application
Filed: January 14, 2022
Publication date: September 22, 2022
Inventors: Keshavan Varadarajan, David C. Tannenbaum
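The selection step described above can be sketched as follows: a VRS screenspace map picks which pixels receive ray tracing, and coarser-rate regions get cheaper material settings that limit ray casting. The rate encoding and material property name are illustrative assumptions.

```python
# Illustrative sketch: use a VRS screenspace map to select ray-traced pixels
# and to limit material properties for the remaining, coarser-rate pixels.

def select_ray_traced_pixels(vrs_map):
    """Ray-trace only pixels whose shading rate is full (1x1)."""
    return [(x, y)
            for y, row in enumerate(vrs_map)
            for x, rate in enumerate(row)
            if rate == (1, 1)]

def limited_material(base_material: dict, rate) -> dict:
    """Coarser-rate pixels get cheaper materials (e.g., no reflection bounces)."""
    material = dict(base_material)
    if rate != (1, 1):
        material["max_reflection_bounces"] = 0
    return material

if __name__ == "__main__":
    vrs_map = [[(1, 1), (2, 2)],
               [(2, 2), (1, 1)]]
    print(select_ray_traced_pixels(vrs_map))                    # [(0, 0), (1, 1)]
    print(limited_material({"max_reflection_bounces": 3}, (2, 2)))
```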
-
Patent number: 11416960
Abstract: A binning subsystem of a GPU includes a storage subsystem, a shader core to output first data via a first path, a selector to receive the first data via the first path, and to receive second data from the storage subsystem via a second path. The storage subsystem includes a binner unit and a control logic unit. The control logic unit causes the selector to transfer the first data or the second data to the binner unit. The binner unit may transfer binner output data to the shader core via a third path. The binner unit may transfer the binner output data to one or more subsequent stages of a graphics pipeline via a fourth path. The binner unit may transfer the binner output data to the storage subsystem via a fifth path. The control logic unit may control the binner unit such that the binner unit can be used for general purpose computation.
Type: Grant
Filed: December 2, 2020
Date of Patent: August 16, 2022
Inventors: David C. Tannenbaum, Keshavan Varadarajan, Veynu Narasiman
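A very rough software model of this data flow is shown below: control logic steers the selector so the binner consumes either shader-core output (first path) or data replayed from the storage subsystem (second path). The binning function, tile size, and path modeling are illustrative assumptions rather than the hardware design.

```python
# Illustrative sketch: a selector feeds the binner from one of two input paths
# under control-logic direction; the binner groups primitives into screen tiles.

def binner(primitives, tile_size=16):
    """Group primitives into screen-space bins by their (x, y) position."""
    bins = {}
    for prim in primitives:
        key = (prim["x"] // tile_size, prim["y"] // tile_size)
        bins.setdefault(key, []).append(prim["id"])
    return bins

def selector(use_storage_path: bool, shader_core_data, storage_data):
    """Control logic picks which input path the binner consumes."""
    return storage_data if use_storage_path else shader_core_data

if __name__ == "__main__":
    shader_out = [{"id": 0, "x": 3, "y": 5}, {"id": 1, "x": 40, "y": 8}]
    replayed   = [{"id": 2, "x": 20, "y": 22}]
    print(binner(selector(False, shader_out, replayed)))   # bins from the shader core
    print(binner(selector(True, shader_out, replayed)))    # bins from storage replay
```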
-
Publication number: 20220206737
Abstract: A system and a method are disclosed that allow multiple casting devices to work together to populate a large display screen according to the subject matter disclosed herein. The system includes a receiving device that includes two or more screen-cast receivers and a controller. Each screen-cast receiver receives from a corresponding casting device at least a portion of a frame of original content of the corresponding casting device generated in a native resolution of the corresponding casting device. The controller synchronizes each received portion of the frame of the original content of the corresponding casting device to form a video output signal that comprises a combination of each received portion, in addition to any internally generated content derived by the receiving display. A casting device may be a smartphone, a tablet, or a computing device, such as a laptop computer.
Type: Application
Filed: March 22, 2021
Publication date: June 30, 2022
Inventors: Gabriel T. DAGANI, David C. TANNENBAUM, Christopher P. FRASCATI, Michael PHILLIP
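The synchronization described here can be sketched as a receiver that releases a composite frame only once every registered caster has delivered its portion for the same frame number. The source names, frame tagging, and composition scheme are illustrative assumptions.

```python
# Illustrative sketch: synchronize frame portions from multiple casting devices
# and compose an output frame only when all sources have contributed.

class CastReceiver:
    def __init__(self, sources):
        self.sources = list(sources)   # composition order of the casting devices
        self.pending = {}              # frame_no -> {source: portion}

    def on_portion(self, source, frame_no, portion):
        frame = self.pending.setdefault(frame_no, {})
        frame[source] = portion
        if set(self.sources) <= frame.keys():   # all casters present for this frame
            return self.compose(self.pending.pop(frame_no))
        return None

    def compose(self, portions):
        return " | ".join(portions[s] for s in self.sources)

if __name__ == "__main__":
    rx = CastReceiver(sources=["phone", "laptop"])
    print(rx.on_portion("phone", 1, "left half"))     # None: still waiting
    print(rx.on_portion("laptop", 1, "right half"))   # composed output frame
```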
-
Publication number: 20220197976
Abstract: A graphics processing unit (GPU) and a method are disclosed that perform a convolution operation recast as a matrix multiplication operation. The GPU includes a register file, a processor and a state machine. The register file stores data of an input feature map and data of a filter weight kernel. The processor performs a convolution operation on data of the input feature map and data of the filter weight kernel as a matrix multiplication operation. The state machine facilitates performance of the convolution operation by unrolling the data of the input feature map and the data of the filter weight kernel in the register file. The state machine includes control registers that determine movement of data through the register file to perform the matrix multiplication operation on the data in the register file in an unrolled manner.
Type: Application
Filed: February 10, 2021
Publication date: June 23, 2022
Inventors: Christopher P. FRASCATI, Simon WATERS, Rama S.B HARIHARA, David C. TANNENBAUM
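The recasting itself is the familiar unroll-then-multiply pattern, sketched below with NumPy: each kernel-sized window of the feature map is unrolled into a row so the convolution becomes a single matrix multiplication. The shapes, unrolling order, and use of NumPy are assumptions for illustration; the register-file state machine is not modeled.

```python
# Illustrative sketch: recast a 2D convolution as a matrix multiplication by
# unrolling the input feature map (an im2col-style transform).
import numpy as np

def conv2d_as_matmul(feature_map, kernel):
    kh, kw = kernel.shape
    oh = feature_map.shape[0] - kh + 1
    ow = feature_map.shape[1] - kw + 1
    # Unroll every kh x kw window into one row of the matrix operand.
    unrolled = np.array([feature_map[y:y + kh, x:x + kw].ravel()
                         for y in range(oh) for x in range(ow)])
    return (unrolled @ kernel.ravel()).reshape(oh, ow)

if __name__ == "__main__":
    fmap = np.arange(16, dtype=float).reshape(4, 4)
    kern = np.ones((3, 3))
    print(conv2d_as_matmul(fmap, kern))   # equals a direct valid 3x3 convolution
```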
-
Patent number: 11360732
Abstract: A system and a method are disclosed that allow multiple casting devices to work together to populate a large display screen according to the subject matter disclosed herein. The system includes a receiving device that includes two or more screen-cast receivers and a controller. Each screen-cast receiver receives from a corresponding casting device at least a portion of a frame of original content of the corresponding casting device generated in a native resolution of the corresponding casting device. The controller synchronizes each received portion of the frame of the original content of the corresponding casting device to form a video output signal that comprises a combination of each received portion, in addition to any internally generated content derived by the receiving display. A casting device may be a smartphone, a tablet, or a computing device, such as a laptop computer.
Type: Grant
Filed: March 22, 2021
Date of Patent: June 14, 2022
Inventors: Gabriel T. Dagani, David C. Tannenbaum, Christopher P. Frascati, Michael Phillip
-
Publication number: 20220148122
Abstract: A binning subsystem of a GPU includes a storage subsystem, a shader core to output first data via a first path, a selector to receive the first data via the first path, and to receive second data from the storage subsystem via a second path. The storage subsystem includes a binner unit and a control logic unit. The control logic unit causes the selector to transfer the first data or the second data to the binner unit. The binner unit may transfer binner output data to the shader core via a third path. The binner unit may transfer the binner output data to one or more subsequent stages of a graphics pipeline via a fourth path. The binner unit may transfer the binner output data to the storage subsystem via a fifth path. The control logic unit may control the binner unit such that the binner unit can be used for general purpose computation.
Type: Application
Filed: December 2, 2020
Publication date: May 12, 2022
Inventors: David C. TANNENBAUM, Keshavan VARADARAJAN, Veynu NARASIMAN
-
Patent number: 11321907
Abstract: A system and a method are disclosed that optimize a graphics driver. The system may be embodied as a computing device that includes a storage that is internal to the computing device, a graphics processing unit (GPU) that includes a driver, and a controller. The controller may be configured to run a daemon process that optimizes a shader and/or a shader pipeline for an application that is resident on the computing device when the computing device is not running the application and stores at least one optimization for the shader in the storage. The at least one optimization may be based on the application. The daemon process may further receive a request from the driver of the GPU for an optimization for the shader/shader pipeline during a runtime compilation of the shader and provide the at least one optimization to the driver of the GPU from the storage.
Type: Grant
Filed: April 9, 2021
Date of Patent: May 3, 2022
Inventors: Gabriel T. Dagani, Raun M. Krisch, Zachary Neyland, Robert Metzger, David C. Tannenbaum
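The daemon/driver hand-off can be sketched as a background step that persists precomputed shader optimizations to local storage while the application is idle, and a driver-side lookup at runtime compile time. The directory name, file layout, and cache key are hypothetical details for illustration.

```python
# Illustrative sketch: a daemon precomputes shader optimizations into local
# storage, and the driver fetches them during runtime shader compilation.
import json
import pathlib

CACHE_DIR = pathlib.Path("shader_opt_cache")   # assumed local storage location

def daemon_optimize(app_name: str, shader_id: str, optimization: dict) -> None:
    """Runs while the application is not running; persists the optimization."""
    CACHE_DIR.mkdir(exist_ok=True)
    (CACHE_DIR / f"{app_name}.{shader_id}.json").write_text(json.dumps(optimization))

def driver_request(app_name: str, shader_id: str):
    """Called during runtime compilation; returns a cached optimization or None."""
    path = CACHE_DIR / f"{app_name}.{shader_id}.json"
    return json.loads(path.read_text()) if path.exists() else None

if __name__ == "__main__":
    daemon_optimize("game_a", "shader_42", {"unroll_loops": True, "precision": "fp16"})
    print(driver_request("game_a", "shader_42"))
```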
-
Publication number: 20220067876
Abstract: A method of processing a workload in a graphics processing unit (GPU) may include detecting a work item of the workload in the GPU, determining a cache policy for the work item, and operating at least a portion of a cache memory hierarchy in the GPU for at least a portion of the work item based on the cache policy. The work item may be detected based on information received from an application and/or monitoring one or more performance counters by a driver and/or hardware detection logic. The method may further include monitoring one or more performance counters, wherein the cache policy for the work item may be determined and/or changed based on the one or more performance counters. The cache policy for the work item may be selected based on a runtime learning model.
Type: Application
Filed: January 11, 2021
Publication date: March 3, 2022
Inventors: Sushant KONDGULI, Arun RADHAKRISHNAN, Zachary D. NEYLAND, David C. TANNENBAUM
-
Publication number: 20220066934
Abstract: A method for performing an atomic memory operation may include receiving an atomic input, receiving an address for an atomic memory location, and performing an atomic operation on the atomic memory location based on the atomic input, wherein performing the atomic operation may include performing a first operation on a first portion of the atomic input, and performing a second operation, which may be different from the first operation, on a second portion of the atomic input. The method may further include storing a result of the first operation in a first portion of the atomic memory location, and storing a result of the second operation in a second portion of the atomic memory location. The method may further include returning an original content of the first portion of the atomic memory location concatenated with an original content of the second portion of the atomic memory location.
Type: Application
Filed: October 30, 2020
Publication date: March 3, 2022
Inventors: David C. TANNENBAUM, Raun M. KRISCH, Christopher P. FRASCATI
-
Publication number: 20220036632
Abstract: A GPU includes one or more post-processing controllers, and a 3D graphics pipeline including a post-processing shader stage following a pixel shader stage. The one or more post-processing controllers may synchronize an execution of one or more post-processing stages including the post-processing shader stage. The 3D pipeline may include one or more pixel shaders, one or more tile buffers, and a direct communication link between the post-processing shader stage and the one or more tile buffers. The one or more post-processing controllers may synchronize communication between the one or more post-processing shaders and the one or more tile buffers.
Type: Application
Filed: February 26, 2021
Publication date: February 3, 2022
Inventors: Raun M. KRISCH, David C. TANNENBAUM, Moumine BALLO, Keshavan VARADARAJAN
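A toy model of the synchronization is sketched below: the controller launches the post-processing stage for a tile only after the pixel shader stage has fully written that tile's buffer, consuming it over the direct link. The stage bodies, tile contents, and ordering scheme are illustrative assumptions.

```python
# Illustrative sketch: a post-processing controller that runs the
# post-processing stage on each tile buffer only after pixel shading completes.

def pixel_shader_stage(tile_id):
    """Pretend to shade a tile; return its color buffer (toy 2x2 tile)."""
    return [tile_id * 10 + i for i in range(4)]

def post_process_stage(tile_buffer):
    """A simple post-process (tone-map-like scaling) applied to the tile buffer."""
    return [min(255, int(c * 1.5)) for c in tile_buffer]

def post_processing_controller(tile_ids):
    tile_buffers = {}
    for tid in tile_ids:
        tile_buffers[tid] = pixel_shader_stage(tid)
        # Synchronization point: post-processing starts only once this tile's
        # buffer is fully written, and reads it directly (no round trip to memory).
        tile_buffers[tid] = post_process_stage(tile_buffers[tid])
    return tile_buffers

if __name__ == "__main__":
    print(post_processing_controller([0, 1]))
```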
-
Publication number: 20220036631
Abstract: A GPU includes shader cores and a shader warp packer unit. The shader warp packer unit may receive a first primitive associated with a first partially covered quad, and a second primitive associated with a second partially covered quad. The shader warp packer unit may determine that the first partially covered quad and the second partially covered quad have non-overlapping coverage. The shader warp packer unit may pack the first partially covered quad and the second partially covered quad into a packed quad. The shader warp packer unit may send the packed quad to the shader cores. The first partially covered quad and the second partially covered quad may be spatially disjoint from each other. The shader cores may receive and process the packed quad with no loss of information relative to the shader cores individually processing the first partially covered quad and the second partially covered quad.
Type: Application
Filed: February 4, 2021
Publication date: February 3, 2022
Inventors: Keshavan VARADARAJAN, David C. TANNENBAUM, FNU GURUPAD
-
Publication number: 20220036634
Abstract: A method of packing coverage in a graphics processing unit (GPU) may include receiving an indication for a portion of an image, determining, based on the indication, a packing technique for the portion of the image, and packing coverage for the portion of the image based on the packing technique. The indication may include one or more of: an importance, a quality, a level of interest, a level of detail, or a variable-rate shading (VRS) level. The indication may be received from an application. The packing technique may include array merging. The array merging may include quad merging. The packing technique may include pixel piling. The packing technique may be a first packing technique, and the method may further include determining, based on the indication, a second packing technique for the portion of the image, and packing coverage for the portion of the image based on the second packing technique.
Type: Application
Filed: October 15, 2021
Publication date: February 3, 2022
Inventors: Keshavan VARADARAJAN, Veynu NARASIMAN, David C. TANNENBAUM
-
Publication number: 20210358191
Abstract: A GPU is disclosed, which may include a VRS interface to provide spatial information and/or primitive-specific information. The GPU may include one or more shader cores including a control logic section to determine a shading precision value based on the spatial information and/or the primitive-specific information. The control logic section may modulate a shading precision according to the shading precision value. A method for controlling shading precision by a GPU may include providing, by a VRS interface, the spatial information and/or primitive-specific information. The method may include determining, by a control logic section, a shading precision value based on the spatial information and/or the primitive-specific information. The method may include modulating a shading precision according to the shading precision value.
Type: Application
Filed: November 20, 2020
Publication date: November 18, 2021
Inventors: Christopher P. FRASCATI, Raun M. KRISCH, Derek J. LENTZ, David C. TANNENBAUM
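As a final sketch, the precision-modulation idea can be reduced to a small decision function: spatial (rate) information and a primitive-specific signal map to a shading precision value. The precision tiers, the importance score, and the thresholds are assumptions for illustration only.

```python
# Illustrative sketch: derive a shading precision value from VRS-style spatial
# information and a primitive-specific importance signal.

def shading_precision(spatial_rate, primitive_importance: float) -> str:
    """Coarser shading rates and less important primitives get lower precision."""
    coarse = spatial_rate != (1, 1)
    if coarse and primitive_importance < 0.5:
        return "fp16"
    if coarse or primitive_importance < 0.5:
        return "fp32_reduced"   # e.g., relaxed math or fewer shader iterations
    return "fp32"

if __name__ == "__main__":
    print(shading_precision((2, 2), 0.3))   # fp16
    print(shading_precision((1, 1), 0.9))   # fp32
```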