Patents by Inventor Urvang Joshi

Urvang Joshi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Super-resolution loop restoration

Patent number: 12075081

Abstract: A super-resolution coding mode is described. An encoded image can be decoded from an encoded bitstream stored on a non-transitory computer-readable storage medium. A flag can indicate whether an image was encoded using the super-resolution mode at a first resolution. Responsive to the flag indicating that the image was encoded using the super-resolution mode, bits indicating an amount of scaling of the image are included. The image is decoded from the encoded bitstream to obtain a reconstructed image at the first resolution, and the reconstructed image is upscaled to a second resolution using the amount of scaling to obtain an upscaled reconstructed image. The second resolution is higher than the first resolution. Loop restoration parameters within the bitstream can be used for loop restoration filtering of the upscaled reconstructed image to obtain a loop restored image at the second resolution.

Type: Grant

Filed: January 17, 2023

Date of Patent: August 27, 2024

Assignee: GOOGLE LLC

Inventors: Urvang Joshi, Debargha Mukherjee, Andrew Simpson
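
The order of operations in the abstract above (decode at the first resolution, upscale, then apply loop restoration at the second resolution) can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the `header` dictionary, its `use_superres` and `scale_factor` keys, the integer scale factor, and the nearest-neighbour upscaler are stand-ins chosen for brevity, not the bitstream syntax or filters of the patent.

```python
def upscale_nearest(image, factor):
    """Upscale a 2-D list of pixels by an integer factor (nearest neighbour).

    Assumption: a real codec would use an interpolation filter and a
    signalled scaling amount; integer repetition keeps the sketch short.
    """
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in image
        for _ in range(factor)
    ]


def finish_decoding(header, reconstructed, loop_restore):
    """Apply the post-reconstruction steps in the order the abstract gives.

    header        -- hypothetical dict holding the decoded flag and scale
    reconstructed -- image reconstructed at the first (lower) resolution
    loop_restore  -- callable standing in for loop restoration filtering
    """
    if header["use_superres"]:
        # Upscale first, so restoration runs at the second resolution.
        reconstructed = upscale_nearest(reconstructed, header["scale_factor"])
    return loop_restore(reconstructed)
```

Passing an identity function as `loop_restore` makes it easy to check that restoration receives the upscaled image rather than the low-resolution one.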
Video Coding With Guided Machine Learning Restoration

Publication number: 20240098280

Abstract: Image coding using guided machine learning restoration may include obtaining reconstructed frame data by decoding, obtaining a restored frame by restoring the reconstructed frame, and outputting the restored frame. Obtaining the restored frame may include obtaining a reconstructed block, obtaining guide parameter values, obtaining a restored block, and including the restored block in the restored frame. Obtaining the restored block may include inputting the reconstructed block to an input layer of a trained guided convolutional neural network, wherein the neural network is constrained such that an output layer has a defined cardinality of channels, obtaining, from the output layer, neural network output channel predictions, obtaining a guided neural network prediction as a linear combination of the guide parameter values and the neural network output channel predictions, and generating the restored block using the guided neural network prediction.

Type: Application

Filed: January 19, 2021

Publication date: March 21, 2024

Inventors: Urvang Joshi, Yue Chen, Sarah Parker, Elliott Karpilovsky, Debargha Mukherjee
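
The "linear combination of the guide parameter values and the neural network output channel predictions" mentioned in the abstract above can be sketched as a per-pixel weighted sum. This is only an illustration: the function name, the list-of-lists layout, and the use of one weight per output channel are assumptions, and the neural network itself is not modelled.

```python
def guided_prediction(channels, guide_params):
    """Combine output channels pixel by pixel: sum over k of w_k * channel_k.

    channels     -- list of 2-D lists, one per network output channel
    guide_params -- one weight per channel (hypothetical layout)
    """
    if len(channels) != len(guide_params):
        raise ValueError("need exactly one guide parameter per channel")
    height = len(channels[0])
    width = len(channels[0][0])
    return [
        [
            sum(w * ch[r][c] for w, ch in zip(guide_params, channels))
            for c in range(width)
        ]
        for r in range(height)
    ]
```

With two channels and weights `[0.5, 0.25]`, each output pixel is half of the first channel's value plus a quarter of the second's.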
Transforms for large video and image blocks

Patent number: 11870993

Abstract: Improved transforms are used to encode and decode large video and image blocks. During encoding, a prediction residual block having a large size (e.g., larger than 32×32) is generated. The pixel values of the prediction residual block are transformed to produce transform coefficients. After determining that the transform coefficients exceed a threshold cardinality representative of a maximum transform block size (e.g., 32×32), a number of the transform coefficients are discarded such that a remaining number of transform coefficients does not exceed the threshold cardinality. A transform block is then generated using the remaining number. During decoding, after determining that the transform coefficients exceed the threshold cardinality, a number of new coefficients are added to the transform coefficients such that a total number of transform coefficients exceeds the threshold cardinality. The transform coefficients are then inverse transformed into a prediction residual block having a large size.

Type: Grant

Filed: June 28, 2021

Date of Patent: January 9, 2024

Assignee: GOOGLE LLC

Inventors: Urvang Joshi, Debargha Mukherjee
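
The discard-and-refill idea in the abstract above can be sketched in Python as follows. The abstract does not say which coefficients are discarded or what values the added coefficients take; this sketch assumes the top-left (low-frequency) 32×32 region is kept and the added coefficients are zeros. The names `MAX_TX_SIZE`, `truncate_coefficients`, and `pad_coefficients` are hypothetical.

```python
MAX_TX_SIZE = 32  # assumed maximum transform dimension, from the "e.g., 32×32"


def truncate_coefficients(coeffs):
    """Encoder side: keep at most the top-left MAX_TX_SIZE x MAX_TX_SIZE block."""
    return [row[:MAX_TX_SIZE] for row in coeffs[:MAX_TX_SIZE]]


def pad_coefficients(kept, height, width):
    """Decoder side: add zero coefficients to restore the full block size."""
    padded = [[0] * width for _ in range(height)]
    for r, row in enumerate(kept):
        for c, value in enumerate(row):
            padded[r][c] = value
    return padded
```

For a 64×64 block this keeps 1,024 of the 4,096 coefficients; the decoder then pads back to 64×64 before the inverse transform.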
Inter-Intra Prediction With Implicit Models

Publication number: 20230291925

Abstract: Video coding in accordance with an inter-intra prediction model may include coding an inter-prediction motion vector for a current block of a current frame, obtaining spatial block-context pixels oriented relative to the current block, generating an inter-prediction block, generating a corresponding set of reference block-context pixels oriented relative to the inter-prediction block, identifying inter-intra prediction parameters that correspond with minimizing error between the spatial block-context pixels and the reference block-context pixels, generating a prediction block for the current block by, for a current pixel of the current block, obtaining an inter-prediction pixel, determining a predictor for the current pixel using a combination of the inter-prediction pixel and the inter-intra prediction parameters, and including the predictor in the prediction block.

Type: Application

Filed: July 1, 2020

Publication date: September 14, 2023

Applicant: Google LLC

Inventors: Debargha Mukherjee, Yue Chen, Urvang Joshi, Sarah Parker, Elliott Karpilovsky, Hui Su
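
One way to read "parameters that correspond with minimizing error" between the two sets of context pixels is an ordinary least-squares fit. The sketch below assumes a two-parameter linear model (scale and offset) applied to each inter-prediction pixel; the abstract does not specify the model form, so the model and all function names here are illustrative assumptions.

```python
def fit_linear_params(reference_context, spatial_context):
    """Least-squares fit of spatial ~= scale * reference + offset."""
    n = len(reference_context)
    mean_x = sum(reference_context) / n
    mean_y = sum(spatial_context) / n
    var_x = sum((x - mean_x) ** 2 for x in reference_context)
    if var_x == 0:
        # Flat reference context: fall back to a pure offset model.
        return 1.0, mean_y - mean_x
    cov = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(reference_context, spatial_context)
    )
    scale = cov / var_x
    return scale, mean_y - scale * mean_x


def predict_block(inter_block, scale, offset):
    """Apply the fitted parameters to every inter-prediction pixel."""
    return [[scale * pixel + offset for pixel in row] for row in inter_block]
```

Because both context sets are available at the decoder, parameters fitted this way would not need to be transmitted, which is consistent with the "implicit" in the title.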
Hybrid motion-compensated neural network with side-information based video coding

Patent number: 11689726

Abstract: A hybrid apparatus for coding a video stream includes a first encoder. The first encoder includes a neural network having at least one hidden layer, and the neural network receives source data from the video stream at a first hidden layer of the at least one hidden layer, receives side information correlated with the source data at the first hidden layer, and generates guided information using the source data and the side information. The first encoder outputs the guided information and the side information for a decoder to reconstruct the source data.

Type: Grant

Filed: July 19, 2019

Date of Patent: June 27, 2023

Assignee: GOOGLE LLC

Inventors: Debargha Mukherjee, Urvang Joshi, Yue Chen, Sarah Parker
IMAGE AND VIDEO CODING USING MACHINE LEARNING PREDICTION CODING MODELS

Publication number: 20230199179

Abstract: Video coding may include generating, by a processor, a decoded frame by decoding a current frame from an encoded bitstream and outputting a reconstructed frame based on the decoded frame. Decoding includes identifying a current encoded block from the current frame, identifying a prediction coding model for the current block, wherein the prediction coding model is a machine learning prediction coding model from a plurality of machine learning prediction coding models, identifying reference values for decoding the current block based on the prediction coding model, obtaining prediction values based on the prediction coding model and the reference values, generating a decoded block corresponding to the current encoded block based on the prediction values, and including the decoded block in the decoded frame.

Type: Application

Filed: February 23, 2023

Publication date: June 22, 2023

Inventors: Debargha Mukherjee, Urvang Joshi, Yue Chen, Sarah Parker
SUPER-RESOLUTION LOOP RESTORATION

Publication number: 20230179789

Abstract: A super-resolution coding mode is described. An encoded image can be decoded from an encoded bitstream stored on a non-transitory computer-readable storage medium. A flag can indicate whether an image was encoded using the super-resolution mode at a first resolution. Responsive to the flag indicating that the image was encoded using the super-resolution mode, bits indicating an amount of scaling of the image are included. The image is decoded from the encoded bitstream to obtain a reconstructed image at the first resolution, and the reconstructed image is upscaled to a second resolution using the amount of scaling to obtain an upscaled reconstructed image. The second resolution is higher than the first resolution. Loop restoration parameters within the bitstream can be used for loop restoration filtering of the upscaled reconstructed image to obtain a loop restored image at the second resolution.

Type: Application

Filed: January 17, 2023

Publication date: June 8, 2023

Inventors: Urvang Joshi, Debargha Mukherjee, Andrew Simpson
Image and video coding using machine learning prediction coding models

Patent number: 11601644

Abstract: Video coding may include generating, by a processor, a decoded frame by decoding a current frame from an encoded bitstream and outputting a reconstructed frame based on the decoded frame. Decoding includes identifying a current encoded block from the current frame, identifying a prediction coding model for the current block, wherein the prediction coding model is a machine learning prediction coding model from a plurality of machine learning prediction coding models, identifying reference values for decoding the current block based on the prediction coding model, obtaining prediction values based on the prediction coding model and the reference values, generating a decoded block corresponding to the current encoded block based on the prediction values, and including the decoded block in the decoded frame.

Type: Grant

Filed: March 7, 2019

Date of Patent: March 7, 2023

Assignee: GOOGLE LLC

Inventors: Debargha Mukherjee, Urvang Joshi, Yue Chen, Sarah Parker
Super-resolution loop restoration

Patent number: 11558631

Abstract: A super-resolution coding mode is described. An encoded image can be decoded by decoding, from an encoded bitstream, a flag indicating whether an image was encoded using the super-resolution mode. The image is encoded at a first resolution. Responsive to the flag indicating that the image was encoded using the super-resolution mode, bits indicating an amount of scaling of the image are decoded. The image is decoded from the encoded bitstream to obtain a reconstructed image at the first resolution, and the reconstructed image is upscaled to a second resolution using the amount of scaling to obtain an upscaled reconstructed image. The second resolution is higher than the first resolution. Loop restoration filtering is applied to the upscaled reconstructed image using loop restoration parameters to obtain a loop restored image at the second resolution.

Type: Grant

Filed: March 31, 2020

Date of Patent: January 17, 2023

Assignee: GOOGLE LLC

Inventors: Urvang Joshi, Debargha Mukherjee, Andrew Simpson
Extended Transform Partitions for Video Compression

Publication number: 20220345704

Abstract: Transform-level partitioning of a prediction residual block is performed to improve compression efficiency of video data. During encoding, a prediction residual block is generated responsive to prediction-level partitioning performed against a video block, a transform block partition type to use is determined based on the prediction residual block, a non-recursive transform-level partitioning is performed against the prediction residual block according to the transform block partition type, and transform blocks generated as a result of the transform-level partitioning are encoded to a bitstream.

Type: Application

Filed: July 8, 2022

Publication date: October 27, 2022

Inventors: Sarah Parker, Debargha Mukherjee, Yue Chen, Elliott Karpilovsky, Urvang Joshi
Extended transform partitions for video compression

Patent number: 11388401

Abstract: Transform-level partitioning of a prediction residual block is performed to improve compression efficiency of video data. During encoding, a prediction residual block is generated responsive to prediction-level partitioning performed against a video block, a transform block partition type to use is determined based on the prediction residual block, a non-recursive transform-level partitioning is performed against the prediction residual block according to the transform block partition type, and transform blocks generated as a result of the transform-level partitioning are encoded to a bitstream.

Type: Grant

Filed: June 26, 2020

Date of Patent: July 12, 2022

Assignee: GOOGLE LLC

Inventors: Sarah Parker, Debargha Mukherjee, Yue Chen, Elliott Karpilovsky, Urvang Joshi
GUIDED RESTORATION OF VIDEO DATA USING NEURAL NETWORKS

Publication number: 20220207654

Abstract: Guided restoration is used to restore video data degraded from a video frame. The video frame is divided into restoration units (RUs) which each correspond to one or more blocks of the video frame. Restoration schemes are selected for each RU. The restoration schemes may indicate to use one of a plurality of neural networks trained for the guided restoration. Alternatively, the restoration schemes may indicate to use a neural network and a filter-based restoration tool. The video frame is then restored by processing each RU according to the respective selected restoration scheme. During encoding, the restored video frame is encoded to an output bitstream, and the use of the selected restoration schemes may be signaled within the output bitstream. During decoding, the restored video frame is output to an output video stream.

Type: Application

Filed: March 18, 2022

Publication date: June 30, 2022

Inventors: Debargha Mukherjee, Urvang Joshi, Yue Chen, Sarah Parker
Guided restoration of video data using neural networks

Patent number: 11282172

Abstract: Guided restoration is used to restore video data degraded from a video frame. The video frame is divided into restoration units (RUs) which each correspond to one or more blocks of the video frame. Restoration schemes are selected for each RU. The restoration schemes may indicate to use one of a plurality of neural networks trained for the guided restoration. Alternatively, the restoration schemes may indicate to use a neural network and a filter-based restoration tool. The video frame is then restored by processing each RU according to the respective selected restoration scheme. During encoding, the restored video frame is encoded to an output bitstream, and the use of the selected restoration schemes may be signaled within the output bitstream. During decoding, the restored video frame is output to an output video stream.

Type: Grant

Filed: July 18, 2019

Date of Patent: March 22, 2022

Assignee: GOOGLE LLC

Inventors: Debargha Mukherjee, Urvang Joshi, Yue Chen, Sarah Parker
EXTENDED TRANSFORM PARTITIONS FOR VIDEO COMPRESSION

Publication number: 20210409705

Abstract: Transform-level partitioning of a prediction residual block is performed to improve compression efficiency of video data. During encoding, a prediction residual block is generated responsive to prediction-level partitioning performed against a video block, a transform block partition type to use is determined based on the prediction residual block, a non-recursive transform-level partitioning is performed against the prediction residual block according to the transform block partition type, and transform blocks generated as a result of the transform-level partitioning are encoded to a bitstream.

Type: Application

Filed: June 26, 2020

Publication date: December 30, 2021

Inventors: Sarah Parker, Debargha Mukherjee, Yue Chen, Elliott Karpilovsky, Urvang Joshi
TRANSFORMS FOR LARGE VIDEO AND IMAGE BLOCKS

Publication number: 20210329245

Abstract: Improved transforms are used to encode and decode large video and image blocks. During encoding, a prediction residual block having a large size (e.g., larger than 32×32) is generated. The pixel values of the prediction residual block are transformed to produce transform coefficients. After determining that the transform coefficients exceed a threshold cardinality representative of a maximum transform block size (e.g., 32×32), a number of the transform coefficients are discarded such that a remaining number of transform coefficients does not exceed the threshold cardinality. A transform block is then generated using the remaining number. During decoding, after determining that the transform coefficients exceed the threshold cardinality, a number of new coefficients are added to the transform coefficients such that a total number of transform coefficients exceeds the threshold cardinality. The transform coefficients are then inverse transformed into a prediction residual block having a large size.

Type: Application

Filed: June 28, 2021

Publication date: October 21, 2021

Inventors: Urvang Joshi, Debargha Mukherjee
Transforms for large video and image blocks

Patent number: 11051018

Abstract: Improved transforms are used to encode and decode large video and image blocks. During encoding, a prediction residual block having a large size (e.g., larger than 32×32) is generated. The pixel values of the prediction residual block are transformed to produce transform coefficients. After determining that the transform coefficients exceed a threshold cardinality representative of a maximum transform block size (e.g., 32×32), a number of the transform coefficients are discarded such that a remaining number of transform coefficients does not exceed the threshold cardinality. A transform block is then generated using the remaining number. During decoding, after determining that the transform coefficients exceed the threshold cardinality, a number of new coefficients are added to the transform coefficients such that a total number of transform coefficients exceeds the threshold cardinality. The transform coefficients are then inverse transformed into a prediction residual block having a large size.

Type: Grant

Filed: September 4, 2020

Date of Patent: June 29, 2021

Assignee: GOOGLE LLC

Inventors: Urvang Joshi, Debargha Mukherjee
Intra-prediction for smooth blocks in image/video

Patent number: 11039131

Abstract: An apparatus for coding a block of a frame using intra-prediction includes a memory and a processor. The processor is configured to execute instructions stored in the memory to obtain an intra-prediction mode for coding the block of the frame; select a transform type for coding a transform block of a residual block, which results from predicting the block using the intra-prediction mode; and code the transform block using the transform type. To select the transform type includes to, in a case where the intra-prediction mode is a SMOOTH_PRED, select an ADST_ADST transform type; in a case where the intra-prediction mode is a SMOOTH_H_PRED, select a DCT_ADST transform type; and in a case where the intra-prediction mode is a SMOOTH_V_PRED, select an ADST_DCT transform type.

Type: Grant

Filed: March 27, 2020

Date of Patent: June 15, 2021

Assignee: GOOGLE LLC

Inventors: Urvang Joshi, Debargha Mukherjee
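
The mode-to-transform selection listed in the abstract above amounts to a small lookup table. The Python sketch below uses the identifiers from the abstract as plain strings; the `DCT_DCT` fallback for modes the abstract does not mention is an assumption added only so the function is total.

```python
# Mapping taken from the abstract: smooth intra mode -> transform type.
SMOOTH_MODE_TO_TRANSFORM = {
    "SMOOTH_PRED": "ADST_ADST",
    "SMOOTH_H_PRED": "DCT_ADST",
    "SMOOTH_V_PRED": "ADST_DCT",
}


def select_transform_type(intra_mode, default="DCT_DCT"):
    """Return the transform type for a smooth intra-prediction mode.

    The default for other modes is hypothetical; the abstract only
    covers the three smooth modes.
    """
    return SMOOTH_MODE_TO_TRANSFORM.get(intra_mode, default)
```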
TRANSFORMS FOR LARGE VIDEO AND IMAGE BLOCKS

Publication number: 20200404273

Abstract: Improved transforms are used to encode and decode large video and image blocks. During encoding, a prediction residual block having a large size (e.g., larger than 32×32) is generated. The pixel values of the prediction residual block are transformed to produce transform coefficients. After determining that the transform coefficients exceed a threshold cardinality representative of a maximum transform block size (e.g., 32×32), a number of the transform coefficients are discarded such that a remaining number of transform coefficients does not exceed the threshold cardinality. A transform block is then generated using the remaining number. During decoding, after determining that the transform coefficients exceed the threshold cardinality, a number of new coefficients are added to the transform coefficients such that a total number of transform coefficients exceeds the threshold cardinality. The transform coefficients are then inverse transformed into a prediction residual block having a large size.

Type: Application

Filed: September 4, 2020

Publication date: December 24, 2020

Inventors: Urvang Joshi, Debargha Mukherjee
Rate/distortion/RDcost modeling with machine learning

Patent number: 10848765

Abstract: A method for encoding a block of a video stream includes generating, using pixel values of the block, block features for the block; for each candidate encoding mode of candidate encoding modes, generating, using the block features and the each candidate encoding mode as inputs to a machine-learning module, a respective encoding cost; selecting, based on the respective encoding costs, a predetermined number of the candidate encoding modes; selecting, based on the respective encoding costs of the at least some encoding modes, a best mode for encoding the block; and encoding, in a compressed bitstream, the block using the best mode.

Type: Grant

Filed: February 4, 2019

Date of Patent: November 24, 2020

Assignee: GOOGLE LLC

Inventors: Urvang Joshi, Debargha Mukherjee, Hui Su
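
The two-stage selection in the abstract above (shortlist a predetermined number of modes by estimated cost, then pick a best mode from the shortlist) might be sketched as follows. This is one possible reading, not the claimed method: `cost_model` stands in for the machine-learning module, and `full_cost` is a hypothetical second evaluation applied only to the shortlisted modes.

```python
def select_best_mode(block_features, candidate_modes, cost_model, full_cost, keep=3):
    """Shortlist modes by estimated cost, then choose the best of the shortlist.

    cost_model(block_features, mode) -> estimated encoding cost (assumed ML module)
    full_cost(mode)                  -> cost used for the final choice (assumption)
    keep                             -> the "predetermined number" of modes retained
    """
    estimated = {mode: cost_model(block_features, mode) for mode in candidate_modes}
    shortlist = sorted(candidate_modes, key=estimated.get)[:keep]
    return min(shortlist, key=full_cost)
```

The point of the shortlist is that the cheaper estimate is run on every candidate, while the more expensive evaluation is run on only `keep` of them.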
Transforms for large video and image blocks

Patent number: 10771783

Abstract: Improved transforms are used to encode and decode large video and image blocks. During encoding, a prediction residual block having a large size (e.g., larger than 32×32) is generated. The pixel values of the prediction residual block are transformed to produce transform coefficients. After determining that the transform coefficients exceed a threshold cardinality representative of a maximum transform block size (e.g., 32×32), a number of the transform coefficients are discarded such that a remaining number of transform coefficients does not exceed the threshold cardinality. A transform block is then generated using the remaining number. During decoding, after determining that the transform coefficients exceed the threshold cardinality, a number of new coefficients are added to the transform coefficients such that a total number of transform coefficients exceeds the threshold cardinality. The transform coefficients are then inverse transformed into a prediction residual block having a large size.

Type: Grant

Filed: June 11, 2018

Date of Patent: September 8, 2020

Assignee: GOOGLE LLC

Inventors: Urvang Joshi, Debargha Mukherjee
