Patents by Inventor Lihua Zhu
Lihua Zhu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10469863
Abstract: Innovations in the area of prediction of block vector (“BV”) values improve encoding or decoding of blocks using intra block copy (“BC”) prediction. For example, some of the innovations relate to use of a default BV predictor with a non-zero value. Other innovations relate to use of a selected one of multiple BV predictor candidates for a current block. Still other innovations relate to use of a skip mode in which a current intra-BC-predicted block uses a predicted BV value.
Type: Grant
Filed: January 3, 2014
Date of Patent: November 5, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Lihua Zhu, Gary J. Sullivan, Jizheng Xu, Sridhar Sankuratri, B. Anil Kumar, Feng Wu
-
Publication number: 20190320207
Abstract: In an implementation, a supplemental sequence parameter set (“SPS”) structure is provided that has its own network abstraction layer (“NAL”) unit type and allows transmission of layer-dependent parameters for non-base layers in an SVC environment. The supplemental SPS structure also may be used for view information in an MVC environment. In a general aspect, a structure is provided that includes (1) information (1410) from an SPS NAL unit, the information describing a parameter for use in decoding a first-layer encoding of a sequence of images, and (2) information (1420) from a supplemental SPS NAL unit having a different structure than the SPS NAL unit, and the information from the supplemental SPS NAL unit describing a parameter for use in decoding a second-layer encoding of the sequence of images. Associated methods and apparatuses are provided on the encoder and decoder sides, as well as for the signal.
Type: Application
Filed: April 25, 2019
Publication date: October 17, 2019
Inventors: Lihua Zhu, Jiancong Luo, Peng Yin, Jiheng Yang
-
Publication number: 20190289310
Abstract: Innovations in syntax and semantics of coded picture buffer removal delay (“CPBRD”) values potentially simplify splicing operations. For example, a video encoder sets a CPBRD value for a current picture that indicates an increment value relative to a nominal coded picture buffer removal time of a preceding picture in decoding order, regardless of whether the preceding picture has a buffering period SEI message. The encoder can signal the CPBRD value according to a single-value approach in which a flag indicates how to interpret the CPBRD value, according to a two-value approach in which another CPBRD value (having a different interpretation) is also signaled, or according to a two-value approach that uses a flag and a delta value. A corresponding video decoder receives and parses the CPBRD value for the current picture. A splicing tool can perform simple concatenation operations to splice bitstreams using the CPBRD value for the current picture.
Type: Application
Filed: June 3, 2019
Publication date: September 19, 2019
Applicant: Microsoft Technology Licensing, LLC
Inventors: Gary J. Sullivan, Lihua Zhu
-
Patent number: 10390034
Abstract: Innovations in encoder-side options for intra block copy (“BC”) prediction mode facilitate intra BC prediction that is more effective in terms of rate-distortion performance and/or computational efficiency of encoding. For example, some of the innovations relate to estimation of sample values within an overlap area of a current block during block vector estimation. Other innovations relate to prediction of block vector (“BV”) values during encoding or decoding using “ping-pong” approaches.
Type: Grant
Filed: March 21, 2014
Date of Patent: August 20, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Lihua Zhu, Gary J. Sullivan, Yongjun Wu
-
Patent number: 10390039
Abstract: Innovations in motion estimation adapted for screen remoting scenarios are described herein. For example, as part of motion estimation for a current picture, a video encoder finds a pivot point in the current picture, calculates a hash value for the pivot point, and searches for a matching area in a previous picture. In doing so, the video encoder can calculate a hash index from the hash value and look up the hash index in a data structure to find candidate pivot points in the previous picture. The video encoder can compare the hash value for the pivot point in the current picture to a hash value for a candidate pivot point in the previous picture and, when the hash values match, compare sample values around the respective pivot points. In this way, the video encoder can quickly detect large areas of exact-match blocks having uniform motion.
Type: Grant
Filed: August 31, 2016
Date of Patent: August 20, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Lihua Zhu, B. Anil Kumar, Olof L. E. Mases
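As an illustration only, the lookup flow in this abstract (hash a pivot point, map the hash to an index, fetch candidates from the previous picture, then confirm on sample values) might be sketched as follows. The abstract does not define how a pivot point is chosen or hashed, so the four-sample horizontal window, the bucket count, and the names `window_hash`, `build_index`, and `find_match` are all assumptions made for this sketch, not the patented method.

```python
from collections import defaultdict

WIDTH = 4        # assumed: samples hashed per pivot point
BUCKETS = 1024   # assumed: size of the hash-index table


def window_hash(picture, x, y):
    """Hash the WIDTH samples starting at (x, y); a stand-in for the pivot hash."""
    return hash(tuple(picture[y][x:x + WIDTH]))


def build_index(picture):
    """Map hash index -> list of (x, y, full hash) candidates in a picture."""
    index = defaultdict(list)
    for y, row in enumerate(picture):
        for x in range(len(row) - WIDTH + 1):
            h = window_hash(picture, x, y)
            index[h % BUCKETS].append((x, y, h))
    return index


def find_match(current, previous, previous_index, x, y):
    """Return the displacement (dx, dy) to a matching area in `previous`, or None."""
    h = window_hash(current, x, y)
    for cx, cy, ch in previous_index.get(h % BUCKETS, []):
        # Compare full hash values first, then confirm on the samples themselves.
        if ch == h and previous[cy][cx:cx + WIDTH] == current[y][x:x + WIDTH]:
            return (cx - x, cy - y)
    return None


# Example: the current picture is the previous one shifted right by two samples.
previous = [[10 * r + c for c in range(8)] for r in range(4)]
current = [[0, 0] + row[:6] for row in previous]
index = build_index(previous)
motion = find_match(current, previous, index, 2, 1)
```

Checking the cheap hash before touching sample values keeps the expensive comparison to the few candidates that share a bucket, which is the speed-up the abstract attributes to the hash index.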
-
Patent number: 10313698
Abstract: Innovations in syntax and semantics of coded picture buffer removal delay (“CPBRD”) values potentially simplify splicing operations. For example, a video encoder sets a CPBRD value for a current picture that indicates an increment value relative to a nominal coded picture buffer removal time of a preceding picture in decoding order, regardless of whether the preceding picture has a buffering period SEI message. The encoder can signal the CPBRD value according to a single-value approach in which a flag indicates how to interpret the CPBRD value, according to a two-value approach in which another CPBRD value (having a different interpretation) is also signaled, or according to a two-value approach that uses a flag and a delta value. A corresponding video decoder receives and parses the CPBRD value for the current picture. A splicing tool can perform simple concatenation operations to splice bitstreams using the CPBRD value for the current picture.
Type: Grant
Filed: May 22, 2017
Date of Patent: June 4, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Gary J. Sullivan, Lihua Zhu
-
Publication number: 20190158881
Abstract: Syntax structures that indicate the completion of coded regions of pictures are described. For example, a syntax structure in an elementary bitstream indicates the completion of a coded region of a picture. The syntax structure can be a type of network abstraction layer unit, a type of supplemental enhancement information message or another syntax structure. For example, a media processing tool such as an encoder can detect completion of a coded region of a picture, then output, in a predefined order in an elementary bitstream, syntax structure(s) that contain the coded region as well as a different syntax structure that indicates the completion of the coded region. Another media processing tool such as a decoder can receive, in a predefined order in an elementary bitstream, syntax structure(s) that contain a coded region of a picture as well as a different syntax structure that indicates the completion of the coded region.
Type: Application
Filed: January 4, 2019
Publication date: May 23, 2019
Applicant: Microsoft Technology Licensing, LLC
Inventors: Yongjun Wu, Lihua Zhu, Shyam Sadhwani, Gary J. Sullivan
-
Patent number: 10237566
Abstract: A GPU loads point sprites that represent coded blocks of transform coefficients of one or more frames encoded in a bitstream and loads a transform kernel as a transform kernel texture. The GPU constructs an output frame using an inverse transform on the coded blocks of transform coefficients by transforming the point sprites with the transform kernel texture and by optionally dequantizing the point sprites. A single render pass may be used in which the rasterization formula performs the inverse transform and optionally dequantization. To preserve bandwidth, a CPU may refrain from sending the GPU at least some zero valued transform coefficients for the point sprites. Also, to reduce processing, the transform coefficients can remain in a zig-zag arrangement. The transform kernel texture used in the decoding can correspond to a modified version of the basis matrices used to encode the frame, which compensates for the zig-zag arrangement.
Type: Grant
Filed: April 1, 2016
Date of Patent: March 19, 2019
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Lihua Zhu, Guosheng Sun, B Anil Kumar, Shir Aharon
-
Publication number: 20190075310
Abstract: There are provided methods and apparatus for video usability information (VUI) for scalable video coding (SVC). An apparatus includes an encoder (100) for encoding video signal data into a bitstream. The encoder specifies video user information, excluding hypothetical reference decoder parameters, in the bitstream using a high level syntax element. The video user information corresponds to a set of interoperability points in the bitstream relating to scalable video coding (340, 355).
Type: Application
Filed: November 5, 2018
Publication date: March 7, 2019
Inventors: Jiancong Luo, Peng Yin, Lihua Zhu
-
Patent number: 10205966
Abstract: Syntax structures that indicate the completion of coded regions of pictures are described. For example, a syntax structure in an elementary bitstream indicates the completion of a coded region of a picture. The syntax structure can be a type of network abstraction layer unit, a type of supplemental enhancement information message or another syntax structure. For example, a media processing tool such as an encoder can detect completion of a coded region of a picture, then output, in a predefined order in an elementary bitstream, syntax structure(s) that contain the coded region as well as a different syntax structure that indicates the completion of the coded region. Another media processing tool such as a decoder can receive, in a predefined order in an elementary bitstream, syntax structure(s) that contain a coded region of a picture as well as a different syntax structure that indicates the completion of the coded region.
Type: Grant
Filed: September 22, 2017
Date of Patent: February 12, 2019
Assignee: Microsoft Technology Licensing, LLC
Inventors: Yongjun Wu, Lihua Zhu, Shyam Sadhwani, Gary J. Sullivan
-
Patent number: 10157480
Abstract: Innovations in video decoding and rendering operations for inter-coded blocks in a graphics pipeline, in which at least some of the operations are performed using a graphics processing unit (“GPU”), are described. For example, a video playback tool receives encoded data for a current picture and performs operations to decode the encoded data and reconstruct the current picture. For a given inter-coded block of the current picture, a graphics primitive represents texture values as a point for processing by the GPU. The graphics primitive can have one or more attributes, including a motion vector, a block size, a display index value (indicating a location in a display buffer), and/or a residual index value (indicating a location of residual values). The operations performed by the video playback tool can include interpolation of sample values at fractional-sample offsets and motion compensation performed for inter-coded blocks in multiple passes for different block sizes.
Type: Grant
Filed: June 24, 2016
Date of Patent: December 18, 2018
Assignee: Microsoft Technology Licensing, LLC
Inventors: Lihua Zhu, B. Anil Kumar, Guosheng Sun, Olof L. E. Mases
-
Patent number: 10154272
Abstract: There are provided methods and apparatus for video usability information (VUI) for scalable video coding (SVC). An apparatus includes an encoder (100) for encoding video signal data into a bitstream. The encoder specifies video user information, excluding hypothetical reference decoder parameters, in the bitstream using a high level syntax element. The video user information corresponds to a set of interoperability points in the bitstream relating to scalable video coding (340, 355).
Type: Grant
Filed: October 9, 2017
Date of Patent: December 11, 2018
Assignee: InterDigital VC Holdings Inc.
Inventors: Jiancong Luo, Peng Yin, Lihua Zhu
-
Patent number: 10091504
Abstract: Variations of rho-domain rate control for video encoding or other media encoding are presented. For example, in some of the variations, an encoder sets a rho value for a unit of media based at least in part on a bit allocation for the unit. The encoder also computes transform coefficients for the unit using a frequency transform having multiple location-dependent scale factors, sets a value of quantization parameter (“QP”) for the unit using a mapping of QP values to rho values, and uses the value of QP for the unit during quantization of the transform coefficients of the unit. When the QP-rho mapping is determined, a location-independent scale factor that approximates the multiple location-dependent scale factors is used and/or certain scaling operations are integrated, which reduces computational complexity while still supporting accurate rate control decisions. Implementations of such variations of rate control can exploit opportunities for caching and parallel computation.
Type: Grant
Filed: January 8, 2015
Date of Patent: October 2, 2018
Assignee: Microsoft Technology Licensing, LLC
Inventors: Lihua Zhu, Shir Aharon, B. Anil Kumar, Sridhar Sankuratri, Jeroen E. van Eesteren, Costin Hagiu
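For readers unfamiliar with rho-domain rate control, the two steps named in this abstract (derive a target rho from a bit allocation, then pick a QP through a QP-to-rho mapping) can be sketched roughly as below. Here rho is the fraction of transform coefficients that quantize to zero. The linear bit model, the `theta` parameter, the step-size formula, and the simple threshold test are assumptions for illustration; the sketch does not model the location-dependent scale factors or the approximations the abstract refers to.

```python
def step_for_qp(qp):
    """Assumed step size that doubles every 6 QP values (H.264-style)."""
    return 0.625 * 2 ** (qp / 6)


def rho_for_qp(coeffs, qp):
    """Fraction of coefficients that would quantize to zero at this QP.

    Assumption: a coefficient becomes zero when its magnitude is below the step.
    """
    step = step_for_qp(qp)
    zeros = sum(1 for c in coeffs if abs(c) < step)
    return zeros / len(coeffs)


def target_rho(bit_allocation, theta, n_coeffs):
    """Assumed linear model: bits = theta * n_coeffs * (1 - rho)."""
    rho = 1.0 - bit_allocation / (theta * n_coeffs)
    return min(1.0, max(0.0, rho))


def choose_qp(coeffs, bit_allocation, theta, qp_values=range(0, 52)):
    """Pick the lowest QP whose rho meets the target for the bit allocation."""
    target = target_rho(bit_allocation, theta, len(coeffs))
    for qp in qp_values:
        if rho_for_qp(coeffs, qp) >= target:
            return qp
    return qp_values[-1]


# Example: 8 coefficients, 20 bits allocated, 5 bits per non-zero coefficient.
coeffs = [0, 1, 2, 4, 8, 16, 32, 64]
qp = choose_qp(coeffs, bit_allocation=20, theta=5)
```

Because rho rises monotonically with QP in this model, the QP-to-rho table could be computed once per unit and cached, which is in the spirit of the caching opportunity the abstract mentions.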
-
Publication number: 20180234686
Abstract: Video frames of a higher-resolution chroma sampling format such as YUV 4:4:4 are packed into video frames of a lower-resolution chroma sampling format such as YUV 4:2:0 for purposes of video encoding. For example, sample values for a frame in YUV 4:4:4 format are packed into two frames in YUV 4:2:0 format. After decoding, the video frames of the lower-resolution chroma sampling format can be unpacked to reconstruct the video frames of the higher-resolution chroma sampling format. In this way, available encoders and decoders operating at the lower-resolution chroma sampling format can be used, while still retaining higher resolution chroma information. In example implementations, frames in YUV 4:4:4 format are packed into frames in YUV 4:2:0 format such that geometric correspondence is maintained between Y, U and V components for the frames in YUV 4:2:0 format.
Type: Application
Filed: April 17, 2018
Publication date: August 16, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventors: Lihua Zhu, Sridhar Sankuratri, B. Anil Kumar, Yongjun Wu, Sandeep Kanumuri, Shyam Sadhwani, Gary J. Sullivan
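The sample counts make the two-frame packing plausible: a W×H frame in 4:4:4 carries 3·W·H samples, and each 4:2:0 frame carries 1.5·W·H. A minimal sketch of one possible layout follows. The particular assignment of chroma samples to the auxiliary frame, the dictionary representation, and the function names are assumptions for illustration and are not taken from the patent text; even frame dimensions are also assumed.

```python
def pack_444_to_two_420(y, u, v):
    """Split full-resolution Y, U, V planes into a main and an auxiliary 4:2:0 frame."""
    main = {
        "Y": [row[:] for row in y],                # full luma
        "U": [row[0::2] for row in u[0::2]],       # even rows, even columns of U
        "V": [row[0::2] for row in v[0::2]],       # even rows, even columns of V
    }
    aux = {
        # Odd rows of U stacked above odd rows of V fill the auxiliary luma plane.
        "Y": [row[:] for row in u[1::2]] + [row[:] for row in v[1::2]],
        "U": [row[1::2] for row in u[0::2]],       # even rows, odd columns of U
        "V": [row[1::2] for row in v[0::2]],       # even rows, odd columns of V
    }
    return main, aux


def unpack_two_420_to_444(main, aux):
    """Inverse of pack_444_to_two_420: rebuild the full-resolution planes."""
    height, width = len(main["Y"]), len(main["Y"][0])
    half = height // 2
    u = [[0] * width for _ in range(height)]
    v = [[0] * width for _ in range(height)]
    for i in range(half):
        u[2 * i][0::2] = main["U"][i]
        u[2 * i][1::2] = aux["U"][i]
        v[2 * i][0::2] = main["V"][i]
        v[2 * i][1::2] = aux["V"][i]
        u[2 * i + 1] = aux["Y"][i][:]
        v[2 * i + 1] = aux["Y"][half + i][:]
    return [row[:] for row in main["Y"]], u, v


# Example: a 4x4 frame with distinct values in each plane survives a round trip.
y = [[r * 4 + c for c in range(4)] for r in range(4)]
u = [[100 + r * 4 + c for c in range(4)] for r in range(4)]
v = [[200 + r * 4 + c for c in range(4)] for r in range(4)]
main, aux = pack_444_to_two_420(y, u, v)
restored = unpack_two_420_to_444(main, aux)
```

In this layout the main frame is an ordinary 4:2:0 picture, so a decoder that ignores the auxiliary frame would still get a viewable result at reduced chroma resolution.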
-
Patent number: 10044974
Abstract: Innovations in encoding of video pictures in a high-resolution chroma sampling format (such as YUV 4:4:4) using a video encoder operating on coded pictures in a low-resolution chroma sampling format (such as YUV 4:2:0) are presented. For example, according to a set of decision rules, high chroma resolution details are selectively encoded on a region-by-region basis such that increases in bit rate (due to encoding of sample values for the high chroma resolution details) happen when and where corresponding increases in chroma resolution are likely to improve quality in noticeable ways. In this way, available encoders operating on coded pictures in the low-resolution chroma sampling format can be effectively used to provide high chroma resolution details.
Type: Grant
Filed: November 27, 2017
Date of Patent: August 7, 2018
Assignee: Microsoft Technology Licensing, LLC
Inventors: Shir Aharon, Lihua Zhu, B. Anil Kumar, Jeroen E. van Eesteren
-
Publication number: 20180152699
Abstract: Innovations in motion estimation adapted for screen remoting scenarios are described. For example, a video encoder calculates a hash value for a current block in a current picture. The video encoder searches, subject to a spatial constraint, for a matching block in a reference picture (e.g., the previous picture in display order) based at least in part on the hash value for the current block. The spatial constraint defines a search area in the reference picture within which hash values for candidate blocks in the reference picture may be compared to the hash value for the current block. By using a spatial constraint to limit the range of the local hash-based motion estimation, the video encoder can speed up the motion estimation process while still considering the candidate blocks in the reference picture that are most likely to match the current block.
Type: Application
Filed: November 30, 2016
Publication date: May 31, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventors: B. Anil Kumar, Winston M. Johnston, Olof L.E. Mases, Shir Aharon, Lihua Zhu
-
Patent number: 9979960
Abstract: Video frames of a higher-resolution chroma sampling format such as YUV 4:4:4 are packed into video frames of a lower-resolution chroma sampling format such as YUV 4:2:0 for purposes of video encoding. For example, sample values for a frame in YUV 4:4:4 format are packed into two frames in YUV 4:2:0 format. After decoding, the video frames of the lower-resolution chroma sampling format can be unpacked to reconstruct the video frames of the higher-resolution chroma sampling format. In this way, available encoders and decoders operating at the lower-resolution chroma sampling format can be used, while still retaining higher resolution chroma information. In example implementations, frames in YUV 4:4:4 format are packed into frames in YUV 4:2:0 format such that geometric correspondence is maintained between Y, U and V components for the frames in YUV 4:2:0 format.
Type: Grant
Filed: September 13, 2013
Date of Patent: May 22, 2018
Assignee: Microsoft Technology Licensing, LLC
Inventors: Lihua Zhu, Sridhar Sankuratri, B. Anil Kumar, Yongjun Wu, Sandeep Kanumuri, Shyam Sadhwani, Gary J. Sullivan
-
Publication number: 20180103261
Abstract: Innovations in video playback using a browser-based video decoder are described. In a computer system that includes multiple central processing units (“CPUs”), a browser-based video decoder performs operations with multiple threads that may execute simultaneously on different CPUs. The video decoder can perform decoding operations in parallel for different sections of a picture. For example, with a main CPU thread associated with a browser, the video decoder performs a first decoding workload (e.g., bitstream parsing) for a picture. With auxiliary CPU threads associated with Web workers and simultaneously executing on different CPUs, the video decoder performs a second decoding workload (e.g., entropy decoding, decoding of side information) for different sections of the picture, one section per auxiliary CPU thread. If the computer system also includes a graphics processing unit (“GPU”), the video decoder can perform additional decoding workloads with shader routines executable on the GPU.
Type: Application
Filed: October 7, 2016
Publication date: April 12, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventors: Jingyaw Sun, Winston M.P. Johnston, Jayashree Sadagopan, Lihua Zhu, Michael E. Seydl, Olof L.E. Mases, B. Anil Kumar
-
Publication number: 20180091764
Abstract: Innovations in encoding of video pictures in a high-resolution chroma sampling format (such as YUV 4:4:4) using a video encoder operating on coded pictures in a low-resolution chroma sampling format (such as YUV 4:2:0) are presented. For example, according to a set of decision rules, high chroma resolution details are selectively encoded on a region-by-region basis such that increases in bit rate (due to encoding of sample values for the high chroma resolution details) happen when and where corresponding increases in chroma resolution are likely to improve quality in noticeable ways. In this way, available encoders operating on coded pictures in the low-resolution chroma sampling format can be effectively used to provide high chroma resolution details.
Type: Application
Filed: November 27, 2017
Publication date: March 29, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventors: Shir Aharon, Lihua Zhu, B. Anil Kumar, Jeroen E. van Eesteren
-
Publication number: 20180063540
Abstract: Innovations in motion estimation adapted for screen remoting scenarios are described herein. For example, as part of motion estimation for a current picture, a video encoder finds a pivot point in the current picture, calculates a hash value for the pivot point, and searches for a matching area in a previous picture. In doing so, the video encoder can calculate a hash index from the hash value and look up the hash index in a data structure to find candidate pivot points in the previous picture. The video encoder can compare the hash value for the pivot point in the current picture to a hash value for a candidate pivot point in the previous picture and, when the hash values match, compare sample values around the respective pivot points. In this way, the video encoder can quickly detect large areas of exact-match blocks having uniform motion.
Type: Application
Filed: August 31, 2016
Publication date: March 1, 2018
Applicant: Microsoft Technology Licensing, LLC
Inventors: Lihua Zhu, B. Anil Kumar, Olof L.E. Mases