Patents by Inventor Lihua Zhu

Lihua Zhu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10469863
    Abstract: Innovations in the area of prediction of block vector (“BV”) values improve encoding or decoding of blocks using intra block copy (“BC”) prediction. For example, some of the innovations relate to use of a default BV predictor with a non-zero value. Other innovations relate to use of a selected one of multiple BV predictor candidates for a current block. Still other innovations relate to use of a skip mode in which a current intra-BC-predicted block uses a predicted BV value.
    Type: Grant
    Filed: January 3, 2014
    Date of Patent: November 5, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lihua Zhu, Gary J. Sullivan, Jizheng Xu, Sridhar Sankuratri, B. Anil Kumar, Feng Wu
  • Publication number: 20190320207
    Abstract: In an implementation, a supplemental sequence parameter set (“SPS”) structure is provided that has its own network abstraction layer (“NAL”) unit type and allows transmission of layer-dependent parameters for non-base layers in an SVC environment. The supplemental SPS structure also may be used for view information in an MVC environment. In a general aspect, a structure is provided that includes (1) information (1410) from an SPS NAL unit, the information describing a parameter for use in decoding a first-layer encoding of a sequence of images, and (2) information (1420) from a supplemental SPS NAL unit having a different structure than the SPS NAL unit, and the information from the supplemental SPS NAL unit describing a parameter for use in decoding a second-layer encoding of the sequence of images. Associated methods and apparatuses are provided on the encoder and decoder sides, as well as for the signal.
    Type: Application
    Filed: April 25, 2019
    Publication date: October 17, 2019
    Inventors: Lihua Zhu, Jiancong Luo, Peng Yin, Jiheng Yang
  • Publication number: 20190289310
    Abstract: Innovations in syntax and semantics of coded picture buffer removal delay (“CPBRD”) values potentially simplify splicing operations. For example, a video encoder sets a CPBRD value for a current picture that indicates an increment value relative to a nominal coded picture buffer removal time of a preceding picture in decoding order, regardless of whether the preceding picture has a buffering period SEI message. The encoder can signal the CPBRD value according to a single-value approach in which a flag indicates how to interpret the CPBRD value, according to a two-value approach in which another CPBRD value (having a different interpretation) is also signaled, or according to a two-value approach that uses a flag and a delta value. A corresponding video decoder receives and parses the CPBRD value for the current picture. A splicing tool can perform simple concatenation operations to splice bitstreams using the CPBRD value for the current picture.
    Type: Application
    Filed: June 3, 2019
    Publication date: September 19, 2019
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Gary J. Sullivan, Lihua Zhu
  • Patent number: 10390034
    Abstract: Innovations in encoder-side options for intra block copy (“BC”) prediction mode facilitate intra BC prediction that is more effective in terms of rate-distortion performance and/or computational efficiency of encoding. For example, some of the innovations relate to estimation of sample values within an overlap area of a current block during block vector estimation. Other innovations relate to prediction of block vector (“BV”) values during encoding or decoding using “ping-pong” approaches.
    Type: Grant
    Filed: March 21, 2014
    Date of Patent: August 20, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lihua Zhu, Gary J. Sullivan, Yongjun Wu
  • Patent number: 10390039
    Abstract: Innovations in motion estimation adapted for screen remoting scenarios are described herein. For example, as part of motion estimation for a current picture, a video encoder finds a pivot point in the current picture, calculates a hash value for the pivot point, and searches for a matching area in a previous picture. In doing so, the video encoder can calculate a hash index from the hash value and look up the hash index in a data structure to find candidate pivot points in the previous picture. The video encoder can compare the hash value for the pivot point in the current picture to a hash value for a candidate pivot point in the previous picture and, when the hash values match, compare sample values around the respective pivot points. In this way, the video encoder can quickly detect large areas of exact-match blocks having uniform motion.
    Type: Grant
    Filed: August 31, 2016
    Date of Patent: August 20, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lihua Zhu, B. Anil Kumar, Olof L. E. Mases
  • Patent number: 10313698
    Abstract: Innovations in syntax and semantics of coded picture buffer removal delay (“CPBRD”) values potentially simplify splicing operations. For example, a video encoder sets a CPBRD value for a current picture that indicates an increment value relative to a nominal coded picture buffer removal time of a preceding picture in decoding order, regardless of whether the preceding picture has a buffering period SEI message. The encoder can signal the CPBRD value according to a single-value approach in which a flag indicates how to interpret the CPBRD value, according to a two-value approach in which another CPBRD value (having a different interpretation) is also signaled, or according to a two-value approach that uses a flag and a delta value. A corresponding video decoder receives and parses the CPBRD value for the current picture. A splicing tool can perform simple concatenation operations to splice bitstreams using the CPBRD value for the current picture.
    Type: Grant
    Filed: May 22, 2017
    Date of Patent: June 4, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Gary J. Sullivan, Lihua Zhu
  • Publication number: 20190158881
    Abstract: Syntax structures that indicate the completion of coded regions of pictures are described. For example, a syntax structure in an elementary bitstream indicates the completion of a coded region of a picture. The syntax structure can be a type of network abstraction layer unit, a type of supplemental enhancement information message or another syntax structure. For example, a media processing tool such as an encoder can detect completion of a coded region of a picture, then output, in a predefined order in an elementary bitstream, syntax structure(s) that contain the coded region as well as a different syntax structure that indicates the completion of the coded region. Another media processing tool such as a decoder can receive, in a predefined order in an elementary bitstream, syntax structure(s) that contain a coded region of a picture as well as a different syntax structure that indicates the completion of the coded region.
    Type: Application
    Filed: January 4, 2019
    Publication date: May 23, 2019
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Yongjun Wu, Lihua Zhu, Shyam Sadhwani, Gary J. Sullivan
  • Patent number: 10237566
    Abstract: A GPU loads point sprites that represent coded blocks of transform coefficients of one or more frames encoded in a bitstream and loads a transform kernel as a transform kernel texture. The GPU constructs an output frame using an inverse transform on the coded blocks of transform coefficients by transforming the point sprites with the transform kernel texture and by optionally dequantizing the point sprites. A single render pass may be used in which the rasterization formula performs the inverse transform and optionally dequantization. To preserve bandwidth, a CPU may refrain from sending the GPU at least some zero valued transform coefficients for the point sprites. Also, to reduce processing, the transform coefficients can remain in a zig-zag arrangement. The transform kernel texture used in the decoding can correspond to a modified version of the basis matrices used to encode the frame, which compensates for the zig-zag arrangement.
    Type: Grant
    Filed: April 1, 2016
    Date of Patent: March 19, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lihua Zhu, Guosheng Sun, B. Anil Kumar, Shir Aharon
  • Publication number: 20190075310
    Abstract: There are provided methods and apparatus for video usability information (VUI) for scalable video coding (SVC). An apparatus includes an encoder (100) for encoding video signal data into a bitstream. The encoder specifies video user information, excluding hypothetical reference decoder parameters, in the bitstream using a high level syntax element. The video user information corresponds to a set of interoperability points in the bitstream relating to scalable video coding (340, 355).
    Type: Application
    Filed: November 5, 2018
    Publication date: March 7, 2019
    Inventors: Jiancong Luo, Peng Yin, Lihua Zhu
  • Patent number: 10205966
    Abstract: Syntax structures that indicate the completion of coded regions of pictures are described. For example, a syntax structure in an elementary bitstream indicates the completion of a coded region of a picture. The syntax structure can be a type of network abstraction layer unit, a type of supplemental enhancement information message or another syntax structure. For example, a media processing tool such as an encoder can detect completion of a coded region of a picture, then output, in a predefined order in an elementary bitstream, syntax structure(s) that contain the coded region as well as a different syntax structure that indicates the completion of the coded region. Another media processing tool such as a decoder can receive, in a predefined order in an elementary bitstream, syntax structure(s) that contain a coded region of a picture as well as a different syntax structure that indicates the completion of the coded region.
    Type: Grant
    Filed: September 22, 2017
    Date of Patent: February 12, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yongjun Wu, Lihua Zhu, Shyam Sadhwani, Gary J. Sullivan
  • Patent number: 10157480
    Abstract: Innovations in video decoding and rendering operations for inter-coded blocks in a graphics pipeline, in which at least some of the operations are performed using a graphics processing unit (“GPU”), are described. For example, a video playback tool receives encoded data for a current picture and performs operations to decode the encoded data and reconstruct the current picture. For a given inter-coded block of the current picture, a graphics primitive represents texture values as a point for processing by the GPU. The graphics primitive can have one or more attributes, including a motion vector, a block size, a display index value (indicating a location in a display buffer), and/or a residual index value (indicating a location of residual values). The operations performed by the video playback tool can include interpolation of sample values at fractional-sample offsets and motion compensation performed for inter-coded blocks in multiple passes for different block sizes.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: December 18, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lihua Zhu, B. Anil Kumar, Guosheng Sun, Olof L. E. Mases
  • Patent number: 10154272
    Abstract: There are provided methods and apparatus for video usability information (VUI) for scalable video coding (SVC). An apparatus includes an encoder (100) for encoding video signal data into a bitstream. The encoder specifies video user information, excluding hypothetical reference decoder parameters, in the bitstream using a high level syntax element. The video user information corresponds to a set of interoperability points in the bitstream relating to scalable video coding (340, 355).
    Type: Grant
    Filed: October 9, 2017
    Date of Patent: December 11, 2018
    Assignee: InterDigital VC Holdings Inc.
    Inventors: Jiancong Luo, Peng Yin, Lihua Zhu
  • Patent number: 10091504
    Abstract: Variations of rho-domain rate control for video encoding or other media encoding are presented. For example, in some of the variations, an encoder sets a rho value for a unit of media based at least in part on a bit allocation for the unit. The encoder also computes transform coefficients for the unit using a frequency transform having multiple location-dependent scale factors, sets a value of quantization parameter (“QP”) for the unit using a mapping of QP values to rho values, and uses the value of QP for the unit during quantization of the transform coefficients of the unit. When the QP-rho mapping is determined, a location-independent scale factor that approximates the multiple location-dependent scale factors is used and/or certain scaling operations are integrated, which reduces computational complexity while still supporting accurate rate control decisions. Implementations of such variations of rate control can exploit opportunities for caching and parallel computation.
    Type: Grant
    Filed: January 8, 2015
    Date of Patent: October 2, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lihua Zhu, Shir Aharon, B. Anil Kumar, Sridhar Sankuratri, Jeroen E. van Eesteren, Costin Hagiu
  • Publication number: 20180234686
    Abstract: Video frames of a higher-resolution chroma sampling format such as YUV 4:4:4 are packed into video frames of a lower-resolution chroma sampling format such as YUV 4:2:0 for purposes of video encoding. For example, sample values for a frame in YUV 4:4:4 format are packed into two frames in YUV 4:2:0 format. After decoding, the video frames of the lower-resolution chroma sampling format can be unpacked to reconstruct the video frames of the higher-resolution chroma sampling format. In this way, available encoders and decoders operating at the lower-resolution chroma sampling format can be used, while still retaining higher resolution chroma information. In example implementations, frames in YUV 4:4:4 format are packed into frames in YUV 4:2:0 format such that geometric correspondence is maintained between Y, U and V components for the frames in YUV 4:2:0 format.
    Type: Application
    Filed: April 17, 2018
    Publication date: August 16, 2018
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Lihua Zhu, Sridhar Sankuratri, B. Anil Kumar, Yongjun Wu, Sandeep Kanumuri, Shyam Sadhwani, Gary J. Sullivan
  • Patent number: 10044974
    Abstract: Innovations in encoding of video pictures in a high-resolution chroma sampling format (such as YUV 4:4:4) using a video encoder operating on coded pictures in a low-resolution chroma sampling format (such as YUV 4:2:0) are presented. For example, according to a set of decision rules, high chroma resolution details are selectively encoded on a region-by-region basis such that increases in bit rate (due to encoding of sample values for the high chroma resolution details) happen when and where corresponding increases in chroma resolution are likely to improve quality in noticeable ways. In this way, available encoders operating on coded pictures in the low-resolution chroma sampling format can be effectively used to provide high chroma resolution details.
    Type: Grant
    Filed: November 27, 2017
    Date of Patent: August 7, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shir Aharon, Lihua Zhu, B. Anil Kumar, Jeroen E. van Eesteren
  • Publication number: 20180152699
    Abstract: Innovations in motion estimation adapted for screen remoting scenarios are described. For example, a video encoder calculates a hash value for a current block in a current picture. The video encoder searches, subject to a spatial constraint, for a matching block in a reference picture (e.g., the previous picture in display order) based at least in part on the hash value for the current block. The spatial constraint defines a search area in the reference picture within which hash values for candidate blocks in the reference picture may be compared to the hash value for the current block. By using a spatial constraint to limit the range of the local hash-based motion estimation, the video encoder can speed up the motion estimation process while still considering the candidate blocks in the reference picture that are most likely to match the current block.
    Type: Application
    Filed: November 30, 2016
    Publication date: May 31, 2018
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: B. Anil Kumar, Winston M. Johnston, Olof L. E. Mases, Shir Aharon, Lihua Zhu
  • Patent number: 9979960
    Abstract: Video frames of a higher-resolution chroma sampling format such as YUV 4:4:4 are packed into video frames of a lower-resolution chroma sampling format such as YUV 4:2:0 for purposes of video encoding. For example, sample values for a frame in YUV 4:4:4 format are packed into two frames in YUV 4:2:0 format. After decoding, the video frames of the lower-resolution chroma sampling format can be unpacked to reconstruct the video frames of the higher-resolution chroma sampling format. In this way, available encoders and decoders operating at the lower-resolution chroma sampling format can be used, while still retaining higher resolution chroma information. In example implementations, frames in YUV 4:4:4 format are packed into frames in YUV 4:2:0 format such that geometric correspondence is maintained between Y, U and V components for the frames in YUV 4:2:0 format.
    Type: Grant
    Filed: September 13, 2013
    Date of Patent: May 22, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lihua Zhu, Sridhar Sankuratri, B. Anil Kumar, Yongjun Wu, Sandeep Kanumuri, Shyam Sadhwani, Gary J. Sullivan
  • Publication number: 20180103261
    Abstract: Innovations in video playback using a browser-based video decoder are described. In a computer system that includes multiple central processing units (“CPUs”), a browser-based video decoder performs operations with multiple threads that may execute simultaneously on different CPUs. The video decoder can perform decoding operations in parallel for different sections of a picture. For example, with a main CPU thread associated with a browser, the video decoder performs a first decoding workload (e.g., bitstream parsing) for a picture. With auxiliary CPU threads associated with Web workers and simultaneously executing on different CPUs, the video decoder performs a second decoding workload (e.g., entropy decoding, decoding of side information) for different sections of the picture, one section per auxiliary CPU thread. If the computer system also includes a graphics processing unit (“GPU”), the video decoder can perform additional decoding workloads with shader routines executable on the GPU.
    Type: Application
    Filed: October 7, 2016
    Publication date: April 12, 2018
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Jingyaw Sun, Winston M.P. Johnston, Jayashree Sadagopan, Lihua Zhu, Michael E. Seydl, Olof L. E. Mases, B. Anil Kumar
  • Publication number: 20180091764
    Abstract: Innovations in encoding of video pictures in a high-resolution chroma sampling format (such as YUV 4:4:4) using a video encoder operating on coded pictures in a low-resolution chroma sampling format (such as YUV 4:2:0) are presented. For example, according to a set of decision rules, high chroma resolution details are selectively encoded on a region-by-region basis such that increases in bit rate (due to encoding of sample values for the high chroma resolution details) happen when and where corresponding increases in chroma resolution are likely to improve quality in noticeable ways. In this way, available encoders operating on coded pictures in the low-resolution chroma sampling format can be effectively used to provide high chroma resolution details.
    Type: Application
    Filed: November 27, 2017
    Publication date: March 29, 2018
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Shir Aharon, Lihua Zhu, B. Anil Kumar, Jeroen E. van Eesteren
  • Publication number: 20180063540
    Abstract: Innovations in motion estimation adapted for screen remoting scenarios are described herein. For example, as part of motion estimation for a current picture, a video encoder finds a pivot point in the current picture, calculates a hash value for the pivot point, and searches for a matching area in a previous picture. In doing so, the video encoder can calculate a hash index from the hash value and look up the hash index in a data structure to find candidate pivot points in the previous picture. The video encoder can compare the hash value for the pivot point in the current picture to a hash value for a candidate pivot point in the previous picture and, when the hash values match, compare sample values around the respective pivot points. In this way, the video encoder can quickly detect large areas of exact-match blocks having uniform motion.
    Type: Application
    Filed: August 31, 2016
    Publication date: March 1, 2018
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Lihua Zhu, B. Anil Kumar, Olof L. E. Mases
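
The hash-based motion estimation described in the abstracts for 10390039 and 20180063540 can be illustrated with a rough sketch. This is not the patented method, only a minimal illustration under stated assumptions: pictures are lists of rows of 8-bit sample values, every window position is treated as a candidate (the abstracts describe selecting specific pivot points, which is not modeled here), CRC32 stands in for an unspecified hash function, and the names window_hash, build_hash_table, and find_match are hypothetical. The sketch keeps the three steps the abstract names: derive a hash index from the hash value, look the index up in a table built from the previous picture, and compare sample values only when the full hash values match.

```python
import zlib
from collections import defaultdict


def window_hash(picture, x, y, size):
    """CRC32 of the size-by-size window whose top-left corner is (x, y).

    Assumes sample values are integers in the range 0..255.
    """
    rows = (bytes(picture[y + dy][x:x + size]) for dy in range(size))
    return zlib.crc32(b"".join(rows))


def build_hash_table(previous, size, index_bits=12):
    """Map hash index -> list of (hash value, x, y) for the previous picture."""
    table = defaultdict(list)
    height, width = len(previous), len(previous[0])
    mask = (1 << index_bits) - 1  # hash index = low bits of the hash value
    for y in range(height - size + 1):
        for x in range(width - size + 1):
            h = window_hash(previous, x, y, size)
            table[h & mask].append((h, x, y))
    return table


def find_match(current, previous, table, x, y, size, index_bits=12):
    """Return the (dx, dy) offset to an exact match in the previous picture.

    Returns None when no candidate window matches sample for sample.
    """
    h = window_hash(current, x, y, size)
    mask = (1 << index_bits) - 1
    for cand_hash, px, py in table.get(h & mask, []):
        if cand_hash != h:
            continue  # same index, different hash value: skip
        # Hash values match, so confirm by comparing the sample values.
        if all(
            current[y + dy][x:x + size] == previous[py + dy][px:px + size]
            for dy in range(size)
        ):
            return (px - x, py - y)
    return None
```

Building the table once per previous picture and then calling find_match for each window of interest in the current picture keeps the expensive sample-by-sample comparison limited to candidates whose hash values already agree. The index_bits value passed to find_match has to equal the one used to build the table.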