Patents by Inventor Xiaojin Shi

Xiaojin Shi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9811721
    Abstract: In the field of human-computer interaction (HCI), i.e., the study of the interfaces between people (i.e., users) and computers, understanding the intentions and desires of how the user wishes to interact with the computer is a very important problem. The ability to understand human gestures, and, in particular, hand gestures, as they relate to HCI, is a very important aspect in understanding the intentions and desires of the user in a wide variety of applications. In this disclosure, a novel system and method for three-dimensional hand tracking using depth sequences is described. Some of the major contributions of the hand tracking system described herein include: 1.) a robust hand detector that is invariant to scene background changes; 2.) a bi-directional tracking algorithm that prevents detected hands from always drifting closer to the front of the scene (i.e., forward along the z-axis of the scene); and 3.) various hand verification heuristics.
    Type: Grant
    Filed: May 7, 2015
    Date of Patent: November 7, 2017
    Assignee: Apple Inc.
    Inventors: Feng Tang, Ang Li, Xiaojin Shi
  • Publication number: 20170090584
    Abstract: Varying embodiments of intelligent systems are disclosed that respond to user intent and desires based upon activity that may or may not be expressly directed at the intelligent system. In some embodiments, the intelligent system acquires a depth image of a scene surrounding the system. A scene geometry may be extracted from the depth image, and elements of the scene, such as walls, furniture, and humans, may be evaluated and monitored. In certain embodiments, user activity in the scene is monitored and analyzed to infer user desires or intent with respect to the system. The interpretation of the user's intent or desire as well as the system's response may be affected by the scene geometry surrounding the user and/or the system. In some embodiments, techniques and systems are disclosed for interpreting express user communication, for example, expressed through fine hand gesture movements.
    Type: Application
    Filed: September 25, 2015
    Publication date: March 30, 2017
    Inventors: Feng Tang, Chong Chen, Haitao Guo, Xiaojin Shi, Thorsten Gernoth
  • Patent number: 9380312
    Abstract: A block input component of a video encoding pipeline may, for a block of pixels in a video frame, compute gradients in multiple directions, and may accumulate counts of the computed gradients in one or more histograms. The block input component may analyze the histogram(s) to compute block-level statistics and determine whether a dominant gradient direction exists in the block, indicating the likelihood that it represents an image containing text. If text is likely, various encoding parameter values may be selected to improve the quality of encoding for the block (e.g., by lowering a quantization parameter value). The computed statistics or selected encoding parameter values may be passed to other stages of the pipeline, and used to bias or control selection of a prediction mode, an encoding mode, or a motion vector. Frame-level or slice-level parameter values may be generated from gradient histograms of multiple blocks.
    Type: Grant
    Filed: July 14, 2014
    Date of Patent: June 28, 2016
    Assignee: Apple Inc.
    Inventors: Guy Cote, Xiaojin Shi
  • Patent number: 9332309
    Abstract: An error recovery method may be engaged by an encoder to recover from misalignment between reference picture caches at the encoder and decoder. When a communication error is detected between a coder and a decoder, a number of non-acknowledged reference frames present in the decoder's reference picture cache may be estimated. Thereafter, frames may be coded as reference frames in a number greater than or equal to the number of non-acknowledged reference frames that are estimated to be present in the decoder's reference picture cache. Thereafter, ordinary coding operations may resume. Typically, a final reference frame that is coded in the error recovery mode will be coded as a synchronization frame that has high coding quality. The coded reference frames that precede it may be coded at low quality (or may be coded as SKIP-coded frames).
    Type: Grant
    Filed: September 28, 2012
    Date of Patent: May 3, 2016
    Assignee: Apple Inc.
    Inventors: Athanasios Leontaris, Haitao Guo, Xiaojin Shi
  • Patent number: 9313488
    Abstract: Disclosed is a system and method of controlling a video decoder, including reviewing channel data representing coded video data generated by an encoder to identify parameters of a hypothetical reference decoder (HRD) used by the encoder during coding operations. A parameter representing an exit data rate requirement of a coded picture buffer (CPB) of the HRD is compared against exit rate performance of the video decoder. If the exit rate performance of the video decoder matches the exit rate requirement of the HRD, the coded video data is decoded; otherwise, a decoding degradation scheme can be applied, including disabling the decoder from decoding the coded video data.
    Type: Grant
    Filed: February 7, 2014
    Date of Patent: April 12, 2016
    Assignee: Apple Inc.
    Inventors: Hsi-Jung Wu, Barin Geoffry Haskell, Xiaojin Shi, James Oliver Normile
  • Publication number: 20160048726
    Abstract: In the field of human-computer interaction (HCI), i.e., the study of the interfaces between people (i.e., users) and computers, understanding the intentions and desires of how the user wishes to interact with the computer is a very important problem. The ability to understand human gestures, and, in particular, hand gestures, as they relate to HCI, is a very important aspect in understanding the intentions and desires of the user in a wide variety of applications. In this disclosure, a novel system and method for three-dimensional hand tracking using depth sequences is described. Some of the major contributions of the hand tracking system described herein include: 1.) a robust hand detector that is invariant to scene background changes; 2.) a bi-directional tracking algorithm that prevents detected hands from always drifting closer to the front of the scene (i.e., forward along the z-axis of the scene); and 3.) various hand verification heuristics.
    Type: Application
    Filed: May 7, 2015
    Publication date: February 18, 2016
    Inventors: Feng Tang, Ang Li, Xiaojin Shi
  • Publication number: 20160014421
    Abstract: A block input component of a video encoding pipeline may, for a block of pixels in a video frame, compute gradients in multiple directions, and may accumulate counts of the computed gradients in one or more histograms. The block input component may analyze the histogram(s) to compute block-level statistics and determine whether a dominant gradient direction exists in the block, indicating the likelihood that it represents an image containing text. If text is likely, various encoding parameter values may be selected to improve the quality of encoding for the block (e.g., by lowering a quantization parameter value). The computed statistics or selected encoding parameter values may be passed to other stages of the pipeline, and used to bias or control selection of a prediction mode, an encoding mode, or a motion vector. Frame-level or slice-level parameter values may be generated from gradient histograms of multiple blocks.
    Type: Application
    Filed: July 14, 2014
    Publication date: January 14, 2016
    Applicant: APPLE INC.
    Inventors: Guy Cote, Xiaojin Shi
  • Publication number: 20150350688
    Abstract: Methods and systems provide video compression to reduce a “flashing” effect, typically caused by skipping coding or allocating a low number of bits in coding relatively low complexity portions of frames. In an embodiment, if at least a portion of a sequence of frames is of relatively low complexity, a history of coding blocks may be considered to determine whether to skip coding. In an embodiment, a number of coding bits allocated to a block may be increased based on a history of the coding block and a likelihood of flashing. The history of coding of each pixel block may be a basis for forcing a higher quantization parameter coding of pixel block(s) of high motion portions such that a low bit rate is maintained despite a larger number of bits being allocated to flashing-susceptible blocks. In another embodiment, force coding of relatively low complexity portions may be delayed by a number of frames.
    Type: Application
    Filed: April 24, 2015
    Publication date: December 3, 2015
    Inventors: Jian Lou, Xiaojin Shi
  • Publication number: 20150341659
    Abstract: A pipelined video coding system may include a motion estimation stage and an encoding stage. The motion estimation stage may operate on an input frame of video data in a first stage of operation and may generate estimates of motion and other statistical analyses. The encoding stage may operate on the input frame of video data in a second stage of operation later than the first stage. The encoding stage may perform predictive coding using coding parameters that are selected, at least in part, from the estimated motion and statistical analysis generated by the motion estimator. Because the motion estimation is performed at a processing stage that precedes the encoding, a greater amount of processing time may be devoted to such processes than in systems that perform both operations in a single processing stage.
    Type: Application
    Filed: April 24, 2015
    Publication date: November 26, 2015
    Inventors: Jian Lou, Xiaojin Shi, Jian Zhou
  • Publication number: 20150003515
    Abstract: Techniques for encoding data based at least in part upon an awareness of the decoding complexity of the encoded data and the ability of a target decoder to decode the encoded data are disclosed. In some embodiments, a set of data is encoded based at least in part upon a state of a target decoder to which the encoded set of data is to be provided. In some embodiments, a set of data is encoded based at least in part upon the states of multiple decoders to which the encoded set of data is to be provided.
    Type: Application
    Filed: September 5, 2014
    Publication date: January 1, 2015
    Inventors: Jim Normile, Thomas Pun, Xiaojin Shi, Xin Tong, Hsi-Jung Wu
  • Patent number: 8855213
    Abstract: Embodiments of the present invention provide a method and device for processing a source video. The method and device may provide for computing an artifact estimation from differences among pixels selected from spatially-distributed sampling patterns in the source video; filtering the source video to produce a filtered version of the source video; computing a blending factor based on the artifact estimation in the source video; and computing an output video by blending the source video and the filtered version of the source video based on the blending factor.
    Type: Grant
    Filed: May 11, 2009
    Date of Patent: October 7, 2014
    Assignee: Apple Inc.
    Inventors: Gianluca Filippini, Xiaosong Zhou, Hsi-Jung Wu, James Oliver Normile, Xiaojin Shi, Ionut Hristodorescu
  • Patent number: 8830092
    Abstract: Techniques for encoding data based at least in part upon an awareness of the decoding complexity of the encoded data and the ability of a target decoder to decode the encoded data are disclosed. In some embodiments, a set of data is encoded based at least in part upon a state of a target decoder to which the encoded set of data is to be provided. In some embodiments, a set of data is encoded based at least in part upon the states of multiple decoders to which the encoded set of data is to be provided.
    Type: Grant
    Filed: June 9, 2011
    Date of Patent: September 9, 2014
    Assignee: Apple Inc.
    Inventors: Jim Normile, Thomas Pun, Xiaojin Shi, Xin Tong, Hsi-Jung Wu
  • Patent number: 8780986
    Abstract: Apparatuses and methods for improving coding processes and coding parameters for coding video data are provided. A coder may select coding parameters for video data according to a default coding policy. The default coding policy may include selection of prediction modes (e.g., intra-coding or inter-coding) for each pixel group in each frame. A video coder may select some pixel groups in a frame to be coded as refresh pixel groups as an exception to the default assignment policies. The selection of refresh pixel groups may be based on prediction relationships among multiple frames of source video data. The default coding of the refresh pixel groups is then modified to enhance the coding of the refresh pixel groups. The refresh pixel groups may permit fewer intra (I) frames to be sent and/or may improve the quality of the recovered video.
    Type: Grant
    Filed: March 31, 2009
    Date of Patent: July 15, 2014
    Assignee: Apple Inc.
    Inventors: Hsi-Jung Wu, Xiaosong Zhou, Xiaojin Shi, Yuxin Liu
  • Publication number: 20140153653
    Abstract: Disclosed is a system and method of controlling a video decoder, including reviewing channel data representing coded video data generated by an encoder to identify parameters of a hypothetical reference decoder (HRD) used by the encoder during coding operations. A parameter representing an exit data rate requirement of a coded picture buffer (CPB) of the HRD is compared against exit rate performance of the video decoder. If the exit rate performance of the video decoder matches the exit rate requirement of the HRD, the coded video data is decoded; otherwise, a decoding degradation scheme can be applied, including disabling the decoder from decoding the coded video data.
    Type: Application
    Filed: February 7, 2014
    Publication date: June 5, 2014
    Applicant: APPLE INC.
    Inventors: Hsi-Jung Wu, Barin Geoffry Haskell, Xiaojin Shi, James Oliver Normile
  • Patent number: 8675740
    Abstract: Disclosed is a system and method of controlling a video decoder, including reviewing channel data representing coded video data generated by an encoder to identify parameters of a hypothetical reference decoder (HRD) used by the encoder during coding operations. A parameter representing an exit data rate requirement of a coded picture buffer (CPB) of the HRD is compared against exit rate performance of the video decoder. If the exit rate performance of the video decoder matches the exit rate requirement of the HRD, the coded video data is decoded; otherwise, a decoding degradation scheme can be applied, including disabling the decoder from decoding the coded video data.
    Type: Grant
    Filed: December 31, 2012
    Date of Patent: March 18, 2014
    Assignee: Apple Inc.
    Inventors: Hsi-Jung Wu, Barin Geoffry Haskell, Xiaojin Shi, James Oliver Normile
  • Patent number: 8638851
    Abstract: A video coding system and method are described for increasing a transmitted output bit rate of a video encoding system by altering the content of the bit stream. A video encoder may receive a coding mode signal from a computer application for coding source video data, the coding mode signal indicating a target bit rate having a risk factor related to transmission error associated with the target bit rate. The coded bitstream may be modified based on the risk factor indicated in the coding mode signal. A modified coded bitstream may be output at the target bit rate and at a reduced coding efficiency, and the channel may be tested for transmission errors. Based on the test results, a revised coding mode signal indicating the same target bit rate, but a revised risk factor, may be provided. The coded bitstream may be revised by removing the modifications previously made to the coded bitstream, and a revised coded bitstream having greater coding efficiency may be output at the target bit rate.
    Type: Grant
    Filed: December 23, 2009
    Date of Patent: January 28, 2014
    Assignee: Apple Inc.
    Inventors: Hyeonkuk Jeong, Xiaosong Zhou, Joe Abuan, Xiaojin Shi, Hsi-Jung Wu, James Oliver Normile
  • Publication number: 20130329809
    Abstract: An error recovery method may be engaged by an encoder to recover from misalignment between reference picture caches at the encoder and decoder. When a communication error is detected between a coder and a decoder, a number of non-acknowledged reference frames present in the decoder's reference picture cache may be estimated. Thereafter, frames may be coded as reference frames in a number greater than or equal to the number of non-acknowledged reference frames that are estimated to be present in the decoder's reference picture cache. Thereafter, ordinary coding operations may resume. Typically, a final reference frame that is coded in the error recovery mode will be coded as a synchronization frame that has high coding quality. The coded reference frames that precede it may be coded at low quality (or may be coded as SKIP-coded frames).
    Type: Application
    Filed: September 28, 2012
    Publication date: December 12, 2013
    Applicant: APPLE INC.
    Inventors: Athanasios Leontaris, Haitao Guo, Xiaojin Shi
  • Patent number: 8599238
    Abstract: Methods, systems, and apparatus are presented for reducing distortion in an image, such as a video image. A video image can be captured by an image capture device, e.g., during a video conferencing session. Distortion correction processing, such as the application of one or more warping techniques, can be applied to the captured image to produce a distortion corrected image, which can be transmitted to one or more participants. The warping techniques can be performed in accordance with one or more warp parameters specifying a transformation of the captured image. Further, the warp parameters can be generated in accordance with an orientation of the image capture device, which can be determined based on sensor data or can be a fixed value. Additionally or alternatively, the warp parameters can be determined in accordance with a reference image or model to which the captured image should be warped.
    Type: Grant
    Filed: October 16, 2009
    Date of Patent: December 3, 2013
    Assignee: Apple Inc.
    Inventors: Hsi-Jung Wu, Chris Yoochang Chung, Xiaojin Shi, James Normile
  • Patent number: 8558903
    Abstract: Embodiments of the present invention provide a control system for video processes that selectively controls the operation of motion stabilization processes. According to the present invention, motion sensor data indicative of motion of a mobile device may be received and processed. A determination may be made by comparing processed motion sensor data to a threshold. Based on the determination, motion stabilization may be suspended on select portions of a captured video sequence.
    Type: Grant
    Filed: April 7, 2010
    Date of Patent: October 15, 2013
    Assignee: Apple Inc.
    Inventors: Yuxin Liu, Xiaojin Shi, James Oliver Normile, Hsi-Jung Wu
  • Patent number: 8345774
    Abstract: Disclosed is a system and method of controlling a video decoder, including reviewing channel data representing coded video data generated by an encoder to identify parameters of a hypothetical reference decoder (HRD) used by the encoder during coding operations. A parameter representing an exit data rate requirement of a coded picture buffer (CPB) of the HRD is compared against exit rate performance of the video decoder. If the exit rate performance of the video decoder matches the exit rate requirement of the HRD, the coded video data is decoded; otherwise, a decoding degradation scheme can be applied, including disabling the decoder from decoding the coded video data.
    Type: Grant
    Filed: January 11, 2008
    Date of Patent: January 1, 2013
    Assignee: Apple Inc.
    Inventors: Hsi-Jung Wu, Barin Geoffry Haskell, Xiaojin Shi, James Oliver Normile
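
Illustrative sketch (not taken from any patent listed above): the abstract of patent 9380312 describes computing per-block gradients, accumulating them in a histogram, checking for a dominant gradient direction, and lowering a quantization parameter when text is likely. The Python below is one possible reading of that idea, offered only as a minimal example. The central-difference gradients, the four orientation bins, the function names, and the values of min_magnitude, min_share, and qp_offset are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math

NUM_BINS = 4  # assumed: orientation bins centred on 0, 45, 90 and 135 degrees


def gradient_histogram(block, min_magnitude=32.0):
    """Count interior pixels per gradient-orientation bin (weak gradients are ignored)."""
    rows, cols = len(block), len(block[0])
    hist = [0] * NUM_BINS
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = block[r][c + 1] - block[r][c - 1]  # horizontal central difference
            gy = block[r + 1][c] - block[r - 1][c]  # vertical central difference
            if math.hypot(gx, gy) < min_magnitude:
                continue
            angle = math.atan2(gy, gx) % math.pi  # fold opposite directions together
            bin_index = int(round(angle / (math.pi / NUM_BINS))) % NUM_BINS
            hist[bin_index] += 1
    return hist


def dominant_bin(hist, min_share=0.6):
    """Return the index of the dominant bin, or None if no bin holds min_share of the counts."""
    total = sum(hist)
    if total == 0:
        return None
    best = max(range(NUM_BINS), key=lambda i: hist[i])
    return best if hist[best] / total >= min_share else None


def select_qp(base_qp, hist, qp_offset=4):
    """Lower the quantization parameter when a dominant gradient direction exists."""
    if dominant_bin(hist) is None:
        return base_qp
    return max(0, base_qp - qp_offset)


# A 6x6 block with a sharp vertical edge, and a flat block for comparison.
edge_block = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
flat_block = [[128] * 6 for _ in range(6)]

edge_qp = select_qp(30, gradient_histogram(edge_block))
flat_qp = select_qp(30, gradient_histogram(flat_block))
```

In this sketch the block with the vertical edge has all of its strong gradients in a single bin, so its quantization parameter is reduced, while the flat block keeps the base value. A real encoder would use hardware-friendly gradient operators and tuned thresholds.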