Patents by Inventor Xiaojin Shi
Xiaojin Shi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9811721
Abstract: In the field of Human-computer interaction (HCI), i.e., the study of the interfaces between people (i.e., users) and computers, understanding the intentions and desires of how the user wishes to interact with the computer is a very important problem. The ability to understand human gestures, and, in particular, hand gestures, as they relate to HCI, is a very important aspect in understanding the intentions and desires of the user in a wide variety of applications. In this disclosure, a novel system and method for three-dimensional hand tracking using depth sequences is described. Some of the major contributions of the hand tracking system described herein include: 1.) a robust hand detector that is invariant to scene background changes; 2.) a bi-directional tracking algorithm that prevents detected hands from always drifting closer to the front of the scene (i.e., forward along the z-axis of the scene); and 3.) various hand verification heuristics.
Type: Grant
Filed: May 7, 2015
Date of Patent: November 7, 2017
Assignee: Apple Inc.
Inventors: Feng Tang, Ang Li, Xiaojin Shi
-
Publication number: 20170090584
Abstract: Varying embodiments of intelligent systems are disclosed that respond to user intent and desires based upon activity that may or may not be expressly directed at the intelligent system. In some embodiments, the intelligent system acquires a depth image of a scene surrounding the system. A scene geometry may be extracted from the depth image and elements of the scene, such as walls, furniture, and humans may be evaluated and monitored. In certain embodiments, user activity in the scene is monitored and analyzed to infer user desires or intent with respect to the system. The interpretation of the user's intent or desire as well as the system's response may be affected by the scene geometry surrounding the user and/or the system. In some embodiments, techniques and systems are disclosed for interpreting express user communication, for example, expressed through fine hand gesture movements.
Type: Application
Filed: September 25, 2015
Publication date: March 30, 2017
Inventors: Feng Tang, Chong Chen, Haitao Guo, Xiaojin Shi, Thorsten Gernoth
-
Patent number: 9380312
Abstract: A block input component of a video encoding pipeline may, for a block of pixels in a video frame, compute gradients in multiple directions, and may accumulate counts of the computed gradients in one or more histograms. The block input component may analyze the histogram(s) to compute block-level statistics and determine whether a dominant gradient direction exists in the block, indicating the likelihood that it represents an image containing text. If text is likely, various encoding parameter values may be selected to improve the quality of encoding for the block (e.g., by lowering a quantization parameter value). The computed statistics or selected encoding parameter values may be passed to other stages of the pipeline, and used to bias or control selection of a prediction mode, an encoding mode, or a motion vector. Frame-level or slice-level parameter values may be generated from gradient histograms of multiple blocks.
Type: Grant
Filed: July 14, 2014
Date of Patent: June 28, 2016
Assignee: Apple Inc.
Inventors: Guy Cote, Xiaojin Shi
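The gradient-histogram idea in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the four orientation bins, the 0.6 dominance ratio, the QP offset of 4, and all function names are assumptions chosen only for illustration.

```python
import math


def gradient_histogram(block, num_bins=4):
    """Count gradient orientations in a 2-D block of luma samples.

    Orientation is folded into [0, pi), so opposite directions share a bin
    (assumption: the abstract does not say how directions are binned).
    """
    hist = [0] * num_bins
    rows, cols = len(block), len(block[0])
    for y in range(rows - 1):
        for x in range(cols - 1):
            gx = block[y][x + 1] - block[y][x]  # horizontal difference
            gy = block[y + 1][x] - block[y][x]  # vertical difference
            if gx == 0 and gy == 0:
                continue  # flat pixel: no direction to count
            angle = math.atan2(gy, gx) % math.pi
            hist[int(angle / math.pi * num_bins) % num_bins] += 1
    return hist


def has_dominant_direction(hist, ratio=0.6):
    """True when one bin holds at least `ratio` of all counted gradients."""
    total = sum(hist)
    return total > 0 and max(hist) / total >= ratio


def select_qp(base_qp, hist, qp_delta=4):
    """Lower the quantization parameter when the block looks text-like."""
    return base_qp - qp_delta if has_dominant_direction(hist) else base_qp


# Vertical stripes give only horizontal gradients, so one bin dominates.
stripes = [[0, 255, 0, 255] for _ in range(4)]
flat = [[128] * 4 for _ in range(4)]
```

With these inputs, the striped block is treated as likely text and receives a lower QP, while the flat block keeps the base QP.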
-
Patent number: 9332309
Abstract: An error recovery method may be engaged by an encoder to recover from misalignment between reference picture caches at the encoder and decoder. When a communication error is detected between a coder and a decoder, a number of non-acknowledged reference frames present in the decoder's reference picture cache may be estimated. Thereafter, frames may be coded as reference frames in a number greater than or equal to the number of non-acknowledged reference frames that are estimated to be present in the decoder's reference picture cache. Thereafter, ordinary coding operations may resume. Typically, a final reference frame that is coded in the error recovery mode will be coded as a synchronization frame that has high coding quality. The coded reference frames that precede it may be coded at low quality (or may be coded as SKIP-coded frames).
Type: Grant
Filed: September 28, 2012
Date of Patent: May 3, 2016
Assignee: Apple Inc.
Inventors: Athanasios Leontaris, Haitao Guo, Xiaojin Shi
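As a rough sketch of the recovery sequence described here, the function below plans which frames to code after an error. The names, the "SKIP"/"SYNC" labels, and the choice to always emit at least one synchronization frame are assumptions for illustration, not details taken from the patent.

```python
def plan_recovery(sent_refs, acked_refs):
    """Plan the reference frames to code after a communication error.

    The number of non-acknowledged reference frames is estimated as the
    frames sent but never acknowledged (assumption). That many reference
    frames are coded: cheap SKIP-coded frames first, then one high-quality
    synchronization frame as the final frame.
    """
    acked = set(acked_refs)
    unacked = [frame for frame in sent_refs if frame not in acked]
    count = max(1, len(unacked))  # assumption: always send one sync frame
    return ["SKIP"] * (count - 1) + ["SYNC"]
```

For example, if frames 1 to 4 were sent and only 1 and 2 were acknowledged, the plan is one SKIP-coded frame followed by the synchronization frame, after which ordinary coding would resume.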
-
Patent number: 9313488
Abstract: Disclosed is a system and method of controlling a video decoder, including reviewing channel data representing coded video data generated by an encoder to identify parameters of a hypothetical reference decoder (HRD) used by the encoder during coding operations. A parameter representing an exit data rate requirement of a coded picture buffer (CPB) of the HRD is compared against exit rate performance of the video decoder. If the exit rate performance of the video coder matches the exit rate requirement of the HRD, the coded video data is decoded; otherwise, a certain decoding degradation scheme can be applied, including disabling the decoder from decoding the coded video data.
Type: Grant
Filed: February 7, 2014
Date of Patent: April 12, 2016
Assignee: Apple Inc.
Inventors: Hsi-Jung Wu, Barin Geoffry Haskell, Xiaojin Shi, James Oliver Normile
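The comparison at the heart of this abstract can be sketched in a few lines. The function name, the bits-per-second units, and the reading of "matches" as "meets or exceeds" are assumptions; the abstract does not define them.

```python
def choose_decoding_action(hrd_exit_rate_bps, decoder_exit_rate_bps):
    """Compare the CPB exit rate signaled for the HRD with what the
    decoder can sustain, and pick an action.

    Assumption: the decoder "matches" the requirement when its exit rate
    is at least the signaled rate.
    """
    if decoder_exit_rate_bps >= hrd_exit_rate_bps:
        return "decode"
    # A degradation scheme applies here; the abstract mentions disabling
    # decoding as one option.
    return "degrade"
```

A decoder able to drain 12 Mbit/s would decode a stream whose HRD parameters require 10 Mbit/s, while one limited to 8 Mbit/s would fall back to the degradation path.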
-
Publication number: 20160048726
Abstract: In the field of Human-computer interaction (HCI), i.e., the study of the interfaces between people (i.e., users) and computers, understanding the intentions and desires of how the user wishes to interact with the computer is a very important problem. The ability to understand human gestures, and, in particular, hand gestures, as they relate to HCI, is a very important aspect in understanding the intentions and desires of the user in a wide variety of applications. In this disclosure, a novel system and method for three-dimensional hand tracking using depth sequences is described. Some of the major contributions of the hand tracking system described herein include: 1.) a robust hand detector that is invariant to scene background changes; 2.) a bi-directional tracking algorithm that prevents detected hands from always drifting closer to the front of the scene (i.e., forward along the z-axis of the scene); and 3.) various hand verification heuristics.
Type: Application
Filed: May 7, 2015
Publication date: February 18, 2016
Inventors: Feng Tang, Ang Li, Xiaojin Shi
-
Publication number: 20160014421
Abstract: A block input component of a video encoding pipeline may, for a block of pixels in a video frame, compute gradients in multiple directions, and may accumulate counts of the computed gradients in one or more histograms. The block input component may analyze the histogram(s) to compute block-level statistics and determine whether a dominant gradient direction exists in the block, indicating the likelihood that it represents an image containing text. If text is likely, various encoding parameter values may be selected to improve the quality of encoding for the block (e.g., by lowering a quantization parameter value). The computed statistics or selected encoding parameter values may be passed to other stages of the pipeline, and used to bias or control selection of a prediction mode, an encoding mode, or a motion vector. Frame-level or slice-level parameter values may be generated from gradient histograms of multiple blocks.
Type: Application
Filed: July 14, 2014
Publication date: January 14, 2016
Applicant: APPLE INC.
Inventors: Guy Cote, Xiaojin Shi
-
Publication number: 20150350688
Abstract: Methods and systems provide video compression to reduce a “flashing” effect, typically caused by skipping coding or allocating a low number of bits in coding relatively low complexity portions of frames. In an embodiment, if at least a portion of a sequence of frames is of relatively low complexity, a history of coding blocks may be considered to determine whether to skip coding. In an embodiment, a number of coding bits allocated to a block may be increased based on a history of the coding block and a likelihood of flashing. The history of coding of each pixel block may be a basis for forcing a higher quantization parameter coding of pixel block(s) of high motion portions such that a low bit rate is maintained despite a larger number of bits being allocated to flashing-susceptible blocks. In another embodiment, force coding of relatively low complexity portions may be delayed by a number of frames.
Type: Application
Filed: April 24, 2015
Publication date: December 3, 2015
Inventors: Jian Lou, Xiaojin Shi
-
Publication number: 20150341659
Abstract: A pipelined video coding system may include a motion estimation stage and an encoding stage. The motion estimation stage may operate on an input frame of video data in a first stage of operation and may generate estimates of motion and other statistical analyses. The encoding stage may operate on the input frame of video data in a second stage of operation later than the first stage. The encoding stage may perform predictive coding using coding parameters that are selected, at least in part, from the estimated motion and statistical analysis generated by the motion estimator. Because the motion estimation is performed at a processing stage that precedes the encoding, a greater amount of processing time may be devoted to such processes than in systems that performed both operations in a single processing stage.
Type: Application
Filed: April 24, 2015
Publication date: November 26, 2015
Inventors: Jian Lou, Xiaojin Shi, Jian Zhou
-
Publication number: 20150003515
Abstract: Techniques for encoding data based at least in part upon an awareness of the decoding complexity of the encoded data and the ability of a target decoder to decode the encoded data are disclosed. In some embodiments, a set of data is encoded based at least in part upon a state of a target decoder to which the encoded set of data is to be provided. In some embodiments, a set of data is encoded based at least in part upon the states of multiple decoders to which the encoded set of data is to be provided.
Type: Application
Filed: September 5, 2014
Publication date: January 1, 2015
Inventors: Jim Normile, Thomas Pun, Xiaojin Shi, Xin Tong, Hsi-Jung Wu
-
Patent number: 8855213
Abstract: Embodiments of the present invention provide a method and device for processing a source video. The method and device may provide computing an artifact estimation from differences among pixels selected from spatially-distributed sampling patterns in the source video; filtering the source video to produce a filtered version of the source video; computing a blending factor based on the artifact estimation in the source video; and computing an output video by blending the source video and the filtered version of the source video based on the blending factor.
Type: Grant
Filed: May 11, 2009
Date of Patent: October 7, 2014
Assignee: Apple Inc.
Inventors: Gianluca Filippini, Xiaosong Zhou, Hsi-Jung Wu, James Oliver Normile, Xiaojin Shi, Ionut Hristodorescu
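The four steps in this abstract (estimate artifacts, filter, compute a blending factor, blend) can be sketched on a single row of samples. Everything specific here is an assumption for illustration: the 8-sample spacing of the sampling pattern, the 3-tap mean filter, the linear mapping to a blending factor, and the function names.

```python
def artifact_estimate(row, step=8):
    """Mean absolute difference across positions spaced `step` apart
    (assumption: artifacts show up at regular block boundaries)."""
    diffs = [abs(row[i] - row[i - 1]) for i in range(step, len(row), step)]
    return sum(diffs) / len(diffs) if diffs else 0.0


def box_filter(row):
    """3-tap mean filter with the edge samples repeated."""
    padded = [row[0]] + list(row) + [row[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3
            for i in range(len(row))]


def blending_factor(artifact, max_artifact=32.0):
    """Map the artifact estimate linearly onto [0, 1]."""
    return min(1.0, max(0.0, artifact / max_artifact))


def process(row, step=8):
    """Blend the source row with its filtered version."""
    alpha = blending_factor(artifact_estimate(row, step))
    filtered = box_filter(row)
    return [alpha * f + (1 - alpha) * s for f, s in zip(filtered, row)]


flat_row = [10] * 16
edge_row = [0] * 8 + [64] * 8
```

A flat row has an artifact estimate of zero and passes through unchanged, while a row with a sharp step at the sampled boundary is replaced largely by its filtered version.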
-
Patent number: 8830092
Abstract: Techniques for encoding data based at least in part upon an awareness of the decoding complexity of the encoded data and the ability of a target decoder to decode the encoded data are disclosed. In some embodiments, a set of data is encoded based at least in part upon a state of a target decoder to which the encoded set of data is to be provided. In some embodiments, a set of data is encoded based at least in part upon the states of multiple decoders to which the encoded set of data is to be provided.
Type: Grant
Filed: June 9, 2011
Date of Patent: September 9, 2014
Assignee: Apple Inc.
Inventors: Jim Normile, Thomas Pun, Xiaojin Shi, Xin Tong, Hsi-Jung Wu
-
Patent number: 8780986
Abstract: Apparatuses and methods for improving coding processes and coding parameters for coding video data are provided. A coder may select coding parameters for video data according to a default coding policy. The default coding policy may include selection of prediction modes (e.g., intra-coding or inter-coding) for each pixel group in each frame. A video coder may select some pixel groups in a frame to be coded as refresh pixel groups as an exception to the default assignment policies. The selection of refresh pixel groups may be based on prediction relationships among multiple frames of source video data. The default coding of the refresh pixel groups is then modified to enhance the coding of the refresh pixel groups. The refresh pixel groups may permit fewer intra (I) frames to be sent and/or may improve the quality of the recovered video.
Type: Grant
Filed: March 31, 2009
Date of Patent: July 15, 2014
Assignee: Apple Inc.
Inventors: Hsi-Jung Wu, Xiaosong Zhou, Xiaojin Shi, Yuxin Liu
-
Publication number: 20140153653
Abstract: Disclosed is a system and method of controlling a video decoder, including reviewing channel data representing coded video data generated by an encoder to identify parameters of a hypothetical reference decoder (HRD) used by the encoder during coding operations. A parameter representing an exit data rate requirement of a coded picture buffer (CPB) of the HRD is compared against exit rate performance of the video decoder. If the exit rate performance of the video coder matches the exit rate requirement of the HRD, the coded video data is decoded; otherwise, a certain decoding degradation scheme can be applied, including disabling the decoder from decoding the coded video data.
Type: Application
Filed: February 7, 2014
Publication date: June 5, 2014
Applicant: APPLE INC.
Inventors: Hsi-Jung Wu, Barin Geoffry Haskell, Xiaojin Shi, James Oliver Normile
-
Patent number: 8675740
Abstract: Disclosed is a system and method of controlling a video decoder, including reviewing channel data representing coded video data generated by an encoder to identify parameters of a hypothetical reference decoder (HRD) used by the encoder during coding operations. A parameter representing an exit data rate requirement of a coded picture buffer (CPB) of the HRD is compared against exit rate performance of the video decoder. If the exit rate performance of the video coder matches the exit rate requirement of the HRD, the coded video data is decoded; otherwise, a certain decoding degradation scheme can be applied, including disabling the decoder from decoding the coded video data.
Type: Grant
Filed: December 31, 2012
Date of Patent: March 18, 2014
Assignee: Apple Inc.
Inventors: Hsi-Jung Wu, Barin Geoffry Haskell, Xiaojin Shi, James Oliver Normile
-
Patent number: 8638851
Abstract: A video coding system and method for increasing a transmitted output bit rate of a video encoding system by altering the content of the bit stream. A video encoder may receive a coding mode signal from a computer application for coding source video data, the coding mode signal indicating a target bit rate having a risk factor related to transmission error associated to the target bit rate. The coded bitstream may be modified based on the risk factor indicated in the coding mode signal. A modified coded bitstream may be outputted at the target bit rate and at a reduced coding efficiency, and the channel may be tested for transmission errors. Based on the test results, a revised coding mode signal indicating the same target bit rate, but a revised risk factor may be provided. The coded bitstream may be revised by removing the modifications previously made to the coded bitstream and a revised coded bitstream having greater coding efficiency may be output at the target bit rate.
Type: Grant
Filed: December 23, 2009
Date of Patent: January 28, 2014
Assignee: Apple Inc.
Inventors: Hyeonkuk Jeong, Xiaosong Zhou, Joe Abuan, Xiaojin Shi, Hsi-Jung Wu, James Oliver Normile
-
Publication number: 20130329809
Abstract: An error recovery method may be engaged by an encoder to recover from misalignment between reference picture caches at the encoder and decoder. When a communication error is detected between a coder and a decoder, a number of non-acknowledged reference frames present in the decoder's reference picture cache may be estimated. Thereafter, frames may be coded as reference frames in a number greater than or equal to the number of non-acknowledged reference frames that are estimated to be present in the decoder's reference picture cache. Thereafter, ordinary coding operations may resume. Typically, a final reference frame that is coded in the error recovery mode will be coded as a synchronization frame that has high coding quality. The coded reference frames that precede it may be coded at low quality (or may be coded as SKIP-coded frames).
Type: Application
Filed: September 28, 2012
Publication date: December 12, 2013
Applicant: APPLE INC.
Inventors: Athanasios Leontaris, Haitao Guo, Xiaojin Shi
-
Patent number: 8599238
Abstract: Methods, systems, and apparatus are presented for reducing distortion in an image, such as a video image. A video image can be captured by an image capture device, e.g. during a video conferencing session. Distortion correction processing, such as the application of one or more warping techniques, can be applied to the captured image to produce a distortion corrected image, which can be transmitted to one or more participants. The warping techniques can be performed in accordance with one or more warp parameters specifying a transformation of the captured image. Further, the warp parameters can be generated in accordance with an orientation of the image capture device, which can be determined based on sensor data or can be a fixed value. Additionally or alternatively, the warp parameters can be determined in accordance with a reference image or model to which the captured image should be warped.
Type: Grant
Filed: October 16, 2009
Date of Patent: December 3, 2013
Assignee: Apple Inc.
Inventors: Hsi-Jung Wu, Chris Yoochang Chung, Xiaojin Shi, James Normile
-
Patent number: 8558903
Abstract: Embodiments of the present invention provide a control system for video processes that selectively control the operation of motion stabilization processes. According to the present invention, motion sensor data indicative of motion of a mobile device may be received and processed. A determination may be made by comparing processed motion sensor data to a threshold. Based on the determination, motion stabilization may be suspended on select portions of a captured video sequence.
Type: Grant
Filed: April 7, 2010
Date of Patent: October 15, 2013
Assignee: Apple Inc.
Inventors: Yuxin Liu, Xiaojin Shi, James Oliver Normile, Hsi-Jung Wu
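A minimal sketch of threshold-gated stabilization follows. The abstract does not say how sensor data is processed or on which side of the threshold stabilization is suspended, so the moving-average smoothing, the rule "suspend when smoothed motion exceeds the threshold", and all names and default values are assumptions.

```python
def stabilization_mask(motion_magnitudes, threshold=1.0, window=3):
    """Return one flag per frame: True to stabilize, False to suspend.

    Each frame's motion reading is smoothed with a trailing moving average
    (the "processed" sensor data, by assumption) and compared against the
    threshold.
    """
    mask = []
    for i in range(len(motion_magnitudes)):
        recent = motion_magnitudes[max(0, i - window + 1): i + 1]
        smoothed = sum(recent) / len(recent)
        mask.append(smoothed <= threshold)
    return mask
```

For readings of 0.1, 0.2, 3.0, 3.0, 0.1 with a window of two frames and a threshold of 1.0, stabilization stays on for the first two frames and is suspended for the remaining three, because the large motion still dominates the average on the last frame.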
-
Patent number: 8345774
Abstract: Disclosed is a system and method of controlling a video decoder, including reviewing channel data representing coded video data generated by an encoder to identify parameters of a hypothetical reference decoder (HRD) used by the encoder during coding operations. A parameter representing an exit data rate requirement of a coded picture buffer (CPB) of the HRD is compared against exit rate performance of the video decoder. If the exit rate performance of the video coder matches the exit rate requirement of the HRD, the coded video data is decoded; otherwise, a certain decoding degradation scheme can be applied, including disabling the decoder from decoding the coded video data.
Type: Grant
Filed: January 11, 2008
Date of Patent: January 1, 2013
Assignee: Apple Inc.
Inventors: Hsi-Jung Wu, Barin Geoffry Haskell, Xiaojin Shi, James Oliver Normile