Patents by Inventor Xiaosong Zhou

Xiaosong Zhou has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Automated media editing operations in consumer devices

Patent number: 11605224

Abstract: Techniques are disclosed for managing video captured by an imaging device. The disclosed methods capture a video in response to a capture command received at the imaging device. Following a video capture, techniques are disclosed for classifying the captured video based on feature(s) extracted therefrom, for marking the captured video based on the classification, and for generating a media item from the captured video according to the marking. Accordingly, the captured video may be classified as representing a static event, and, as a result, a media item of a still image may be generated. Otherwise, the captured video may be classified as representing a dynamic event, and, as a result, a media item of a video may be generated.

Type: Grant

Filed: May 26, 2020

Date of Patent: March 14, 2023

Assignee: APPLE INC.

Inventors: Bartlomiej Rymkowski, Robert Bailey, Ethan Tira-Thompson, Shuang Gao, Ben Englert, Emilie Kim, Shujie Liu, Ke Zhang, Vinay Sharma, Xiaosong Zhou
Immersive video streaming using view-adaptive prefetching and buffer control

Patent number: 11570417

Abstract: A system obtains a data set representing immersive video content for display at a display time, including first data representing the content according to a first level of detail, and second data representing the content according to a second higher level of detail. During one or more first times prior to the display time, the system causes at least a portion of the first data to be stored in a buffer. During one or more second times prior to the display time, the system generates a prediction of a viewport for displaying the content to a user at the display time, identifies a portion of the second data corresponding to the prediction of the viewport, and causes the identified portion of the second data to be stored in the video buffer. At the display time, the system causes the content to be displayed to the user using the video buffer.

Type: Grant

Filed: May 20, 2021

Date of Patent: January 31, 2023

Assignee: Apple Inc.

Inventors: Fanyi Duanmu, Jun Xin, Hsi-Jung Wu, Xiaosong Zhou
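
The two-stage buffering described in the abstract above can be illustrated with a minimal sketch. It assumes the content is split into tiles keyed by an ID, which the abstract does not state; `fill_buffer`, `tiles_low`, `tiles_high`, and `predicted_viewport` are hypothetical names used only for this example, and the viewport prediction itself is taken as given.

```python
def fill_buffer(tiles_low, tiles_high, predicted_viewport):
    """Build a display buffer from two levels of detail.

    tiles_low / tiles_high: dict mapping a tile ID to its encoded data
    (tiling is an assumption of this sketch, not taken from the abstract).
    predicted_viewport: set of tile IDs expected to be visible at display time.
    """
    # First stage: store the lower level of detail for every tile, so that
    # something can be shown even if the viewport prediction is wrong.
    buffer = {tile_id: ("low", data) for tile_id, data in tiles_low.items()}

    # Second stage: upgrade only the tiles inside the predicted viewport
    # to the higher level of detail.
    for tile_id in predicted_viewport:
        if tile_id in tiles_high:
            buffer[tile_id] = ("high", tiles_high[tile_id])
    return buffer
```

Here the lower-detail data acts as a fallback for regions outside the predicted viewport, while higher-detail data is fetched only where the viewer is expected to look.
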
Applications for decoder-side modeling of objects identified in decoded video data

Patent number: 11553200

Abstract: Techniques are disclosed for coding and decoding video data using object recognition and object modeling as a basis of coding and error recovery. A video decoder may decode coded video data received from a channel. The video decoder may perform object recognition on decoded video data obtained therefrom, and, when an object is recognized in the decoded video data, the video decoder may generate a model representing the recognized object. It may store data representing the model locally. The video decoder may communicate the model data to an encoder, which may form a basis of error mitigation and recovery. The video decoder also may monitor deviation patterns in the object model and associated patterns in audio content; if/when video decoding is suspended due to operational errors, the video decoder may generate simulated video data by analyzing audio data received during the suspension period and developing video data from the data model and deviation(s) associated with patterns detected from the audio data.

Type: Grant

Filed: May 11, 2020

Date of Patent: January 10, 2023

Assignee: APPLE INC.

Inventors: Xing Wen, Dazhong Zhang, Peikang Song, Xiaosong Zhou, Sudeng Hu, Hsi-Jung Wu, Jae Hoon Kim
Immersive Video Streaming Using View-Adaptive Prefetching and Buffer Control

Publication number: 20220377304

Abstract: A system obtains a data set representing immersive video content for display at a display time, including first data representing the content according to a first level of detail, and second data representing the content according to a second higher level of detail. During one or more first times prior to the display time, the system causes at least a portion of the first data to be stored in a buffer. During one or more second times prior to the display time, the system generates a prediction of a viewport for displaying the content to a user at the display time, identifies a portion of the second data corresponding to the prediction of the viewport, and causes the identified portion of the second data to be stored in the video buffer. At the display time, the system causes the content to be displayed to the user using the video buffer.

Type: Application

Filed: May 20, 2021

Publication date: November 24, 2022

Inventors: Fanyi Duanmu, Jun Xin, Hsi-Jung Wu, Xiaosong Zhou
SYSTEMS AND METHODS FOR PERSPECTIVE SHIFTING IN VIDEO CONFERENCING SESSION

Publication number: 20220329756

Abstract: Embodiments of the present disclosure provide systems and methods for perspective shifting in a video conferencing session. In one exemplary method, a video stream may be generated. A foreground element may be identified in a frame of the video stream and distinguished from a background element of the frame. Data may be received representing a viewing condition at a terminal that will display the generated video stream. The frame of the video stream may be modified based on the received data to shift the foreground element relative to the background element. The modified video stream may be displayed at the displaying terminal.

Type: Application

Filed: June 22, 2022

Publication date: October 13, 2022

Inventors: Jae Hoon Kim, Chris Y. Chung, Dazhong Zhang, Hang Yuan, Hsi-Jung Wu, Xiaosong Zhou, Jiefu Zhai
Instant Video Communication Connections

Publication number: 20220286644

Abstract: Computing devices may implement instant video communication connections for video communications. Connection information for mobile computing devices may be maintained. A request to initiate an instant video communication may be received, and if authorized, the connection information for the particular recipient mobile computing device may be accessed. Video communication data may then be sent to the recipient mobile computing device according to the connection information so that the video communication data may be displayed at the recipient device as it is received. New connection information for different mobile computing devices may be added, or updates to existing connection information may also be performed. Connection information for some mobile computing devices may be removed.

Type: Application

Filed: May 26, 2022

Publication date: September 8, 2022

Inventors: Xiaosong Zhou, Hsi-Jung Wu, Chris Y. Chung, James Normile, Joe S. Abuan, Hyeonkuk Jeong, Yan Yang, Gobind Johar, Thomas Christopher Jansen
Systems and methods for perspective shifting in video conferencing session

Patent number: 11394921

Abstract: Embodiments of the present disclosure provide systems and methods for perspective shifting in a video conferencing session. In one exemplary method, a video stream may be generated. A foreground element may be identified in a frame of the video stream and distinguished from a background element of the frame. Data may be received representing a viewing condition at a terminal that will display the generated video stream. The frame of the video stream may be modified based on the received data to shift the foreground element relative to the background element. The modified video stream may be displayed at the displaying terminal.

Type: Grant

Filed: March 10, 2017

Date of Patent: July 19, 2022

Assignee: Apple Inc.

Inventors: Jae Hoon Kim, Chris Y. Chung, Dazhong Zhang, Hang Yuan, Hsi-Jung Wu, Xiaosong Zhou, Jiefu Zhai
NEURAL NETWORK BASED RESIDUAL CODING AND PREDICTION FOR PREDICTIVE CODING

Publication number: 20220191473

Abstract: Systems and methods are disclosed for video compression, utilizing neural networks for predictive video coding. The processes employed combine multiple banks of neural networks with codec system components to carry out the coding and decoding of video data.

Type: Application

Filed: January 4, 2022

Publication date: June 16, 2022

Inventors: Jiefu ZHAI, Xingyu ZHANG, Xiaosong ZHOU, Jun XIN, Hsi-Jung WU, Yeping SU
Real-time face and object manipulation

Patent number: 11282543

Abstract: Techniques are presented for modifying images of an object in video, for example to correct for lens distortion, or to beautify a face. These techniques include extracting and validating features of an object from a source video frame, tracking those features over time, estimating a pose of the object, modifying a 3D model of the object based on the features, and rendering a modified video frame based on the modified 3D model and modified intrinsic and extrinsic matrices. These techniques may be applied in real-time to an object in a sequence of video frames.

Type: Grant

Filed: March 9, 2018

Date of Patent: March 22, 2022

Assignee: Apple Inc.

Inventors: Hang Yuan, Jiefu Zhai, Ming Chen, Jae Hoon Kim, Dazhong Zhang, Xiaosong Zhou, Chris Y. Chung, Hsi-Jung Wu
Processing of equirectangular object data to compensate for distortion by spherical projections

Patent number: 11259046

Abstract: Methods and systems are disclosed to counteract spatial distortions introduced by imaging processes of multi-directional video frames, where objects may be projected to spherical or equirectangular representations. Techniques are provided to invert the spatial distortions in video frames used as reference picture data in predictive coding, by spatially transforming the image content of the reference picture data before this image content is used for the prediction of input video data in prediction-based coders and decoders.

Type: Grant

Filed: February 15, 2017

Date of Patent: February 22, 2022

Assignee: Apple Inc.

Inventors: Jae Hoon Kim, Chris Y. Chung, Dazhong Zhang, Hang Yuan, Hsi-Jung Wu, Jiefu Zhai, Xiaosong Zhou
Neural network based residual coding and prediction for predictive coding

Patent number: 11240492

Abstract: Systems and methods are disclosed for video compression, utilizing neural networks for predictive video coding. The processes employed combine multiple banks of neural networks with codec system components to carry out the coding and decoding of video data.

Type: Grant

Filed: January 22, 2019

Date of Patent: February 1, 2022

Assignee: Apple Inc.

Inventors: Jiefu Zhai, Xingyu Zhang, Xiaosong Zhou, Jun Xin, Hsi-Jung Wu, Yeping Su
OBJECT AND KEYPOINT DETECTION SYSTEM WITH LOW SPATIAL JITTER, LOW LATENCY AND LOW POWER USAGE

Publication number: 20210397826

Abstract: Video object and keypoint location detection techniques are presented. The system includes a detection system for generating locations of an object's keypoints along with probabilities associated with the locations, and a stability system for stabilizing keypoint locations of the detected objects. In some aspects, the generated probabilities are a two-dimensional array corresponding to locations within input images, and the stability system fits the generated probabilities to a two-dimensional probability distribution function.

Type: Application

Filed: June 4, 2021

Publication date: December 23, 2021

Inventors: Xiaoxia SUN, Jiefu ZHAI, Ke ZHANG, Xiaosong ZHOU, Hsi-Jung WU
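
As a rough illustration of fitting a two-dimensional array of probabilities to a two-dimensional distribution, the sketch below fits a Gaussian by moment matching. The abstract does not say which distribution or fitting method is used, so both the Gaussian and the moment-matching approach are assumptions, and `fit_gaussian_2d` is a hypothetical name.

```python
def fit_gaussian_2d(prob):
    """Fit a 2D Gaussian to a grid of non-negative scores by moment matching.

    prob[y][x] is the score at column x of row y.
    Returns (mean_x, mean_y, var_x, var_y, cov_xy).
    """
    total = sum(sum(row) for row in prob)
    if total <= 0:
        raise ValueError("probability map must contain positive mass")

    # First moments: the weighted centroid gives a sub-pixel keypoint location.
    mean_x = sum(x * p for row in prob for x, p in enumerate(row)) / total
    mean_y = sum(y * p for y, row in enumerate(prob) for p in row) / total

    # Second central moments: the spread and orientation of the fitted Gaussian.
    var_x = sum((x - mean_x) ** 2 * p
                for row in prob for x, p in enumerate(row)) / total
    var_y = sum((y - mean_y) ** 2 * p
                for y, row in enumerate(prob) for p in row) / total
    cov_xy = sum((x - mean_x) * (y - mean_y) * p
                 for y, row in enumerate(prob) for x, p in enumerate(row)) / total
    return mean_x, mean_y, var_x, var_y, cov_xy
```

Taking the fitted mean as the keypoint location, rather than the single highest-scoring cell, is one plausible way to reduce frame-to-frame jitter, since the mean changes smoothly as the scores change.
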
Techniques to overcome communication lag between terminals performing video mirroring and annotation operations

Patent number: 11206371

Abstract: Techniques are disclosed for overcoming communication lag between interactive operations among devices in a streaming session. According to the techniques, a first device streams video content to a second device, and an annotation entered on a first frame being displayed at the second device is communicated back to the first device. Responsive to a communication that identifies the annotation, the first device may identify an element of video content from the first frame to which the annotation applies and determine whether the identified element is present in a second frame of video content currently displayed at the first terminal. If so, the first device may display the annotation with the second frame in a location where the identified element is present. If not, the first device may display the annotation via an alternate technique.

Type: Grant

Filed: April 24, 2017

Date of Patent: December 21, 2021

Assignee: Apple Inc.

Inventors: Chris Y. Chung, Dazhong Zhang, Hsi-Jung Wu, Xiaosong Zhou
Media feed prioritization for multi-party conferencing

Patent number: 11184415

Abstract: Techniques presented herein provide an improved relay user experience and improved management of scarce computing and network resources as the number of relay endpoints increases. A sourcing endpoint device may generate a media feed, such as video and/or audio feed, representing contribution from a conference participant. The sourcing endpoint device may generate a priority value for the media feed, and the priority value may be transmitted to other members of the relay along with the input feed. Priority values of the different relay participants may be used by other devices, for example, intermediate servers or receiving endpoint devices, to manage aspects of the relay. For example, a relay server may prune streams from select endpoint devices based on relative priority values received from those devices. Alternatively, receiving endpoint devices may alter presentation of received feeds based on their associated priority values.

Type: Grant

Filed: May 7, 2019

Date of Patent: November 23, 2021

Assignee: Apple Inc.

Inventors: Christopher M. Garrido, Dazhong Zhang, Karthick Santhanam, Patrick Miauton, Xiaoxiao Zheng, Bess Chan, Peter Shiang, Sudeng Hu, Peikang Song, Xiaosong Zhou
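
The stream-pruning example in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: the `Feed` type, the `prune_feeds` function, and the `max_feeds` limit are hypothetical, and the sketch assumes that a larger priority value means a more important feed, which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feed:
    """A media feed with the priority reported by its sourcing endpoint (hypothetical type)."""
    endpoint_id: str
    priority: float  # assumption: a larger value means a more important feed


def prune_feeds(feeds, max_feeds):
    """Keep the max_feeds highest-priority feeds, preserving their original order."""
    if max_feeds <= 0:
        return []
    # Rank feed positions by priority, highest first. sorted() is stable,
    # so feeds with equal priority keep their arrival order.
    ranked = sorted(range(len(feeds)),
                    key=lambda i: feeds[i].priority,
                    reverse=True)
    kept = sorted(ranked[:max_feeds])
    return [feeds[i] for i in kept]
```

For example, with feeds of priority 0.2, 0.9, and 0.5 and `max_feeds=2`, the relay would forward only the second and third feeds.
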
ESTABLISHING A VIDEO CONFERENCE DURING A PHONE CALL

Publication number: 20210360192

Abstract: Some embodiments provide a method for initiating a video conference using a first mobile device. The method presents, during an audio call through a wireless communication network with a second device, a selectable user-interface (UI) item on the first mobile device for switching from the audio call to the video conference. The method receives a selection of the selectable UI item. The method initiates the video conference without terminating the audio call. The method terminates the audio call before allowing the first and second devices to present audio and video data exchanged through the video conference.

Type: Application

Filed: May 27, 2021

Publication date: November 18, 2021

Inventors: Elizabeth C. Cranfill, Stephen O. Lemay, Joe S. Abuan, Hsi-Jung Wu, Xiaosong Zhou, Roberto Garcia, JR.
Gesture and prominence in video conferencing

Patent number: 11165989

Abstract: Techniques are presented for managing visual prominence of participants in a video conference, including conferences where participants communicate visually, such as with sign language. According to these techniques, a visual prominence indication of a participant in a video conference may be estimated, a video stream of the participant may be encoded, and the encoded video stream may be transmitted along with an indication of the estimated visual prominence to a receiving device in the video conference.

Type: Grant

Filed: November 20, 2019

Date of Patent: November 2, 2021

Assignee: Apple Inc.

Inventors: Johnny Trenh, Hsi-Jung Wu, Sarah K. Herrlinger, Xiaoxia Sun, Ian J. Baird, Dazhong Zhang, Xiaosong Zhou, Christopher M. Garrido
SPHERE PROJECTED MOTION ESTIMATION/COMPENSATION AND MODE DECISION

Publication number: 20210321133

Abstract: Techniques are disclosed for coding video data predictively based on predictions made from spherical-domain projections of input pictures to be coded and reference pictures that are prediction candidates. Spherical projections of an input picture and the candidate reference pictures may be generated. Thereafter, a search may be conducted for a match between the spherical-domain representation of a pixel block to be coded and a spherical-domain representation of the reference picture. On a match, an offset may be determined between the spherical-domain representation of the pixel block and a matching portion of the reference picture in the spherical-domain representation. The spherical-domain offset may be transformed to a motion vector in a source-domain representation of the input picture, and the pixel block may be coded predictively with reference to a source-domain representation of the matching portion of the reference picture.

Type: Application

Filed: March 19, 2021

Publication date: October 14, 2021

Inventors: Jae Hoon Kim, Xiaosong Zhou, Dazhong Zhang, Hang Yuan, Jiefu Zhai, Chris Y. Chung, Hsi-Jung Wu
Efficient coding of video data in the presence of video annotations

Patent number: 11109042

Abstract: Systems and methods for coding a video to be overlaid by annotations are devised. A motion compensated predictive coding is employed, wherein coding parameters of video pixel blocks are determined based on the pixel blocks' relation to the annotations. A decoder decodes the video and annotates it based on metadata, obtained from the coder or other sources, describing the annotations' appearance and rendering mode.

Type: Grant

Filed: May 23, 2019

Date of Patent: August 31, 2021

Assignee: Apple Inc.

Inventors: Sudeng Hu, Xing Wen, Jae Hoon Kim, Peikang Song, Hang Yuan, Dazhong Zhang, Xiaosong Zhou, Hsi-Jung Wu, Christopher Garrido, Ming Jin, Patrick Miauton, Karthick Santhanam
In loop chroma deblocking filter

Patent number: 11102515

Abstract: Chroma deblock filtering of reconstructed video samples may be performed to remove blockiness artifacts and reduce color artifacts without over-smoothing. In a first method, chroma deblocking may be performed for boundary samples of a smallest transform size, regardless of partitions and coding modes. In a second method, chroma deblocking may be performed when a boundary strength is greater than 0. In a third method, chroma deblocking may be performed regardless of boundary strengths. In a fourth method, the type of chroma deblocking to be performed may be signaled in a slice header by a flag. Furthermore, luma deblock filtering techniques may be applied to chroma deblock filtering.

Type: Grant

Filed: June 2, 2020

Date of Patent: August 24, 2021

Assignee: Apple Inc.

Inventors: Jiefu Zhai, Dazhong Zhang, Xiaosong Zhou, Chris Y. Chung, Hsi-Jung Wu, Peikang Song, David R. Conrad, Jae Hoon Kim, Yunfei Zheng
Object tracking in multi-view video

Patent number: 11093752

Abstract: Techniques are disclosed for managing display of content from multi-view video data. According to these techniques, an object may be identified from content of the multi-view video. The object's location may be tracked across a sequence of multi-view video. The technique may extract a sub-set of video that is contained within a view window that is shifted in an image space of the multi-view video in correspondence to the tracked object's location. These techniques may be implemented either in an image source device or an image sink device.

Type: Grant

Filed: June 2, 2017

Date of Patent: August 17, 2021

Assignee: Apple Inc.

Inventors: Jae Hoon Kim, Ming Chen, Hang Yuan, Jiefu Zhai, Dazhong Zhang, Xiaosong Zhou, Chris Chung, Hsi-Jung Wu
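
The shifting view window described in the abstract above can be illustrated with a small sketch that centers a fixed-size window on the tracked object's location and keeps it inside the frame. The function name `view_window` and its parameters are hypothetical, and clamping at the frame edges is an assumption of this example; a spherical or wrap-around image space would need different edge handling.

```python
def view_window(center_x, center_y, win_w, win_h, frame_w, frame_h):
    """Return (left, top, right, bottom) of a window centered on the tracked object.

    All values are integer pixel coordinates. The window is shifted as needed
    so that it stays entirely inside the frame.
    """
    if win_w > frame_w or win_h > frame_h:
        raise ValueError("window must fit inside the frame")
    # Center the window on the object, then clamp to the frame boundaries.
    left = min(max(center_x - win_w // 2, 0), frame_w - win_w)
    top = min(max(center_y - win_h // 2, 0), frame_h - win_h)
    return left, top, left + win_w, top + win_h
```

Calling this once per frame with the tracker's latest object location yields the sub-set of the frame to extract for display.
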
