Patents by Inventor Xuehan Xiong

Xuehan Xiong has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250054306
    Abstract: Aspects of the disclosure are directed to methods and systems for short form previews of long form media items. A server can provide, to an artificial intelligence (AI) model, a long form media item to be shared with users. The server can receive, from the AI model, one or more frames that are predicted to contain content that is of interest to the users. The server can extract a segment of the long form media item that corresponds to the one or more frames, where the extracted segment corresponds to a short form media item preview. The short form media item preview can be provided for presentation to the users.
    Type: Application
    Filed: August 7, 2024
    Publication date: February 13, 2025
    Inventors: Daniel S. Cohen, Christopher R. Conover, Emily Rose Smith, Anoop Menon, Benjamin Lehn, Sudheendra Vijayanarasimhan, Bo Hu, Shen Yan, Xuehan Xiong, David Alexander Ross
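    A minimal sketch of the preview-extraction idea described in this entry, assuming the AI model can be abstracted as a per-frame interest scorer; extract_preview, score_fn, and all other names here are hypothetical stand-ins, not the claimed system.
      def extract_preview(frames, score_fn, preview_len=150):
          """Cut a contiguous window of preview_len frames centred on the
          frame that the model scores as most interesting."""
          scores = [score_fn(f) for f in frames]      # stand-in for the AI model's predictions
          peak = max(range(len(scores)), key=scores.__getitem__)
          start = max(0, min(peak - preview_len // 2, len(frames) - preview_len))
          return frames[start:start + preview_len]    # the short form media item preview

      # Toy usage: 1000 placeholder frames, with "interest" peaking at frame 700.
      frames = list(range(1000))
      preview = extract_preview(frames, score_fn=lambda f: -abs(f - 700))
      print(preview[0], preview[-1])                  # a 150-frame window centred on frame 700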
  • Patent number: 12169765
    Abstract: A machine learning scheme can be trained on a set of labeled training images of a subject in different poses, with different textures, and with different background environments. The label or marker data of the subject may be stored as metadata to a 3D model of the subject or rendered images of the subject. The machine learning scheme may be implemented as a supervised learning scheme that can automatically identify the labeled data to create a classification model. The classification model can classify a depicted subject in many different environments and arrangements (e.g., poses).
    Type: Grant
    Filed: September 8, 2023
    Date of Patent: December 17, 2024
    Assignee: Snap Inc.
    Inventors: Xuehan Xiong, Zehao Xue
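    A minimal sketch of the training-data side of this entry, assuming the renderer can be reduced to a function that returns image features together with the label metadata stored alongside each render; render_subject, the label vocabularies, and the feature size are hypothetical stand-ins, not the claimed scheme.
      import random

      POSES = ["standing", "sitting", "crouching"]       # hypothetical label vocabulary
      TEXTURES = ["denim", "leather", "cotton"]
      BACKGROUNDS = ["street", "beach", "indoor"]

      def render_subject(pose, texture, background):
          """Stand-in renderer: returns fake image features plus the marker
          metadata that travels with each rendered image of the 3D model."""
          features = [random.random() for _ in range(16)]
          metadata = {"pose": pose, "texture": texture, "background": background}
          return features, metadata

      def build_training_set(n=100):
          X, y = [], []
          for _ in range(n):
              feats, meta = render_subject(random.choice(POSES),
                                           random.choice(TEXTURES),
                                           random.choice(BACKGROUNDS))
              X.append(feats)
              y.append(meta["pose"])       # the stored label becomes the supervision target
          return X, y

      X, y = build_training_set()
      # Any off-the-shelf supervised classifier can consume (X, y) from here,
      # e.g. sklearn.linear_model.LogisticRegression().fit(X, y).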
  • Patent number: 12159215
    Abstract: A modulated segmentation system can use a modulator network to emphasize spatial prior data of an object to track the object across multiple images. The modulated segmentation system can use a segmentation network that receives spatial prior data as intermediate data that improves segmentation accuracy. The segmentation network can further receive visual guide information from a visual guide network to increase tracking accuracy via segmentation.
    Type: Grant
    Filed: October 18, 2023
    Date of Patent: December 3, 2024
    Assignee: Snap Inc.
    Inventors: Linjie Yang, Jianchao Yang, Xuehan Xiong, Yanran Wang
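    An illustrative PyTorch sketch of the modulation idea in this entry: a modulator turns a spatial prior (for example, the mask tracked from the previous frame) into gates that emphasise prior-consistent regions of an intermediate feature map in a small segmentation network. Layer sizes and names are assumptions; this is one plausible reading, not the patented architecture.
      import torch
      import torch.nn as nn

      class ModulatedSegmenter(nn.Module):
          def __init__(self, channels=16):
              super().__init__()
              self.encode = nn.Conv2d(3, channels, 3, padding=1)
              # Modulator: maps the 1-channel spatial prior to per-channel, per-location gates.
              self.modulator = nn.Sequential(nn.Conv2d(1, channels, 3, padding=1), nn.Sigmoid())
              self.decode = nn.Conv2d(channels, 1, 3, padding=1)

          def forward(self, image, spatial_prior):
              feats = torch.relu(self.encode(image))
              gates = self.modulator(spatial_prior)     # emphasise regions consistent with the prior
              feats = feats * gates                     # modulated intermediate features
              return torch.sigmoid(self.decode(feats))  # per-pixel object mask

      image = torch.rand(1, 3, 64, 64)
      prior = torch.rand(1, 1, 64, 64)                  # e.g. the previous frame's object mask
      mask = ModulatedSegmenter()(image, prior)
      print(mask.shape)                                 # torch.Size([1, 1, 64, 64])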
  • Publication number: 20240371164
    Abstract: Methods and systems for video localization using artificial intelligence are provided herein. A set of video embeddings representing features of one or more video frames of a media item and a set of textual embeddings corresponding to an event associated with the media item are obtained. Fused video-textual data is generated based on the set of video embeddings and the set of textual embeddings. The fused video-textual data indicates features of the video frames of the media item and textual data pertaining to the media item. The fused video-textual data is provided as an input to an artificial intelligence (AI) model trained to perform multiple video localization tasks with respect to media items of a platform. One or more outputs of the AI model are obtained. A segment of the media item that depicts the event is determined based on the one or more outputs of the AI model.
    Type: Application
    Filed: May 1, 2024
    Publication date: November 7, 2024
    Inventors: Shen Yan, Xuehan Xiong, Arsha Nagrani, Anurag Arnab, David Alexander Ross, Cordelia Schmid
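    A minimal NumPy sketch of the localization step described in this entry, with the trained AI model replaced by cosine similarity between per-frame video embeddings and a single text embedding of the event; all shapes, thresholds, and names are illustrative assumptions, not the claimed model.
      import numpy as np

      def localize_event(video_emb, text_emb, threshold=0.4):
          """video_emb: (T, D) per-frame embeddings; text_emb: (D,) event embedding.
          Returns (start, end) frame indices of the segment that depicts the event."""
          v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
          t = text_emb / np.linalg.norm(text_emb)
          scores = v @ t                        # stand-in for the fused video-textual model
          hits = np.flatnonzero(scores > threshold)
          if hits.size == 0:
              return None
          return int(hits[0]), int(hits[-1])    # first and last matching frame

      # Toy usage: the "event" signature is injected into frames 120-179.
      rng = np.random.default_rng(0)
      text = rng.normal(size=64)
      video = rng.normal(size=(300, 64))
      video[120:180] += 0.8 * text
      print(localize_event(video, text))        # approximately (120, 179)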
  • Publication number: 20240372963
    Abstract: A machine learning system can generate an image mask (e.g., a pixel mask) comprising pixel assignments for pixels. The pixels can be assigned to classes, including, for example, face, clothes, body skin, or hair. The machine learning system can be implemented using a convolutional neural network that is configured to execute efficiently on computing devices having limited resources, such as mobile phones. The pixel mask can be used to more accurately display video effects interacting with a user or subject depicted in the image.
    Type: Application
    Filed: July 15, 2024
    Publication date: November 7, 2024
    Inventors: Lidiia Bogdanovych, William Brendel, Samuel Edward Hare, Fedir Poliakov, Guohui Wang, Xuehan Xiong, Jianchao Yang, Linjie Yang
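    An illustrative PyTorch sketch of the kind of lightweight per-pixel classifier this entry describes: per-pixel class logits are computed by a small network and an argmax yields the pixel mask. The depthwise-separable design, class list, layer sizes, and resolution are all assumptions, not the patented network.
      import torch
      import torch.nn as nn

      CLASSES = ["background", "face", "clothes", "body_skin", "hair"]

      def separable(cin, cout):
          return nn.Sequential(
              nn.Conv2d(cin, cin, 3, padding=1, groups=cin),  # depthwise
              nn.Conv2d(cin, cout, 1),                        # pointwise
              nn.ReLU())

      model = nn.Sequential(
          separable(3, 16), separable(16, 32),
          nn.Conv2d(32, len(CLASSES), 1))       # per-pixel class logits

      image = torch.rand(1, 3, 128, 128)
      logits = model(image)                     # (1, 5, 128, 128)
      pixel_mask = logits.argmax(dim=1)         # (1, 128, 128): one class index per pixel
      print(pixel_mask.shape, pixel_mask.unique())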
  • Publication number: 20240346824
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing action localization on an input video. In particular, a system maintains a set of query vectors and uses the input video and the set of query vectors to generate an action localization output for the input video. The action localization output includes, for each of one or more agents depicted in the video, data specifying, for each of one or more video frames in the video, a respective bounding box in the video frame that depicts the agent and a respective action from a set of actions that is being performed by the agent in the video frame.
    Type: Application
    Filed: April 12, 2024
    Publication date: October 17, 2024
    Inventors: Alexey Alexeevich Gritsenko, Xuehan Xiong, Josip Djolonga, Mostafa Dehghani, Chen Sun, Mario Lucic, Cordelia Luise Schmid, Anurag Arnab
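    An illustrative PyTorch sketch of the query-vector formulation in this entry, read in a DETR-like way: a maintained set of learned queries cross-attends to per-frame video features, and small heads map each query to a per-frame bounding box and action. All dimensions and head designs are assumptions, not the claimed model.
      import torch
      import torch.nn as nn

      class QueryActionLocalizer(nn.Module):
          def __init__(self, dim=64, num_queries=8, num_actions=10):
              super().__init__()
              self.queries = nn.Parameter(torch.randn(num_queries, dim))   # the maintained query vectors
              self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
              self.box_head = nn.Linear(dim, 4)              # (x, y, w, h) per frame
              self.action_head = nn.Linear(dim, num_actions)

          def forward(self, frame_feats):                    # (B, T, dim) per-frame video features
              B = frame_feats.shape[0]
              q = self.queries.unsqueeze(0).expand(B, -1, -1)
              # Each query attends over all frames to gather evidence for one agent.
              agent_feats, _ = self.attn(q, frame_feats, frame_feats)          # (B, Q, dim)
              per_frame = agent_feats.unsqueeze(2) + frame_feats.unsqueeze(1)  # (B, Q, T, dim)
              boxes = self.box_head(per_frame).sigmoid()     # (B, Q, T, 4) box per agent per frame
              actions = self.action_head(per_frame)          # (B, Q, T, num_actions) action logits
              return boxes, actions

      boxes, actions = QueryActionLocalizer()(torch.rand(2, 16, 64))
      print(boxes.shape, actions.shape)   # (2, 8, 16, 4) and (2, 8, 16, 10)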
  • Patent number: 12075190
    Abstract: A machine learning system can generate an image mask (e.g., a pixel mask) comprising pixel assignments for pixels. The pixels can be assigned to classes, including, for example, face, clothes, body skin, or hair. The machine learning system can be implemented using a convolutional neural network that is configured to execute efficiently on computing devices having limited resources, such as mobile phones. The pixel mask can be used to more accurately display video effects interacting with a user or subject depicted in the image.
    Type: Grant
    Filed: July 13, 2023
    Date of Patent: August 27, 2024
    Assignee: Snap Inc.
    Inventors: Lidiia Bogdanovych, William Brendel, Samuel Edward Hare, Fedir Poliakov, Guohui Wang, Xuehan Xiong, Jianchao Yang, Linjie Yang
  • Publication number: 20240249522
    Abstract: A mobile device can generate real-time complex visual image effects using an asynchronous processing pipeline. A first pipeline applies a complex image process, such as a neural network, to keyframes of a live image sequence. A second pipeline generates flow maps that describe feature transformations in the image sequence. The flow maps can be used to process non-keyframes on the fly. The processed keyframes and non-keyframes can be used to display a complex visual effect on the mobile device in real-time or near real-time.
    Type: Application
    Filed: April 2, 2024
    Publication date: July 25, 2024
    Inventors: Samuel Edward Hare, Fedir Poliakov, Guohui Wang, Xuehan Xiong, Jianchao Yang, Linjie Yang, Shah Tanmay Anilkumar
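    A minimal sketch of the two-pipeline idea in this entry: an expensive keyframe process runs asynchronously on a worker thread, while a cheap flow-style step propagates the latest keyframe result to every frame as it arrives. heavy_effect and flow_warp are hypothetical stand-ins for the neural network and the flow-map warp.
      from concurrent.futures import ThreadPoolExecutor
      import time

      def heavy_effect(frame):          # pipeline 1: slow, keyframes only
          time.sleep(0.05)              # pretend this is a neural network
          return f"effect({frame})"

      def flow_warp(result, frame):     # pipeline 2: fast, every frame
          return f"{result} warped onto {frame}"

      frames = [f"frame{i}" for i in range(12)]
      KEYFRAME_EVERY = 4

      with ThreadPoolExecutor(max_workers=1) as pool:
          latest = heavy_effect(frames[0])                     # seed with the first keyframe
          pending = None
          for i, frame in enumerate(frames):
              if i % KEYFRAME_EVERY == 0:
                  pending = pool.submit(heavy_effect, frame)   # kick off async keyframe work
              if pending is not None and pending.done():
                  latest = pending.result()                    # adopt the newest keyframe result
              print(flow_warp(latest, frame))                  # serve every frame without waiting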
  • Patent number: 11989938
    Abstract: A mobile device can generate real-time complex visual image effects using an asynchronous processing pipeline. A first pipeline applies a complex image process, such as a neural network, to keyframes of a live image sequence. A second pipeline generates flow maps that describe feature transformations in the image sequence. The flow maps can be used to process non-keyframes on the fly. The processed keyframes and non-keyframes can be used to display a complex visual effect on the mobile device in real-time or near real-time.
    Type: Grant
    Filed: May 4, 2023
    Date of Patent: May 21, 2024
    Assignee: Snap Inc.
    Inventors: Samuel Edward Hare, Fedir Poliakov, Guohui Wang, Xuehan Xiong, Jianchao Yang, Linjie Yang, Shah Tanmay Anilkumar
  • Publication number: 20240046072
    Abstract: A modulated segmentation system can use a modulator network to emphasize spatial prior data of an object to track the object across multiple images. The modulated segmentation system can use a segmentation network that receives spatial prior data as intermediate data that improves segmentation accuracy. The segmentation network can further receive visual guide information from a visual guide network to increase tracking accuracy via segmentation.
    Type: Application
    Filed: October 18, 2023
    Publication date: February 8, 2024
    Inventors: Linjie Yang, Jianchao Yang, Xuehan Xiong, Yanran Wang
  • Publication number: 20230419538
    Abstract: A method includes receiving video data that includes a series of frames of image data. Here, the video data is representative of an actor performing an activity. The method also includes processing the video data to generate a spatial input stream including a series of spatial images representative of spatial features of the actor performing the activity, a temporal input stream representative of motion of the actor performing the activity, and a pose input stream including a series of images representative of a pose of the actor performing the activity. Using at least one neural network, the method also includes processing the temporal input stream, the spatial input stream, and the pose input stream. The method also includes classifying, by the at least one neural network, the activity based on the temporal input stream, the spatial input stream, and the pose input stream.
    Type: Application
    Filed: September 11, 2023
    Publication date: December 28, 2023
    Applicant: Google LLC
    Inventors: Yinxiao Li, Zhichao Lu, Xuehan Xiong, Jonathan Huang
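    An illustrative PyTorch sketch of the three-stream classification described in this entry: spatial (RGB), temporal (motion), and pose streams are encoded separately, fused, and classified into an activity. Channel counts, resolutions, and the activity vocabulary are arbitrary assumptions, not the claimed method.
      import torch
      import torch.nn as nn

      class ThreeStreamClassifier(nn.Module):
          def __init__(self, num_activities=5, dim=32):
              super().__init__()
              self.spatial = nn.Sequential(nn.Conv2d(3, dim, 3, padding=1),
                                           nn.AdaptiveAvgPool2d(1), nn.Flatten())
              self.temporal = nn.Sequential(nn.Conv2d(2, dim, 3, padding=1),   # e.g. optical flow
                                            nn.AdaptiveAvgPool2d(1), nn.Flatten())
              self.pose = nn.Sequential(nn.Conv2d(1, dim, 3, padding=1),       # rendered pose images
                                        nn.AdaptiveAvgPool2d(1), nn.Flatten())
              self.classify = nn.Linear(3 * dim, num_activities)

          def forward(self, spatial, temporal, pose):
              fused = torch.cat([self.spatial(spatial),
                                 self.temporal(temporal),
                                 self.pose(pose)], dim=1)
              return self.classify(fused)          # activity logits

      model = ThreeStreamClassifier()
      logits = model(torch.rand(1, 3, 64, 64),     # spatial stream: RGB frame
                     torch.rand(1, 2, 64, 64),     # temporal stream: motion field
                     torch.rand(1, 1, 64, 64))     # pose stream: pose image
      print(logits.shape)                          # torch.Size([1, 5])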
  • Publication number: 20230419188
    Abstract: A machine learning scheme can be trained on a set of labeled training images of a subject in different poses, with different textures, and with different background environments. The label or marker data of the subject may be stored as metadata to a 3D model of the subject or rendered images of the subject. The machine learning scheme may be implemented as a supervised learning scheme that can automatically identify the labeled data to create a classification model. The classification model can classify a depicted subject in many different environments and arrangements (e.g., poses).
    Type: Application
    Filed: September 8, 2023
    Publication date: December 28, 2023
    Inventors: Xuehan Xiong, Zehao Xue
  • Patent number: 11847528
    Abstract: A modulated segmentation system can use a modulator network to emphasize spatial prior data of an object to track the object across multiple images. The modulated segmentation system can use a segmentation network that receives spatial prior data as intermediate data that improves segmentation accuracy. The segmentation network can further receive visual guide information from a visual guide network to increase tracking accuracy via segmentation.
    Type: Grant
    Filed: December 29, 2022
    Date of Patent: December 19, 2023
    Assignee: Snap Inc.
    Inventors: Linjie Yang, Jianchao Yang, Xuehan Xiong, Yanran Wang
  • Publication number: 20230362331
    Abstract: A machine learning system can generate an image mask (e.g., a pixel mask) comprising pixel assignments for pixels. The pixels can be assigned to classes, including, for example, face, clothes, body skin, or hair. The machine learning system can be implemented using a convolutional neural network that is configured to execute efficiently on computing devices having limited resources, such as mobile phones. The pixel mask can be used to more accurately display video effects interacting with a user or subject depicted in the image.
    Type: Application
    Filed: July 13, 2023
    Publication date: November 9, 2023
    Inventors: Lidiia Bogdanovych, William Brendel, Samuel Edward Hare, Fedir Poliakov, Guohui Wang, Xuehan Xiong, Jianchao Yang, Linjie Yang
  • Patent number: 11790276
    Abstract: A machine learning scheme can be trained on a set of labeled training images of a subject in different poses, with different textures, and with different background environments. The label or marker data of the subject may be stored as metadata to a 3D model of the subject or rendered images of the subject. The machine learning scheme may be implemented as a supervised learning scheme that can automatically identify the labeled data to create a classification model. The classification model can classify a depicted subject in many different environments and arrangements (e.g., poses).
    Type: Grant
    Filed: May 17, 2021
    Date of Patent: October 17, 2023
    Assignee: Snap Inc.
    Inventors: Xuehan Xiong, Zehao Xue
  • Patent number: 11776156
    Abstract: A method includes receiving video data that includes a series of frames of image data. Here, the video data is representative of an actor performing an activity. The method also includes processing the video data to generate a spatial input stream including a series of spatial images representative of spatial features of the actor performing the activity, a temporal input stream representative of motion of the actor performing the activity, and a pose input stream including a series of images representative of a pose of the actor performing the activity. Using at least one neural network, the method also includes processing the temporal input stream, the spatial input stream, and the pose input stream. The method also includes classifying, by the at least one neural network, the activity based on the temporal input stream, the spatial input stream, and the pose input stream.
    Type: Grant
    Filed: June 11, 2021
    Date of Patent: October 3, 2023
    Assignee: Google LLC
    Inventors: Yinxiao Li, Zhichao Lu, Xuehan Xiong, Jonathan Huang
  • Publication number: 20230274543
    Abstract: A mobile device can generate real-time complex visual image effects using an asynchronous processing pipeline. A first pipeline applies a complex image process, such as a neural network, to keyframes of a live image sequence. A second pipeline generates flow maps that describe feature transformations in the image sequence. The flow maps can be used to process non-keyframes on the fly. The processed keyframes and non-keyframes can be used to display a complex visual effect on the mobile device in real-time or near real-time.
    Type: Application
    Filed: May 4, 2023
    Publication date: August 31, 2023
    Inventors: Samuel Edward Hare, Fedir Poliakov, Guohui Wang, Xuehan Xiong, Jianchao Yang, Linjie Yang, Shah Tanmay Anilkumar
  • Patent number: 11743426
    Abstract: A machine learning system can generate an image mask (e.g., a pixel mask) comprising pixel assignments for pixels. The pixels can be assigned to classes, including, for example, face, clothes, body skin, or hair. The machine learning system can be implemented using a convolutional neural network that is configured to execute efficiently on computing devices having limited resources, such as mobile phones. The pixel mask can be used to more accurately display video effects interacting with a user or subject depicted in the image.
    Type: Grant
    Filed: August 13, 2020
    Date of Patent: August 29, 2023
    Assignee: Snap Inc.
    Inventors: Lidiia Bogdanovych, William Brendel, Samuel Edward Hare, Fedir Poliakov, Guohui Wang, Xuehan Xiong, Jianchao Yang, Linjie Yang
  • Patent number: 11676381
    Abstract: A mobile device can generate real-time complex visual image effects using an asynchronous processing pipeline. A first pipeline applies a complex image process, such as a neural network, to keyframes of a live image sequence. A second pipeline generates flow maps that describe feature transformations in the image sequence. The flow maps can be used to process non-keyframes on the fly. The processed keyframes and non-keyframes can be used to display a complex visual effect on the mobile device in real-time or near real-time.
    Type: Grant
    Filed: January 22, 2021
    Date of Patent: June 13, 2023
    Assignee: Snap Inc.
    Inventors: Samuel Edward Hare, Fedir Poliakov, Guohui Wang, Xuehan Xiong, Jianchao Yang, Linjie Yang, Shah Tanmay Anilkumar
  • Patent number: 11645843
    Abstract: A mobile device can generate real-time complex visual image effects using an asynchronous processing pipeline. A first pipeline applies a complex image process, such as a neural network, to keyframes of a live image sequence. A second pipeline generates flow maps that describe feature transformations in the image sequence. The flow maps can be used to process non-keyframes on the fly. The processed keyframes and non-keyframes can be used to display a complex visual effect on the mobile device in real-time or near real-time.
    Type: Grant
    Filed: January 22, 2021
    Date of Patent: May 9, 2023
    Assignee: Snap Inc.
    Inventors: Samuel Edward Hare, Fedir Poliakov, Guohui Wang, Xuehan Xiong, Jianchao Yang, Linjie Yang, Shah Tanmay Anilkumar