Patents by Inventor Xiaohui Shen

Xiaohui Shen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10319412
    Abstract: The present disclosure is directed toward systems and methods for tracking objects in videos. For example, one or more embodiments described herein utilize various tracking methods in combination with an image search index made up of still video frames indexed from a video. One or more embodiments described herein utilize a backward and forward tracking method that is anchored by one or more key frames in order to accurately track an object through the frames of a video, even when the video is long and may include challenging conditions.
    Type: Grant
    Filed: November 16, 2016
    Date of Patent: June 11, 2019
    Assignee: Adobe Inc.
    Inventors: Zhihong Ding, Zhe Lin, Xiaohui Shen, Michael Kaplan, Jonathan Brandt
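The abstract does not specify how the forward and backward passes are combined, so the sketch below (all function names are mine) illustrates one plausible reading: between two key frames, a track run forward from the earlier anchor is blended with a track run backward from the later anchor, trusting each track more near its own key frame.

```python
import numpy as np

def blend_tracks(forward, backward):
    """Combine a forward track (anchored at the earlier key frame) with a
    backward track (anchored at the later key frame) over the frames between
    two key frames. Each track is an (n, 4) array of [x, y, w, h] boxes.
    The forward track's weight falls off linearly with distance from its
    anchor, so frames near a key frame trust the track anchored there."""
    forward = np.asarray(forward, dtype=float)
    backward = np.asarray(backward, dtype=float)
    n = len(forward)
    # forward weight: 1 at frame 0 (earlier key frame), 0 at frame n-1
    w = np.linspace(1.0, 0.0, n)[:, None]
    return w * forward + (1.0 - w) * backward

# Both tracks agree at the key frames but drift apart in between.
fwd = np.array([[0, 0, 10, 10], [2, 0, 10, 10], [6, 0, 10, 10]])
bwd = np.array([[0, 0, 10, 10], [3, 0, 10, 10], [6, 0, 10, 10]])
merged = blend_tracks(fwd, bwd)
```

The linear fall-off is only one choice of weighting; a confidence-driven schedule would fit the same anchored backward/forward scheme.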
  • Patent number: 10311574
    Abstract: A digital medium environment includes an image processing application that performs object segmentation on an input image. An improved object segmentation method implemented by the image processing application comprises receiving an input image that includes an object region to be segmented by a segmentation process, processing the input image to provide a first segmentation that defines the object region, and processing the first segmentation to provide a second segmentation that provides pixel-wise label assignments for the object region. In some implementations, the image processing application performs improved sky segmentation on an input image containing a depiction of a sky.
    Type: Grant
    Filed: December 22, 2017
    Date of Patent: June 4, 2019
    Assignee: Adobe Inc.
    Inventors: Xiaohui Shen, Zhe Lin, Yi-Hsuan Tsai, Kalyan K. Sunkavalli
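The two-stage structure described above — a coarse first segmentation followed by pixel-wise label refinement — can be sketched with stand-ins for the learned models: thresholding plays the role of the first stage, and a 3×3 majority vote plays the role of the pixel-wise second stage. Both function names and the specific operations are mine, not the patent's.

```python
import numpy as np

def coarse_segmentation(img, thresh=0.5):
    """First stage: a coarse binary mask marking the candidate object region
    (plain intensity thresholding as a stand-in for a learned model)."""
    return (img > thresh).astype(np.uint8)

def refine_labels(mask):
    """Second stage: pixel-wise label assignment that smooths the coarse
    mask by majority vote over each pixel's 3x3 neighbourhood."""
    h, w = mask.shape
    padded = np.pad(mask, 1, mode='constant')
    out = np.zeros_like(mask)
    for i in range(h):
        for j in range(w):
            out[i, j] = 1 if padded[i:i + 3, j:j + 3].sum() >= 5 else 0
    return out

img = np.zeros((5, 5))
img[1:4, 1:4] = 1.0   # the true object region
img[0, 4] = 1.0       # an isolated noisy pixel
coarse = coarse_segmentation(img)
refined = refine_labels(coarse)
```

The refinement step removes the isolated false positive that survives the coarse stage, which is the kind of correction a second, pixel-wise pass is for.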
  • Publication number: 20190164261
    Abstract: Systems and techniques for estimating illumination from a single image are provided. An example system may include a neural network. The neural network may include an encoder that is configured to encode an input image into an intermediate representation. The neural network may also include an intensity decoder that is configured to decode the intermediate representation into an output light intensity map. An example intensity decoder is generated by a multi-phase training process that includes a first phase to train a light mask decoder using a set of low dynamic range images and a second phase to adjust parameters of the light mask decoder using a set of high dynamic range images to generate the intensity decoder.
    Type: Application
    Filed: November 28, 2017
    Publication date: May 30, 2019
    Inventors: Kalyan Sunkavalli, Mehmet Ersin Yumer, Marc-Andre Gardner, Xiaohui Shen, Jonathan Eisenmann, Emiliano Gambaretto
  • Patent number: 10297881
    Abstract: Embodiments of the present disclosure provide a battery heating system, a battery assembly and an electric vehicle. The battery heating system includes: a battery group having a positive terminal and a negative terminal; a switch having a first end connected with the positive terminal; a large-current discharge module, and a controller connected to the switch and configured to control the switch according to a temperature of the battery group. A first end of the large-current discharge module is connected to a second end of the switch, and a second end of the large-current discharge module is connected to the negative terminal. When the switch is turned on, the battery group discharges via the large-current discharge module and the battery group is heated due to an internal resistance thereof.
    Type: Grant
    Filed: December 19, 2016
    Date of Patent: May 21, 2019
    Assignee: BYD COMPANY LIMITED
    Inventors: Xi Shen, Wenfeng Jiang, Jin Liu, Xiaohui Jia
  • Publication number: 20190147224
    Abstract: Approaches are described for determining facial landmarks in images. An input image is provided to at least one trained neural network that determines a face region (e.g., bounding box of a face) of the input image and initial facial landmark locations corresponding to the face region. The initial facial landmark locations are provided to a 3D face mapper that maps the initial facial landmark locations to a 3D face model. A set of facial landmark locations are determined from the 3D face model. The set of facial landmark locations are provided to a landmark location adjuster that adjusts positions of the set of facial landmark locations based on the input image. The input image is presented on a user device using the adjusted set of facial landmark locations.
    Type: Application
    Filed: November 16, 2017
    Publication date: May 16, 2019
    Inventors: Haoxiang Li, Zhe Lin, Jonathan Brandt, Xiaohui Shen
  • Patent number: 10290112
    Abstract: Techniques for planar region-guided estimates of 3D geometry of objects depicted in a single 2D image. The techniques estimate regions of an image that are part of planar regions (i.e., flat surfaces) and use those planar region estimates to estimate the 3D geometry of the objects in the image. The planar regions and resulting 3D geometry are estimated using only a single 2D image of the objects. Training data from images of other objects is used to train a CNN with a model that is then used to make planar region estimates using a single 2D image. The planar region estimates, in one example, are based on estimates of planarity (surface plane information) and estimates of edges (depth discontinuities and edges between surface planes) that are estimated using models trained using images of other scenes.
    Type: Grant
    Filed: June 4, 2018
    Date of Patent: May 14, 2019
    Assignee: Adobe Inc.
    Inventors: Xiaohui Shen, Scott Cohen, Peng Wang, Bryan Russell, Brian Price, Jonathan Eisenmann
  • Publication number: 20190130229
    Abstract: Systems, methods, and non-transitory computer-readable media are disclosed for segmenting objects in digital visual media utilizing one or more salient content neural networks. In particular, in one or more embodiments, the disclosed systems and methods train one or more salient content neural networks to efficiently identify foreground pixels in digital visual media. Moreover, in one or more embodiments, the disclosed systems and methods provide a trained salient content neural network to a mobile device, allowing the mobile device to directly select salient objects in digital visual media utilizing a trained neural network. Furthermore, in one or more embodiments, the disclosed systems and methods train and provide multiple salient content neural networks, such that mobile devices can identify objects in real-time digital visual media feeds (utilizing a first salient content neural network) and identify objects in static digital images (utilizing a second salient content neural network).
    Type: Application
    Filed: October 31, 2017
    Publication date: May 2, 2019
    Inventors: Xin Lu, Zhe Lin, Xiaohui Shen, Jimei Yang, Jianming Zhang, Jen-Chan Jeff Chien, Chenxi Liu
  • Publication number: 20190114818
    Abstract: Predicting patch displacement maps using a neural network is described. Initially, a digital image on which an image editing operation is to be performed is provided as input to a patch matcher having an offset prediction neural network. From this image and based on the image editing operation for which this network is trained, the offset prediction neural network generates an offset prediction formed as a displacement map, which has offset vectors that represent a displacement of pixels of the digital image to different locations for performing the image editing operation. Pixel values of the digital image are copied to the image pixels affected by the operation by: determining the offset vectors that correspond to the image pixels affected by the image editing operation and mapping the pixel values of the image pixels represented by the determined offset vectors to the affected pixels.
    Type: Application
    Filed: October 16, 2017
    Publication date: April 18, 2019
    Applicant: Adobe Systems Incorporated
    Inventors: Zhe Lin, Xin Lu, Xiaohui Shen, Jimei Yang, Jiahui Yu
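The copy step described in the abstract — following each affected pixel's offset vector and mapping the source pixel's value onto it — reduces to a simple lookup once the displacement map exists. A minimal sketch (function name and array layout are mine; the network that predicts the offsets is out of scope):

```python
import numpy as np

def apply_displacement_map(image, offsets, hole_mask):
    """For every pixel inside hole_mask, copy the value found at
    (row + dy, col + dx), where (dy, dx) is that pixel's offset vector
    from the predicted displacement map."""
    out = image.copy()
    rows, cols = np.nonzero(hole_mask)
    for r, c in zip(rows, cols):
        dy, dx = offsets[r, c]
        out[r, c] = image[r + dy, c + dx]
    return out

img = np.arange(16).reshape(4, 4)
hole = np.zeros((4, 4), dtype=bool)
hole[1, 1] = True                 # one pixel affected by the edit
offsets = np.zeros((4, 4, 2), dtype=int)
offsets[1, 1] = (0, 2)            # its offset vector: two columns right
filled = apply_displacement_map(img, offsets, hole)
```

Pixels outside the mask are untouched; each hole pixel simply inherits the value its offset vector points at.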
  • Publication number: 20190114748
    Abstract: Digital image completion using deep learning is described. Initially, a digital image having at least one hole is received. This holey digital image is provided as input to an image completer formed with a framework that combines generative and discriminative neural networks based on the learning architecture of generative adversarial networks. From the holey digital image, the generative neural network generates a filled digital image having hole-filling content in place of the holes. The discriminative neural networks detect whether the filled digital image and its hole-filling content include computer-generated content or are photo-realistic. The generating and detecting continue iteratively until the discriminative neural networks fail to detect computer-generated content in the filled digital image and its hole-filling content, or until detection surpasses a threshold difficulty.
    Type: Application
    Filed: October 16, 2017
    Publication date: April 18, 2019
    Applicant: Adobe Systems Incorporated
    Inventors: Zhe Lin, Xin Lu, Xiaohui Shen, Jimei Yang, Jiahui Yu
  • Publication number: 20190108640
    Abstract: Various embodiments describe using a neural network to evaluate image crops in substantially real-time. In an example, a computer system performs unsupervised training of a first neural network based on unannotated image crops, followed by a supervised training of the first neural network based on annotated image crops. Once this first neural network is trained, the computer system inputs image crops generated from images to this trained network and receives composition scores therefrom. The computer system performs supervised training of a second neural network based on the images and the composition scores.
    Type: Application
    Filed: October 11, 2017
    Publication date: April 11, 2019
    Inventors: Jianming Zhang, Zijun Wei, Zhe Lin, Xiaohui Shen, Radomir Mech
  • Publication number: 20190109981
    Abstract: Various embodiments describe facilitating real-time crops on an image. In an example, an image processing application executed on a device receives image data corresponding to a field of view of a camera of the device. The image processing application renders a major view on a display of the device in a preview mode. The major view presents a previewed image based on the image data. The image processing application receives a composition score of a cropped image from a deep-learning system. The image processing application renders a sub-view presenting the cropped image based on the composition score in a preview mode. Based on a user interaction, the image processing application renders the cropped image in the major view with the sub-view in the preview mode.
    Type: Application
    Filed: October 11, 2017
    Publication date: April 11, 2019
    Inventors: Jianming Zhang, Zijun Wei, Zhe Lin, Xiaohui Shen, Radomir Mech
  • Publication number: 20190110002
    Abstract: Various embodiments describe view switching of video on a computing device. In an example, a video processing application executed on the computing device receives a stream of video data. The video processing application renders a major view on a display of the computing device. The major view presents a video from the stream of video data. The video processing application inputs the stream of video data to a deep learning system and receives back information that identifies a cropped video from the video based on a composition score of the cropped video, while the video is presented in the major view. The composition score is generated by the deep learning system. The video processing application renders a sub-view on a display of the device, the sub-view presenting the cropped video. The video processing application renders the cropped video in the major view based on a user interaction with the sub-view.
    Type: Application
    Filed: October 11, 2017
    Publication date: April 11, 2019
    Inventors: Jianming Zhang, Zijun Wei, Zhe Lin, Xiaohui Shen, Radomir Mech
  • Patent number: 10257436
    Abstract: Various embodiments describe view switching of video on a computing device. In an example, a video processing application receives a stream of video data. The video processing application renders a major view on a display of the computing device. The major view presents a video from the stream of video data. The video processing application inputs the stream of video data to a deep learning system and receives back information that identifies a cropped video from the video based on a composition score of the cropped video, while the video is presented in the major view. The composition score is generated by the deep learning system. The video processing application renders a sub-view on a display of the device, the sub-view presenting the cropped video. The video processing application renders the cropped video in the major view based on a user interaction with the sub-view.
    Type: Grant
    Filed: October 11, 2017
    Date of Patent: April 9, 2019
    Assignee: Adobe Systems Incorporated
    Inventors: Jianming Zhang, Zijun Wei, Zhe Lin, Xiaohui Shen, Radomir Mech
  • Patent number: 10235623
    Abstract: Embodiments of the present invention provide an automated image tagging system that can predict a set of tags, along with relevance scores, that can be used for keyword-based image retrieval, image tag proposal, and image tag auto-completion based on user input. Initially, during training, a clustering technique is utilized to reduce cluster imbalance in the data that is input into a convolutional neural network (CNN) for training feature data. In embodiments, the clustering technique can also be utilized to compute data point similarity that can be utilized for tag propagation (to tag untagged images). During testing, a diversity based voting framework is utilized to overcome user tagging biases. In some embodiments, bigram re-weighting can down-weight a keyword that is likely to be part of a bigram based on a predicted tag set.
    Type: Grant
    Filed: April 8, 2016
    Date of Patent: March 19, 2019
    Assignee: Adobe Inc.
    Inventors: Zhe Lin, Xiaohui Shen, Jonathan Brandt, Jianming Zhang, Chen Fang
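The bigram re-weighting step mentioned at the end of the abstract — down-weighting a keyword that is likely part of a bigram already in the predicted tag set — can be sketched as follows. The function name, the fixed factor, and the "word of a multi-word tag" test are my illustrative choices, not the patent's formulation.

```python
def reweight_bigrams(scores, factor=0.5):
    """scores: {tag: relevance}. If a tag is a constituent word of a
    multi-word tag that was also predicted (e.g. 'york' alongside
    'new york'), multiply its score by `factor`."""
    multiword_parts = set()
    for tag in scores:
        parts = tag.split()
        if len(parts) > 1:
            multiword_parts.update(parts)
    return {t: s * factor if t in multiword_parts else s
            for t, s in scores.items()}

tags = {'new york': 0.9, 'york': 0.8, 'dog': 0.7}
adjusted = reweight_bigrams(tags)
```

Here 'york' is down-weighted because 'new york' already covers it, while unrelated tags keep their scores.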
  • Patent number: 10216766
    Abstract: A framework is provided for associating images with topics utilizing embedding learning. The framework is trained utilizing images, each having multiple visual characteristics and multiple keyword tags associated therewith. Visual features are computed from the visual characteristics utilizing a convolutional neural network and an image feature vector is generated therefrom. The keyword tags are utilized to generate a weighted word vector (or “soft topic feature vector”) for each image by calculating a weighted average of word vector representations that represent the keyword tags associated with the image. The image feature vector and the soft topic feature vector are aligned in a common embedding space and a relevancy score is computed for each of the keyword tags. Once trained, the framework can automatically tag images and a text-based search engine can rank image relevance with respect to queried keywords based upon predicted relevancy scores.
    Type: Grant
    Filed: March 20, 2017
    Date of Patent: February 26, 2019
    Assignee: Adobe Inc.
    Inventors: Zhe Lin, Xiaohui Shen, Jianming Zhang, Hailin Jin, Yingwei Li
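The "soft topic feature vector" described above is a weighted average of the word vectors of an image's keyword tags. A minimal sketch (the function name is mine, and uniform weights stand in for whatever weighting the trained framework derives):

```python
import numpy as np

def soft_topic_vector(tags, word_vectors, tag_weights=None):
    """Weighted average of the word-vector representations of an image's
    keyword tags. Uniform weights by default; in the trained framework
    per-tag weights would come from the model."""
    vecs = np.stack([word_vectors[t] for t in tags])
    if tag_weights is None:
        tag_weights = np.ones(len(tags))
    w = np.asarray(tag_weights, dtype=float)
    w = w / w.sum()          # normalise so the result is a true average
    return w @ vecs

# Toy 2-D word vectors for two tags.
wv = {'dog': np.array([1.0, 0.0]), 'cat': np.array([0.0, 1.0])}
uniform = soft_topic_vector(['dog', 'cat'], wv)
skewed = soft_topic_vector(['dog', 'cat'], wv, tag_weights=[3, 1])
```

Once the image feature vector and this topic vector live in the common embedding space, a similarity such as cosine distance between them yields the per-tag relevancy score the abstract mentions.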
  • Publication number: 20190035083
    Abstract: The invention is directed towards segmenting images based on natural language phrases. An image and an n-gram, including a sequence of tokens, are received. An encoding of image features and a sequence of token vectors are generated. A fully convolutional neural network identifies and encodes the image features. A word embedding model generates the token vectors. A recurrent neural network (RNN) iteratively updates a segmentation map based on combinations of the image feature encoding and the token vectors. The segmentation map identifies which pixels are included in an image region referenced by the n-gram. A segmented image is generated based on the segmentation map. The RNN may be a convolutional multimodal RNN. A separate RNN, such as a long short-term memory network, may iteratively update an encoding of semantic features based on the order of tokens. The first RNN may update the segmentation map based on the semantic feature encoding.
    Type: Application
    Filed: August 29, 2018
    Publication date: January 31, 2019
    Inventors: Zhe Lin, Xin Lu, Xiaohui Shen, Jimei Yang, Chenxi Liu
  • Publication number: 20190026609
    Abstract: Techniques and systems are described to determine personalized digital image aesthetics in a digital medium environment. In one example, a personalized offset is generated to adapt a generic model for digital image aesthetics. A generic model, once trained, is used to generate training aesthetics scores from a personal training data set that corresponds to an entity, e.g., a particular user, group of users, and so on. The image aesthetics system then generates residual scores (e.g., offsets) as a difference between the training aesthetics score and the personal aesthetics score for the personal training digital images. The image aesthetics system then employs machine learning to train a personalized model to predict the residual scores as a personalized offset using the residual scores and personal training digital images.
    Type: Application
    Filed: July 24, 2017
    Publication date: January 24, 2019
    Applicant: Adobe Systems Incorporated
    Inventors: Xiaohui Shen, Zhe Lin, Radomir Mech, Jian Ren
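The residual ("personalized offset") arithmetic in the abstract is simple to make concrete: training targets are the gaps between the user's own scores and the generic model's scores, and at inference time the predicted offset is added back onto the generic score. Function names are mine; the regression model that predicts the residuals is out of scope.

```python
import numpy as np

def residual_targets(generic_scores, personal_scores):
    """Targets for training the personalized model: the gap between the
    user's own aesthetics scores and the generic model's scores."""
    return (np.asarray(personal_scores, dtype=float)
            - np.asarray(generic_scores, dtype=float))

def personalized_score(generic_score, predicted_residual):
    """At inference time, the personalized offset is added back on top
    of the generic aesthetics score."""
    return generic_score + predicted_residual

# Generic model under-rates the first image and over-rates the second.
residuals = residual_targets([0.6, 0.8], [0.9, 0.7])
adapted = personalized_score(0.6, residuals[0])
```

Keeping the generic model fixed and learning only the offsets is what lets a small personal training set adapt a model trained on far more data.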
  • Publication number: 20180374199
    Abstract: Embodiments of the present disclosure relate to a sky editing system and related processes for sky editing. The sky editing system includes a composition detector to determine the composition of a target image. A sky search engine in the sky editing system is configured to find a reference image with similar composition with the target image. Subsequently, a sky editor replaces content of the sky in the target image with content of the sky in the reference image. As such, the sky editing system transforms the target image into a new image with a preferred sky background.
    Type: Application
    Filed: August 31, 2018
    Publication date: December 27, 2018
    Inventors: Xiaohui Shen, Yi-Hsuan Tsai, Kalyan K. Sunkavalli, Zhe Lin
  • Publication number: 20180357803
    Abstract: Embodiments of the present invention are directed to facilitating region of interest preservation. In accordance with some embodiments of the present invention, a region of interest preservation score using adaptive margins is determined. The region of interest preservation score indicates an extent to which at least one region of interest is preserved in a candidate image crop associated with an image. A region of interest positioning score is determined that indicates an extent to which a position of the at least one region of interest is preserved in the candidate image crop associated with the image. The region of interest preservation score and/or the positioning score are used to select a set of one or more candidate image crops as image crop suggestions.
    Type: Application
    Filed: June 12, 2017
    Publication date: December 13, 2018
    Inventors: Jianming Zhang, Zhe Lin, Radomir Mech, Xiaohui Shen
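The core of a preservation score is the fraction of the region of interest's area a candidate crop retains. The sketch below shows that area-ratio core and a ranking step; the patent's adaptive margins and the separate positioning score are omitted, and both function names are mine.

```python
def roi_preservation(roi, crop):
    """Fraction of the ROI's area retained by a candidate crop.
    Boxes are (x0, y0, x1, y1)."""
    ix0, iy0 = max(roi[0], crop[0]), max(roi[1], crop[1])
    ix1, iy1 = min(roi[2], crop[2]), min(roi[3], crop[3])
    iw, ih = max(0, ix1 - ix0), max(0, iy1 - iy0)
    roi_area = (roi[2] - roi[0]) * (roi[3] - roi[1])
    return (iw * ih) / roi_area

def best_crop(roi, candidates):
    """Rank candidate crops by how well they preserve the ROI."""
    return max(candidates, key=lambda c: roi_preservation(roi, c))

roi = (2, 2, 6, 6)
full = roi_preservation(roi, (0, 0, 8, 8))   # crop contains the ROI
half = roi_preservation(roi, (0, 0, 4, 8))   # crop cuts the ROI in half
chosen = best_crop(roi, [(0, 0, 4, 8), (0, 0, 8, 8)])
```

A real ranker would combine this with the positioning score (and aesthetic criteria) rather than use area retention alone.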
  • Publication number: 20180336401
    Abstract: Methods and systems for recognizing people in images with increased accuracy are disclosed. In particular, the methods and systems divide images into a plurality of clusters based on common characteristics of the images. The methods and systems also determine an image cluster to which an image with an unknown person instance most corresponds. One or more embodiments determine a probability that the unknown person instance is each known person instance in the image cluster using a trained cluster classifier of the image cluster. Optionally, the methods and systems determine context weights for each combination of an unknown person instance and each known person instance using a conditional random field algorithm based on a plurality of context cues associated with the unknown person instance and the known person instances. The methods and systems calculate a contextual probability based on the cluster-based probabilities and context weights to identify the unknown person instance.
    Type: Application
    Filed: July 30, 2018
    Publication date: November 22, 2018
    Inventors: Jonathan Brandt, Zhe Lin, Xiaohui Shen, Haoxiang Li
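The final step of the method above fuses the cluster classifier's per-identity probabilities with the context weights from the conditional random field. One simple reading of that fusion (the function name and the elementwise-product rule are my illustrative choices) is a reweighted, renormalised distribution:

```python
import numpy as np

def contextual_probability(cluster_probs, context_weights):
    """Combine a classifier's per-identity probabilities with context
    weights (e.g. from a conditional random field) by elementwise
    product, renormalising so the result is again a distribution."""
    p = (np.asarray(cluster_probs, dtype=float)
         * np.asarray(context_weights, dtype=float))
    return p / p.sum()

# Classifier favours identity 0, but context cues favour identity 1.
combined = contextual_probability([0.6, 0.4], [0.5, 1.0])
```

Context can thus overturn the classifier's top choice when the cues against it are strong enough, which is the point of the contextual probability.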