Patents by Inventor Oliver Wang
Oliver Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220182588
Abstract: Systems and techniques for automatic digital parameter adjustment are described that leverage insights learned from an image set to automatically predict parameter values for an input item of digital visual content. To do so, the automatic digital parameter adjustment techniques described herein capture visual and contextual features of digital visual content to determine balanced visual output in a range of visual scenes and settings. The visual and contextual features of digital visual content are used to train, through machine learning, a parameter adjustment model that captures feature patterns and interactions. The parameter adjustment model exploits these feature interactions to determine visually pleasing parameter values for an input item of digital visual content. The predicted parameter values are output, allowing further adjustment to the parameter values.
Type: Application
Filed: November 15, 2021
Publication date: June 9, 2022
Applicant: Adobe Inc.
Inventors: Pulkit Gera, Oliver Wang, Kalyan Krishna Sunkavalli, Elya Shechtman, Chetan Nanda
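To make the idea concrete, here is a minimal sketch of a parameter-adjustment predictor in PyTorch. The feature dimension, parameter set, and layer sizes are invented for illustration and are not taken from the patent.

```python
# A sketch, assuming PyTorch; all sizes and parameter names are illustrative.
import torch
import torch.nn as nn

class ParameterAdjustmentModel(nn.Module):
    """Predicts editing-parameter values (e.g., exposure, contrast,
    saturation) from visual and contextual features of an image."""
    def __init__(self, feature_dim=512, num_params=3):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(feature_dim, 256),
            nn.ReLU(),
            nn.Linear(256, num_params),
            nn.Tanh(),  # predictions normalized to a [-1, 1] slider range
        )

    def forward(self, features):
        return self.head(features)

model = ParameterAdjustmentModel()
visual_features = torch.randn(1, 512)  # stand-in for extracted features
predicted = model(visual_features)     # e.g., [exposure, contrast, saturation]
print(predicted)
```

The predicted values can then be presented to the user as starting points, matching the abstract's note that further adjustment remains possible.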
-
Patent number: 11354906
Abstract: A Video Semantic Segmentation System (VSSS) is disclosed that performs accurate and fast semantic segmentation of videos using a set of temporally distributed neural networks. The VSSS receives as input a video signal comprising a contiguous sequence of temporally-related video frames. The VSSS extracts features from the video frames in the contiguous sequence and, based upon the extracted features, selects, from a set of labels, a label to be associated with each pixel of each video frame in the video signal. In certain embodiments, a set of multiple neural networks is used to extract the features to be used for video segmentation, and the extraction of features is distributed among the multiple neural networks in the set. A strong feature representation representing the entirety of the features is produced for each video frame in the sequence by aggregating the output features extracted by the multiple neural networks.
Type: Grant
Filed: April 13, 2020
Date of Patent: June 7, 2022
Assignee: Adobe Inc.
Inventors: Federico Perazzi, Zhe Lin, Ping Hu, Oliver Wang, Fabian David Caba Heilbron
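A minimal sketch of the temporal-distribution idea follows, assuming PyTorch; the sub-network design, channel counts, and aggregation by concatenation are illustrative stand-ins for whatever the patented system actually uses.

```python
# A sketch, assuming PyTorch; sub-networks and aggregation are illustrative.
import torch
import torch.nn as nn

NUM_SUBNETS, SUBNET_CHANNELS = 4, 16

# Each lightweight sub-network extracts a slice of the full feature set.
subnets = nn.ModuleList(
    nn.Conv2d(3, SUBNET_CHANNELS, kernel_size=3, padding=1)
    for _ in range(NUM_SUBNETS)
)

frames = torch.randn(NUM_SUBNETS, 3, 64, 64)  # contiguous frames t-3 .. t

# Distribute extraction: sub-network i runs on frame i of the window,
# then aggregate into one strong representation for the current frame.
features = [net(frames[i:i + 1]) for i, net in enumerate(subnets)]
strong_features = torch.cat(features, dim=1)  # (1, 64, 64, 64)

# Per-pixel label selection from a hypothetical set of 21 labels.
logits = nn.Conv2d(NUM_SUBNETS * SUBNET_CHANNELS, 21, kernel_size=1)(strong_features)
per_pixel_labels = logits.argmax(dim=1)
print(per_pixel_labels.shape)  # one label per pixel: (1, 64, 64)
```

Because each frame only runs one lightweight sub-network, the per-frame cost stays low while the aggregated representation covers the full feature set.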
-
Patent number: 11328523
Abstract: The present disclosure relates to an image composite system that employs a generative adversarial network to generate realistic composite images. For example, in one or more embodiments, the image composite system trains a geometric prediction neural network using an adversarial discrimination neural network to learn warp parameters that provide correct geometric alignment of foreground objects with respect to a background image. Once trained, the determined warp parameters provide realistic geometric corrections to foreground objects such that the warped foreground objects appear to blend into background images naturally when composited together.
Type: Grant
Filed: June 9, 2020
Date of Patent: May 10, 2022
Assignee: Adobe Inc.
Inventors: Elya Shechtman, Oliver Wang, Mehmet Yumer, Chen-Hsuan Lin
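The sketch below illustrates one plausible shape for such a geometric prediction network in PyTorch: it regresses a 2x3 affine warp from the foreground/background pair and applies it with a differentiable sampler, so an adversarial loss on the composite can train the warp. The architecture and initialization are assumptions, not the patent's implementation.

```python
# A sketch, assuming PyTorch; network sizes and the affine warp are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeometricPredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(8, 6),  # 2x3 affine warp matrix
        )
        # start from the identity warp so early training is stable
        self.net[-1].weight.data.zero_()
        self.net[-1].bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

    def forward(self, foreground, background):
        theta = self.net(torch.cat([foreground, background], dim=1)).view(-1, 2, 3)
        grid = F.affine_grid(theta, foreground.size(), align_corners=False)
        return F.grid_sample(foreground, grid, align_corners=False)

fg, bg = torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64)
warped_fg = GeometricPredictor()(fg, bg)  # composite warped_fg onto bg, then
print(warped_fg.shape)                    # score the composite with a discriminator
```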
-
Publication number: 20220122222
Abstract: An improved system architecture uses a Generative Adversarial Network (GAN) including a specialized generator neural network to generate multiple resolution output images. The system produces a latent space representation of an input image. The system generates a first output image at a first resolution by providing the latent space representation of the input image as input to a generator neural network comprising an input layer, an output layer, and a plurality of intermediate layers, and taking the first output image from an intermediate layer of the plurality of intermediate layers of the generator neural network. The system generates a second output image at a second resolution different from the first resolution by providing the latent space representation of the input image as input to the generator neural network and taking the second output image from the output layer of the generator neural network.
Type: Application
Filed: July 23, 2021
Publication date: April 21, 2022
Inventors: Cameron Smith, Ratheesh Kalarot, Wei-An Lin, Richard Zhang, Niloy Mitra, Elya Shechtman, Shabnam Ghadar, Zhixin Shu, Yannick Hold-Geoffrey, Nathan Carr, Jingwan Lu, Oliver Wang, Jun-Yan Zhu
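A minimal sketch of reading two resolutions from one generator follows, assuming PyTorch; the toy layer stack and the extra to-RGB head on the intermediate layer are assumptions for illustration.

```python
# A sketch, assuming PyTorch; the layer structure is illustrative.
import torch
import torch.nn as nn

class MultiResGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.input_layer = nn.Linear(512, 128 * 8 * 8)
        self.intermediate = nn.Sequential(
            nn.Upsample(scale_factor=2), nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
        )
        self.output_layer = nn.Sequential(
            nn.Upsample(scale_factor=2), nn.Conv2d(64, 3, 3, padding=1),
        )
        self.to_rgb_mid = nn.Conv2d(64, 3, 1)  # read an image off an intermediate layer

    def forward(self, latent):
        x = self.input_layer(latent).view(-1, 128, 8, 8)
        mid = self.intermediate(x)
        # first output from the intermediate layer, second from the output layer
        return self.to_rgb_mid(mid), self.output_layer(mid)

low_res, high_res = MultiResGenerator()(torch.randn(1, 512))
print(low_res.shape, high_res.shape)  # 16x16 and 32x32 outputs from one pass
```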
-
Publication number: 20220122221
Abstract: An improved system architecture uses a pipeline including a Generative Adversarial Network (GAN) comprising a generator neural network and a discriminator neural network to generate an image. An input image in a first domain and information about a target domain are obtained. The domains correspond to image styles. An initial latent space representation of the input image is produced by encoding the input image. An initial output image is generated by processing the initial latent space representation with the generator neural network. Using the discriminator neural network, a score is computed indicating whether the initial output image is in the target domain. A loss is computed based on the computed score. The loss is minimized to compute an updated latent space representation. The updated latent space representation is processed with the generator neural network to generate an output image in the target domain.
Type: Application
Filed: July 23, 2021
Publication date: April 21, 2022
Inventors: Cameron Smith, Ratheesh Kalarot, Wei-An Lin, Richard Zhang, Niloy Mitra, Elya Shechtman, Shabnam Ghadar, Zhixin Shu, Yannick Hold-Geoffrey, Nathan Carr, Jingwan Lu, Oliver Wang, Jun-Yan Zhu
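The latent-optimization loop the abstract describes can be sketched in a few lines of PyTorch. The linear "generator" and "discriminator" below are stand-ins for pretrained networks; the loss choice and step count are assumptions.

```python
# A sketch, assuming PyTorch; G and D are stand-ins for pretrained networks.
import torch
import torch.nn.functional as F

generator = torch.nn.Linear(64, 3 * 32 * 32)     # stand-in pretrained G
discriminator = torch.nn.Linear(3 * 32 * 32, 1)  # scores target-domain membership

latent = torch.randn(1, 64, requires_grad=True)  # initial encoding of the input
optimizer = torch.optim.Adam([latent], lr=0.01)

for step in range(100):
    optimizer.zero_grad()
    image = generator(latent)
    score = discriminator(image)
    # loss is low when D believes the image is in the target domain
    loss = F.binary_cross_entropy_with_logits(score, torch.ones_like(score))
    loss.backward()
    optimizer.step()  # only the latent is updated; G and D stay frozen

output_image = generator(latent).view(1, 3, 32, 32)  # image in the target domain
```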
-
Publication number: 20220122305
Abstract: An improved system architecture uses a pipeline including an encoder and a Generative Adversarial Network (GAN) including a generator neural network to generate edited images with improved speed, realism, and identity preservation. The encoder produces an initial latent space representation of an input image by encoding the input image. The generator neural network generates an initial output image by processing the initial latent space representation of the input image. The system generates an optimized latent space representation of the input image using a loss minimization technique that minimizes a loss between the input image and the initial output image. The loss is based on target perceptual features extracted from the input image and initial perceptual features extracted from the initial output image. The system outputs the optimized latent space representation of the input image for downstream use.
Type: Application
Filed: July 23, 2021
Publication date: April 21, 2022
Inventors: Cameron Smith, Ratheesh Kalarot, Wei-An Lin, Richard Zhang, Niloy Mitra, Elya Shechtman, Shabnam Ghadar, Zhixin Shu, Yannick Hold-Geoffrey, Nathan Carr, Jingwan Lu, Oliver Wang, Jun-Yan Zhu
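This is a close cousin of the preceding entry, but with the loss defined over perceptual features rather than a discriminator score. A minimal sketch, assuming PyTorch; the feature extractor below is a stand-in for whatever perceptual network the real system uses.

```python
# A sketch, assuming PyTorch; G and the perceptual net are stand-ins.
import torch

generator = torch.nn.Linear(64, 3 * 32 * 32)            # stand-in pretrained G
feature_extractor = torch.nn.Linear(3 * 32 * 32, 128)   # stand-in perceptual net

input_image = torch.randn(1, 3 * 32 * 32)
target_features = feature_extractor(input_image).detach()  # from the input image

latent = torch.randn(1, 64, requires_grad=True)  # the encoder's initial estimate
optimizer = torch.optim.Adam([latent], lr=0.01)

for step in range(100):
    optimizer.zero_grad()
    initial_features = feature_extractor(generator(latent))
    # match perceptual features of the generated image to the input's,
    # which is what preserves identity
    loss = torch.nn.functional.mse_loss(initial_features, target_features)
    loss.backward()
    optimizer.step()

# `latent` is now the optimized latent space representation for downstream edits
```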
-
Publication number: 20220114365
Abstract: Methods and systems are provided for facilitating large-scale augmented reality in relation to outdoor scenes using estimated camera pose information. In particular, camera pose information for an image can be estimated by matching the image to a rendered ground-truth terrain model with known camera pose information. To match images with such renders, a data-driven cross-domain feature embedding can be learned using a neural network. Cross-domain feature descriptors can be used for efficient and accurate feature matching between the image and the terrain model renders. This feature matching allows images to be localized in relation to the terrain model, which has known camera pose information. This known camera pose information can then be used to estimate camera pose information in relation to the image.
Type: Application
Filed: October 12, 2020
Publication date: April 14, 2022
Inventors: Michal Lukác, Oliver Wang, Jan Brejcha, Yannick Hold-Geoffroy, Martin Cadík
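A minimal sketch of cross-domain descriptor matching follows, assuming PyTorch; the two toy encoders and the nearest-neighbor matching rule are assumptions standing in for the learned embedding the abstract describes.

```python
# A sketch, assuming PyTorch; encoders and matching rule are illustrative.
import torch
import torch.nn as nn

photo_encoder = nn.Conv2d(3, 32, 8, stride=8)   # embeds photo patches
render_encoder = nn.Conv2d(3, 32, 8, stride=8)  # embeds terrain-render patches

photo = torch.randn(1, 3, 64, 64)
render = torch.randn(1, 3, 64, 64)              # render has a known camera pose

# descriptors live in a shared embedding space despite the domain gap
d_photo = photo_encoder(photo).flatten(2).squeeze(0).T    # (64, 32)
d_render = render_encoder(render).flatten(2).squeeze(0).T

# match each photo descriptor to its nearest render descriptor; the matches
# localize the photo against the terrain model, whose pose is known
matches = torch.cdist(d_photo, d_render).argmin(dim=1)
print(matches.shape)  # one render location per photo location
```

In training, the two encoders would be pulled toward producing matching descriptors for corresponding photo/render locations, which is what makes the cross-domain nearest-neighbor lookup meaningful.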
-
Publication number: 20220101476
Abstract: The technology described herein is directed to a cross-domain training framework that iteratively trains a domain adaptive refinement agent to refine low quality real-world image acquisition data, e.g., depth maps, when accompanied by corresponding conditional data from other modalities, such as the underlying images or video from which the image acquisition data is computed. The cross-domain training framework includes a shared cross-domain encoder and two conditional decoder branch networks, e.g., a synthetic conditional depth prediction branch network and a real conditional depth prediction branch network. The shared cross-domain encoder converts synthetic and real-world image acquisition data into synthetic and real compact feature representations, respectively.
Type: Application
Filed: September 28, 2020
Publication date: March 31, 2022
Inventors: Oliver Wang, Jianming Zhang, Dingzeyu Li, Zekun Hao
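The shared-encoder / two-branch layout can be sketched directly, assuming PyTorch; module sizes and the domain-dispatch logic are illustrative.

```python
# A sketch, assuming PyTorch; module sizes are illustrative.
import torch
import torch.nn as nn

shared_encoder = nn.Conv2d(4, 32, 3, padding=1)    # depth + conditioning image
synthetic_branch = nn.Conv2d(32, 1, 3, padding=1)  # trained on synthetic pairs
real_branch = nn.Conv2d(32, 1, 3, padding=1)       # trained on real-world data

def refine(depth, image, domain):
    """Refine a noisy depth map conditioned on its source image; both
    domains share one encoder, then split into conditional branches."""
    features = shared_encoder(torch.cat([depth, image], dim=1))
    branch = synthetic_branch if domain == "synthetic" else real_branch
    return branch(features)

noisy_depth = torch.randn(1, 1, 64, 64)
rgb = torch.randn(1, 3, 64, 64)
refined = refine(noisy_depth, rgb, domain="real")
print(refined.shape)
```

The shared encoder is what transfers knowledge across domains: supervision available only for synthetic data still shapes the features the real branch consumes.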
-
Publication number: 20220060671
Abstract: This disclosure relates to methods, non-transitory computer readable media, and systems that generate and dynamically change filter parameters for a frame of a 360-degree video based on detecting a field of view from a computing device. As a computing device rotates or otherwise changes orientation, for instance, the disclosed systems can detect a field of view and interpolate one or more filter parameters corresponding to nearby spatial keyframes of the 360-degree video to generate view-specific-filter parameters. By generating and storing filter parameters for spatial keyframes corresponding to different times and different view directions, the disclosed systems can dynamically adjust color grading or other visual effects using interpolated, view-specific-filter parameters to render a filtered version of the 360-degree video.
Type: Application
Filed: November 4, 2021
Publication date: February 24, 2022
Inventors: Stephen DiVerdi, Seth Walker, Oliver Wang, Cuong Nguyen
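A minimal sketch of view-dependent parameter interpolation follows, in NumPy; the keyframe layout and the inverse-angle weighting are assumptions, chosen only to show how nearby spatial keyframes can blend into view-specific parameters.

```python
# A sketch, assuming NumPy; weighting scheme and keyframes are illustrative.
import numpy as np

# spatial keyframes: a view direction (unit vector) plus filter parameters
keyframes = [
    {"direction": np.array([1.0, 0.0, 0.0]), "params": {"saturation": 1.2}},
    {"direction": np.array([0.0, 0.0, 1.0]), "params": {"saturation": 0.8}},
]

def view_specific_params(view_direction):
    """Interpolate filter parameters from keyframes near the current view."""
    angles = [np.arccos(np.clip(np.dot(view_direction, k["direction"]), -1, 1))
              for k in keyframes]
    weights = 1.0 / (np.array(angles) + 1e-6)  # nearer keyframes dominate
    weights /= weights.sum()
    return {"saturation": sum(w * k["params"]["saturation"]
                              for w, k in zip(weights, keyframes))}

# as the device rotates, re-interpolate and re-render with the new params
current_view = np.array([0.7071, 0.0, 0.7071])  # halfway between keyframes
print(view_specific_params(current_view))       # ~{'saturation': 1.0}
```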
-
Patent number: 11257298
Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for reconstructing three-dimensional meshes from two-dimensional images of objects with automatic coordinate system alignment. For example, the disclosed system can generate feature vectors for a plurality of images having different views of an object. The disclosed system can process the feature vectors to generate coordinate-aligned feature vectors aligned with a coordinate system associated with an image. The disclosed system can generate a combined feature vector from the feature vectors aligned to the coordinate system. Additionally, the disclosed system can then generate a three-dimensional mesh representing the object from the combined feature vector.
Type: Grant
Filed: March 18, 2020
Date of Patent: February 22, 2022
Assignee: Adobe Inc.
Inventors: Vladimir Kim, Pierre-alain Langlois, Oliver Wang, Matthew Fisher, Bryan Russell
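The encode, align, combine, decode pipeline can be sketched end to end, assuming PyTorch; the pose conditioning via concatenation, mean pooling, and the toy mesh decoder are assumptions for illustration.

```python
# A sketch, assuming PyTorch; alignment transform and decoder are stand-ins.
import torch
import torch.nn as nn

image_encoder = nn.Linear(3 * 32 * 32, 64)  # per-view feature extractor
align = nn.Linear(64 + 4, 64)               # maps features into one coordinate system
mesh_decoder = nn.Linear(64, 100 * 3)       # predicts 100 vertex positions

views = torch.randn(3, 3 * 32 * 32)         # three images of the same object
poses = torch.randn(3, 4)                   # relative camera rotations (quaternions)

features = image_encoder(views)
# condition each feature vector on its camera pose so every view is
# expressed in the coordinate system of one reference image
aligned = align(torch.cat([features, poses], dim=1))
combined = aligned.mean(dim=0)              # pool the coordinate-aligned features
vertices = mesh_decoder(combined).view(100, 3)
print(vertices.shape)                       # vertex positions of the output mesh
```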
-
Patent number: 11244204
Abstract: In implementations of determining video cuts in video clips, a video cut detection system can receive a video clip that includes a sequence of digital video frames that depict one or more scenes. The video cut detection system can determine scene characteristics for the digital video frames. The video cut detection system can determine, from the scene characteristics, a probability of a video cut between two adjacent digital video frames whose shared boundary is centered in the sequence of digital video frames. The video cut detection system can then compare the probability of the video cut to a cut threshold to determine whether the video cut exists between the two adjacent digital video frames.
Type: Grant
Filed: May 20, 2020
Date of Patent: February 8, 2022
Assignee: Adobe Inc.
Inventors: Oliver Wang, Nico Alexander Becherer, Markus Woodson, Federico Perazzi, Nikhil Kalra
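A minimal sketch of centered-boundary cut scoring follows, assuming PyTorch; the scoring network, window size, and threshold are assumptions.

```python
# A sketch, assuming PyTorch; scorer, window, and threshold are illustrative.
import torch
import torch.nn as nn

WINDOW = 8            # frames; the candidate cut sits between frames 3 and 4
CUT_THRESHOLD = 0.5

scorer = nn.Sequential(  # maps a frame window to a cut probability
    nn.Flatten(),
    nn.Linear(WINDOW * 3 * 16 * 16, 1),
    nn.Sigmoid(),
)

def find_cuts(frames):
    """Slide a window over the clip; report boundaries whose centered
    cut probability exceeds the threshold."""
    cuts = []
    for start in range(len(frames) - WINDOW + 1):
        window = frames[start:start + WINDOW].unsqueeze(0)
        p_cut = scorer(window).item()
        if p_cut > CUT_THRESHOLD:
            cuts.append(start + WINDOW // 2)  # frame index after the boundary
    return cuts

clip = torch.randn(30, 3, 16, 16)  # a 30-frame video clip
print(find_cuts(clip))
```

Centering the candidate boundary in the window gives the scorer symmetric temporal context on both sides of the potential cut.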
-
Patent number: 11189094
Abstract: Techniques are disclosed for 3D object reconstruction using photometric mesh representations. A decoder is pretrained to transform points sampled from 2D patches of representative objects into 3D polygonal meshes. An image frame of the object is fed into an encoder to get an initial latent code vector. For each frame and camera pair from the sequence, a polygonal mesh is rendered at the given viewpoints. The mesh is optimized by creating a virtual viewpoint and rasterizing the mesh to obtain a depth map. The 3D mesh projections are aligned by projecting the coordinates corresponding to the polygonal face vertices of the rasterized mesh to both selected viewpoints. The photometric error is determined from RGB pixel intensities sampled from both frames. Gradients from the photometric error are backpropagated into the vertices of the assigned polygonal indices by relating the barycentric coordinates of each image to update the latent code vector.
Type: Grant
Filed: August 5, 2020
Date of Patent: November 30, 2021
Assignee: Adobe, Inc.
Inventors: Oliver Wang, Vladimir Kim, Matthew Fisher, Elya Shechtman, Chen-Hsuan Lin, Bryan Russell
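The core of this technique, a differentiable photometric error between two frames, can be sketched in isolation. In the sketch below, assuming PyTorch, the full camera projection is reduced to precomputed normalized 2D vertex coordinates, so the gradient path through bilinear sampling is the only part shown.

```python
# A sketch, assuming PyTorch; projection is reduced to given 2D coordinates.
import torch
import torch.nn.functional as F

def sample_colors(frame, coords):
    """Bilinearly sample RGB at normalized (-1..1) 2D coordinates."""
    grid = coords.view(1, -1, 1, 2)  # (1, V, 1, 2)
    return F.grid_sample(frame, grid, align_corners=False).squeeze(-1)

frame_a = torch.rand(1, 3, 64, 64)
frame_b = torch.rand(1, 3, 64, 64)
# stand-ins for mesh vertices projected into both frames
coords = (torch.rand(100, 2) * 2 - 1).requires_grad_()

# photometric error: the same surface point should look the same in both
# frames; gradients flow back through the sampling into the vertices
loss = F.mse_loss(sample_colors(frame_a, coords), sample_colors(frame_b, coords))
loss.backward()
print(coords.grad.shape)  # (100, 2): per-vertex gradient for the optimizer
```

In the patented pipeline these gradients would continue back through the projection and the mesh decoder into the latent code vector, rather than stopping at the 2D coordinates as in this reduced sketch.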
-
Publication number: 20210365742
Abstract: In implementations of determining video cuts in video clips, a video cut detection system can receive a video clip that includes a sequence of digital video frames that depict one or more scenes. The video cut detection system can determine scene characteristics for the digital video frames. The video cut detection system can determine, from the scene characteristics, a probability of a video cut between two adjacent digital video frames whose shared boundary is centered in the sequence of digital video frames. The video cut detection system can then compare the probability of the video cut to a cut threshold to determine whether the video cut exists between the two adjacent digital video frames.
Type: Application
Filed: May 20, 2020
Publication date: November 25, 2021
Applicant: Adobe Inc.
Inventors: Oliver Wang, Nico Alexander Becherer, Markus Woodson, Federico Perazzi, Nikhil Kalra
-
Publication number: 20210358177
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for generating a modified digital image from extracted spatial and global codes. For example, the disclosed systems can utilize a global and spatial autoencoder to extract spatial codes and global codes from digital images. The disclosed systems can further utilize the global and spatial autoencoder to generate a modified digital image by combining extracted spatial and global codes in various ways for various applications such as style swapping, style blending, and attribute editing.
Type: Application
Filed: May 14, 2020
Publication date: November 18, 2021
Inventors: Taesung Park, Richard Zhang, Oliver Wang, Junyan Zhu, Jingwan Lu, Elya Shechtman, Alexei A Efros
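The style-swapping application can be sketched with a toy autoencoder that splits an image into a spatial (structure) map and a global (style) vector, assuming PyTorch; the split, sizes, and decoder are assumptions for illustration.

```python
# A sketch, assuming PyTorch; the code split and module sizes are illustrative.
import torch
import torch.nn as nn

class SwappingAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.spatial_enc = nn.Conv2d(3, 8, 4, stride=4)  # structure: (8, 16, 16) map
        self.global_enc = nn.Sequential(                 # style: one vector
            nn.Conv2d(3, 8, 4, stride=4), nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.decoder = nn.ConvTranspose2d(16, 3, 4, stride=4)

    def decode(self, spatial, global_code):
        # broadcast the global code over the spatial grid, then decode
        g = global_code.view(-1, 8, 1, 1).expand(-1, 8, 16, 16)
        return self.decoder(torch.cat([spatial, g], dim=1))

model = SwappingAutoencoder()
a, b = torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64)
# style swap: the structure (spatial code) of A rendered with the style
# (global code) of B; blending would interpolate the global codes instead
swapped = model.decode(model.spatial_enc(a), model.global_enc(b))
print(swapped.shape)  # (1, 3, 64, 64)
```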
-
Patent number: 11178368
Abstract: Systems and techniques for automatic digital parameter adjustment are described that leverage insights learned from an image set to automatically predict parameter values for an input item of digital visual content. To do so, the automatic digital parameter adjustment techniques described herein capture visual and contextual features of digital visual content to determine balanced visual output in a range of visual scenes and settings. The visual and contextual features of digital visual content are used to train, through machine learning, a parameter adjustment model that captures feature patterns and interactions. The parameter adjustment model exploits these feature interactions to determine visually pleasing parameter values for an input item of digital visual content. The predicted parameter values are output, allowing further adjustment to the parameter values.
Type: Grant
Filed: November 26, 2019
Date of Patent: November 16, 2021
Assignee: Adobe Inc.
Inventors: Pulkit Gera, Oliver Wang, Kalyan Krishna Sunkavalli, Elya Shechtman, Chetan Nanda
-
Patent number: 11178374
Abstract: This disclosure relates to methods, non-transitory computer readable media, and systems that generate and dynamically change filter parameters for a frame of a 360-degree video based on detecting a field of view from a computing device. As a computing device rotates or otherwise changes orientation, for instance, the disclosed systems can detect a field of view and interpolate one or more filter parameters corresponding to nearby spatial keyframes of the 360-degree video to generate view-specific-filter parameters. By generating and storing filter parameters for spatial keyframes corresponding to different times and different view directions, the disclosed systems can dynamically adjust color grading or other visual effects using interpolated, view-specific-filter parameters to render a filtered version of the 360-degree video.
Type: Grant
Filed: May 31, 2019
Date of Patent: November 16, 2021
Assignee: ADOBE INC.
Inventors: Stephen DiVerdi, Seth Walker, Oliver Wang, Cuong Nguyen
-
Patent number: 11158090
Abstract: This disclosure involves training generative adversarial networks to shot-match two unmatched images in a context-sensitive manner. For example, aspects of the present disclosure include accessing a trained generative adversarial network including a trained generator model and a trained discriminator model. A source image and a reference image may be inputted into the generator model to generate a modified source image. The modified source image and the reference image may be inputted into the discriminator model to determine a likelihood that the modified source image is color-matched with the reference image. The modified source image may be outputted as a shot-match with the reference image in response to determining, using the discriminator model, that the modified source image and the reference image are color-matched.
Type: Grant
Filed: November 22, 2019
Date of Patent: October 26, 2021
Assignee: Adobe Inc.
Inventors: Tharun Mohandoss, Pulkit Gera, Oliver Wang, Kartik Sethi, Kalyan Sunkavalli, Elya Shechtman, Chetan Nanda
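Shot-match inference per this abstract is a two-step pass, generate then verify, which can be sketched as follows, assuming PyTorch; both networks below are untrained stand-ins and the 0.5 acceptance cutoff is an assumption.

```python
# A sketch, assuming PyTorch; both networks are stand-ins for trained models.
import torch
import torch.nn as nn

generator = nn.Conv2d(6, 3, 1)     # maps (source, reference) to a modified source
discriminator = nn.Sequential(     # scores whether a pair is color-matched
    nn.Conv2d(6, 1, 1), nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Sigmoid(),
)

source = torch.rand(1, 3, 64, 64)
reference = torch.rand(1, 3, 64, 64)

modified = generator(torch.cat([source, reference], dim=1))
match_likelihood = discriminator(torch.cat([modified, reference], dim=1))
if match_likelihood.item() > 0.5:  # output only if D judges them color-matched
    print("modified source accepted as a shot-match")
```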
-
Publication number: 20210319232
Abstract: A Video Semantic Segmentation System (VSSS) is disclosed that performs accurate and fast semantic segmentation of videos using a set of temporally distributed neural networks. The VSSS receives as input a video signal comprising a contiguous sequence of temporally-related video frames. The VSSS extracts features from the video frames in the contiguous sequence and, based upon the extracted features, selects, from a set of labels, a label to be associated with each pixel of each video frame in the video signal. In certain embodiments, a set of multiple neural networks is used to extract the features to be used for video segmentation, and the extraction of features is distributed among the multiple neural networks in the set. A strong feature representation representing the entirety of the features is produced for each video frame in the sequence by aggregating the output features extracted by the multiple neural networks.
Type: Application
Filed: April 13, 2020
Publication date: October 14, 2021
Inventors: Federico Perazzi, Zhe Lin, Ping Hu, Oliver Wang, Fabian David Caba Heilbron
-
Publication number: 20210304799
Abstract: Certain embodiments involve transcript-based techniques for facilitating insertion of secondary video content into primary video content. For instance, a video editor presents a video editing interface having a primary video section displaying a primary video, a text-based navigation section having navigable portions of a primary video transcript, and a secondary video menu section displaying candidate secondary videos. In some embodiments, candidate secondary videos are obtained by using target terms detected in the transcript to query a remote data source for the candidate secondary videos. In embodiments involving video insertion, the video editor identifies a portion of the primary video corresponding to a portion of the transcript selected within the text-based navigation section. The video editor inserts a secondary video, which is selected from the candidate secondary videos based on an input received at the secondary video menu section, at the identified portion of the primary video.
Type: Application
Filed: June 11, 2021
Publication date: September 30, 2021
Inventors: Bernd Huber, Bryan Russell, Gautham Mysore, Hijung Valentina Shin, Oliver Wang
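The transcript-driven lookup step can be sketched in plain Python; the target-term list, transcript shape, and the stock_search endpoint mentioned in the comment are hypothetical.

```python
# A sketch; term list, transcript format, and the query endpoint are assumptions.
import re

TARGET_TERMS = {"beach", "sunset", "surfing"}  # terms worth illustrating

transcript = [
    {"start": 0.0, "end": 4.2, "text": "We arrived at the beach around noon."},
    {"start": 4.2, "end": 9.0, "text": "Everyone wanted to try surfing."},
]

def detect_target_terms(segments):
    """Find transcript segments mentioning target terms, with timestamps,
    so matching secondary clips can be offered at those points."""
    hits = []
    for seg in segments:
        words = set(re.findall(r"[a-z]+", seg["text"].lower()))
        for term in words & TARGET_TERMS:
            hits.append({"term": term, "insert_at": seg["start"]})
    return hits

for hit in detect_target_terms(transcript):
    # a real editor would query a remote data source for candidate clips,
    # e.g. a hypothetical stock_search(hit["term"]), and show the results
    # in the secondary video menu section
    print(f"query candidates for {hit['term']!r}, insert near {hit['insert_at']}s")
```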
-
Publication number: 20210295606
Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for reconstructing three-dimensional meshes from two-dimensional images of objects with automatic coordinate system alignment. For example, the disclosed system can generate feature vectors for a plurality of images having different views of an object. The disclosed system can process the feature vectors to generate coordinate-aligned feature vectors aligned with a coordinate system associated with an image. The disclosed system can generate a combined feature vector from the feature vectors aligned to the coordinate system. Additionally, the disclosed system can then generate a three-dimensional mesh representing the object from the combined feature vector.
Type: Application
Filed: March 18, 2020
Publication date: September 23, 2021
Inventors: Vladimir Kim, Pierre-alain Langlois, Oliver Wang, Matthew Fisher, Bryan Russell