Patents by Inventor Oliver Wang
Oliver Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250124212
Abstract: In implementations of techniques for vector font generation based on cascaded diffusion, a computing device implements a glyph generation system to receive a sample glyph in a target font and a target glyph identifier. The glyph generation system generates a rasterized glyph in the target font using a raster diffusion model based on the sample glyph and the target glyph identifier, the rasterized glyph having a first level of resolution. The glyph generation system then generates a vector glyph using a vector diffusion model by vectorizing the rasterized glyph, the vector glyph having a second level of resolution different than the first level of resolution. The glyph generation system then displays the vector glyph in a user interface.
Type: Application
Filed: November 13, 2023
Publication date: April 17, 2025
Applicant: Adobe Inc.
Inventors: Difan Liu, Matthew David Fisher, Michaël Yanis Gharbi, Oliver Wang, Alec Stefan Jacobson, Vikas Thamizharasan, Evangelos Kalogerakis
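The two-stage cascade described above can be sketched end to end. This is a toy illustration only: `raster_stage` and `vector_stage` below are hypothetical stand-ins for the trained raster and vector diffusion models, not the patented implementation.

```python
import numpy as np

def raster_stage(sample_glyph, glyph_id, size=16, steps=4, seed=0):
    """Stand-in for the raster diffusion model: iteratively "denoise"
    toward a target raster conditioned on (sample_glyph, glyph_id)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(size, size))          # start from pure noise
    target = np.roll(sample_glyph, glyph_id)   # toy conditioning signal
    for _ in range(steps):
        x = 0.5 * x + 0.5 * target            # toy denoising update
    return x

def vector_stage(raster, n_points=8):
    """Stand-in for the vector diffusion model: fit a coarse outline
    (control points) to the rasterized glyph."""
    ys, xs = np.nonzero(raster > raster.mean())
    idx = np.linspace(0, len(xs) - 1, n_points).astype(int)
    return np.stack([xs[idx], ys[idx]], axis=1).astype(float)

sample = np.zeros((16, 16)); sample[4:12, 4:12] = 1.0  # toy sample glyph
raster = raster_stage(sample, glyph_id=3)   # first (raster) resolution
vector = vector_stage(raster)               # vector output, resolution-free
```

The key structural point survives the simplification: the raster stage fixes the glyph's appearance at one resolution, and the vector stage converts it into a representation at a different effective resolution.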
-
Patent number: 12236640
Abstract: Systems and methods for image dense field based view calibration are provided. In one embodiment, an input image is applied to a dense field machine learning model that generates a vertical vector dense field (VVF) and a latitude dense field (LDF) from the input image. The VVF comprises a vertical vector of a projected vanishing point direction for each of the pixels of the input image. The LDF comprises a projected latitude value for each of the pixels of the input image. A dense field map for the input image comprising the VVF and the LDF can be used directly or indirectly for a variety of image processing manipulations. The VVF and LDF can optionally be used to derive traditional camera calibration parameters from uncontrolled images that have undergone undocumented or unknown manipulations.
Type: Grant
Filed: March 28, 2022
Date of Patent: February 25, 2025
Assignee: Adobe Inc.
Inventors: Jianming Zhang, Linyi Jin, Kevin Matzen, Oliver Wang, Yannick Hold-Geoffroy
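The two fields themselves are easy to visualize by constructing them analytically for a toy pinhole camera (the trained model predicts them from pixels instead). All names and camera assumptions below are hypothetical, chosen only to show what a per-pixel VVF and LDF look like.

```python
import numpy as np

def dense_fields(h, w, focal, horizon_y):
    """Build a toy VVF (unit 2D direction toward the vertical vanishing
    point) and LDF (projected latitude angle) for every pixel."""
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    # Assume the vertical vanishing point lies far below the image center.
    vvp = np.array([w / 2.0, horizon_y + 10.0 * focal])
    d = np.stack([vvp[0] - xs, vvp[1] - ys], axis=-1)
    vvf = d / np.linalg.norm(d, axis=-1, keepdims=True)   # unit vectors
    # Projected latitude: angle of each pixel ray above/below the horizon.
    ldf = np.arctan2(horizon_y - ys, focal)
    return vvf, ldf

vvf, ldf = dense_fields(h=8, w=8, focal=10.0, horizon_y=4.0)
```

Because both fields are defined per pixel, they degrade gracefully under cropping or warping, which is why the abstract notes they can recover calibration from images with unknown manipulations.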
-
Publication number: 20250061647
Abstract: A scene modeling system accesses a set of input two-dimensional (2D) images of a three-dimensional (3D) environment, wherein the input 2D images are captured from a plurality of camera orientations. The environment includes first content. The scene modeling system applies a scene generation model to the set of input 2D images to generate a 3D remix scene. Applying the scene generation model includes configuring the scene generation model using at least a 2D discriminator and a 3D discriminator. Applying the scene generation model includes transmitting, for display via a user interface, the 3D remix scene. The 3D remix scene includes second content that is different from the first content.
Type: Application
Filed: August 14, 2023
Publication date: February 20, 2025
Inventors: Oliver Wang, Animesh Karnewar, Tobias Ritschel, Niloy Mitra
-
Publication number: 20240320872
Abstract: A method, apparatus, non-transitory computer readable medium, and system for image generation include obtaining a text embedding of a text prompt and an image embedding of an image prompt. Some embodiments map the text embedding into a joint embedding space to obtain a joint text embedding and map the image embedding into the joint embedding space to obtain a joint image embedding. Some embodiments generate a synthetic image based on the joint text embedding and the joint image embedding.
Type: Application
Filed: January 30, 2024
Publication date: September 26, 2024
Inventors: Tobias Hinz, Venkata Naveen Kumar Yadav Marri, Midhun Harikumar, Ajinkya Gorakhnath Kale, Zhe Lin, Oliver Wang, Jingwan Lu
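The joint-embedding step can be sketched with untrained linear projections. Everything here is a hypothetical stand-in: the real text/image encoders, projection maps, and generator are learned models, not the random matrices below.

```python
import numpy as np

rng = np.random.default_rng(0)
D_TEXT, D_IMG, D_JOINT = 12, 16, 8
W_text = rng.normal(size=(D_TEXT, D_JOINT))   # toy text -> joint map
W_img = rng.normal(size=(D_IMG, D_JOINT))     # toy image -> joint map

def to_joint(text_emb, img_emb):
    jt = text_emb @ W_text            # joint text embedding
    ji = img_emb @ W_img              # joint image embedding
    return jt, ji

def generate(jt, ji, size=4):
    cond = jt + ji                    # fuse the two conditions
    # Stand-in "generator": deterministic image from the fused condition.
    return np.outer(np.tanh(cond[:size]), np.tanh(cond[:size]))

jt, ji = to_joint(rng.normal(size=D_TEXT), rng.normal(size=D_IMG))
img = generate(jt, ji)
```

The design point is that once both prompts live in one space, the generator can be conditioned on either prompt alone or any mixture of the two.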
-
Publication number: 20240320873
Abstract: A method, apparatus, non-transitory computer readable medium, and system for image generation include obtaining a text prompt and encoding, using a text encoder jointly trained with an image generation model, the text prompt to obtain a text embedding. Some embodiments generate, using the image generation model, a synthetic image based on the text embedding.
Type: Application
Filed: February 12, 2024
Publication date: September 26, 2024
Inventors: Tobias Hinz, Ali Aminian, Hao Tan, Kushal Kafle, Oliver Wang, Jingwan Lu
-
Publication number: 20240320789
Abstract: A method, non-transitory computer readable medium, apparatus, and system for image generation include obtaining an input image having a first resolution, where the input image includes random noise, and generating a low-resolution image based on the input image, where the low-resolution image has the first resolution. The method, non-transitory computer readable medium, apparatus, and system further include generating a high-resolution image based on the low-resolution image, where the high-resolution image has a second resolution that is greater than the first resolution.
Type: Application
Filed: February 23, 2024
Publication date: September 26, 2024
Inventors: Tobias Hinz, Taesung Park, Jingwan Lu, Elya Shechtman, Richard Zhang, Oliver Wang
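The shape of this two-stage cascade is simple enough to sketch. Both stages below are toys: `base_stage` stands in for a generative model that maps noise to an image at the same resolution, and `upsample_stage` stands in for the learned super-resolution stage.

```python
import numpy as np

def base_stage(noise):
    # Same resolution in, same resolution out, as the abstract specifies.
    return np.tanh(noise)

def upsample_stage(low, factor=4):
    # Nearest-neighbor upsampling stands in for the super-resolution model.
    return np.repeat(np.repeat(low, factor, axis=0), factor, axis=1)

rng = np.random.default_rng(1)
noise = rng.normal(size=(16, 16))   # input image containing random noise
low = base_stage(noise)             # first resolution: 16x16
high = upsample_stage(low)          # second, greater resolution: 64x64
```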
-
Patent number: 12039657
Abstract: Embodiments of the technology described herein provide view and time synthesis of dynamic scenes captured by a camera. The technology described herein represents a dynamic scene as a continuous function of both space and time. The technology may parameterize this function with a deep neural network (a multi-layer perceptron (MLP)) and perform rendering using volume tracing. At a very high level, a dynamic scene depicted in the video may be used to train the MLP. Once trained, the MLP is able to synthesize a view of the scene at a time and/or camera pose not found in the video through prediction. As used herein, a dynamic scene comprises one or more moving objects.
Type: Grant
Filed: March 17, 2021
Date of Patent: July 16, 2024
Assignee: Adobe Inc.
Inventors: Oliver Wang, Simon Niklaus, Zhengqi Li
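The "continuous function of space and time plus volume tracing" idea can be demonstrated with an analytic toy scene in place of the trained MLP. The scene function, ray, and constants below are all hypothetical illustrations, not the patented method.

```python
import numpy as np

def scene_fn(p, t):
    """Toy radiance field: a sphere whose center moves over time.
    A trained MLP would play this role, mapping (x, y, z, t) to
    (density, color)."""
    center = np.array([0.0, 0.0, 2.0 + 0.5 * t])
    density = 5.0 if np.linalg.norm(p - center) < 0.5 else 0.0
    color = np.array([1.0, 0.5, 0.2])
    return density, color

def render_ray(origin, direction, t, n_samples=32, near=0.5, far=4.0):
    """Standard volume tracing: march along the ray, accumulating color
    weighted by opacity and remaining transmittance."""
    ts = np.linspace(near, far, n_samples)
    delta = ts[1] - ts[0]
    transmittance, rgb = 1.0, np.zeros(3)
    for s in ts:
        density, color = scene_fn(origin + s * direction, t)
        alpha = 1.0 - np.exp(-density * delta)
        rgb += transmittance * alpha * color
        transmittance *= 1.0 - alpha
    return rgb

# Render the same ray at two times: the moving sphere changes the pixel.
ray_o, ray_d = np.zeros(3), np.array([0.0, 0.0, 1.0])
c0 = render_ray(ray_o, ray_d, t=0.0)   # sphere in front of the ray
c1 = render_ray(ray_o, ray_d, t=8.0)   # sphere has moved out of range
```

Training replaces the analytic `scene_fn` with an MLP fit so that rendered rays reproduce the video's frames; novel views and times then come from evaluating the same function at unseen `(pose, t)` inputs.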
-
Publication number: 20240161327
Abstract: Aspects of the methods, apparatus, non-transitory computer readable medium, and systems include obtaining a noise map and a global image code encoded from an original image and representing semantic content of the original image; generating a plurality of image patches based on the noise map and the global image code using a diffusion model; and combining the plurality of image patches to produce an output image including the semantic content.
Type: Application
Filed: November 4, 2022
Publication date: May 16, 2024
Inventors: Yinbo Chen, Michaël Gharbi, Oliver Wang, Richard Zhang, Elya Shechtman
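The patch pipeline has three moving parts: a global code summarizing the image, per-patch generation from the shared noise map plus that code, and tiling the patches back together. The encoder and "diffusion model" below are toy stand-ins used only to show the data flow.

```python
import numpy as np

PATCH, GRID = 4, 2                      # 2x2 grid of 4x4 patches

def encode_global(image):
    return np.array([image.mean(), image.std()])   # toy global code

def generate_patch(noise_patch, code):
    # Toy "diffusion model": denoise toward the code's mean/std statistics.
    return code[0] + code[1] * np.tanh(noise_patch)

def generate_image(noise_map, code):
    rows = []
    for i in range(GRID):
        row = [generate_patch(
            noise_map[i*PATCH:(i+1)*PATCH, j*PATCH:(j+1)*PATCH], code)
            for j in range(GRID)]
        rows.append(np.concatenate(row, axis=1))
    return np.concatenate(rows, axis=0)

rng = np.random.default_rng(2)
original = rng.uniform(size=(8, 8))
code = encode_global(original)                       # semantic summary
output = generate_image(rng.normal(size=(8, 8)), code)
```

Because every patch is conditioned on the same global code, the tiled result stays semantically coherent even though each patch is generated independently.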
-
Patent number: 11936936
Abstract: A method including receiving video of an event; generating an overlay for the video; generating an information message containing information enabling a receiver of the video and the overlay to selectively display or hide the overlay; and transmitting the video, the overlay, and the information message. The video is transmitted in a primary stream of a multi-stream transmission including a primary stream and one or more auxiliary streams. The overlay is transmitted in a first one of the auxiliary streams.
Type: Grant
Filed: October 9, 2014
Date of Patent: March 19, 2024
Assignee: DISNEY ENTERPRISES, INC.
Inventors: Aljoscha Smolic, Nikolce Stefanoski, Oliver Wang
-
Patent number: 11930303
Abstract: Systems and techniques for automatic digital parameter adjustment are described that leverage insights learned from an image set to automatically predict parameter values for an input item of digital visual content. To do so, the automatic digital parameter adjustment techniques described herein capture visual and contextual features of digital visual content to determine balanced visual output in a range of visual scenes and settings. The visual and contextual features of digital visual content are used to train a parameter adjustment model through machine learning techniques that captures feature patterns and interactions. The parameter adjustment model exploits these feature interactions to determine visually pleasing parameter values for an input item of digital visual content. The predicted parameter values are output, allowing further adjustment to the parameter values.
Type: Grant
Filed: November 15, 2021
Date of Patent: March 12, 2024
Assignee: Adobe Inc.
Inventors: Pulkit Gera, Oliver Wang, Kalyan Krishna Sunkavalli, Elya Shechtman, Chetan Nanda
-
Patent number: 11908036
Abstract: The technology described herein is directed to a cross-domain training framework that iteratively trains a domain adaptive refinement agent to refine low quality real-world image acquisition data, e.g., depth maps, when accompanied by corresponding conditional data from other modalities, such as the underlying images or video from which the image acquisition data is computed. The cross-domain training framework includes a shared cross-domain encoder and two conditional decoder branch networks, e.g., a synthetic conditional depth prediction branch network and a real conditional depth prediction branch network. The shared cross-domain encoder converts synthetic and real-world image acquisition data into synthetic and real compact feature representations, respectively.
Type: Grant
Filed: September 28, 2020
Date of Patent: February 20, 2024
Assignee: Adobe Inc.
Inventors: Oliver Wang, Jianming Zhang, Dingzeyu Li, Zekun Hao
-
Patent number: 11900902
Abstract: Embodiments are disclosed for performing audio signal processing effects on an unprocessed audio sequence. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving an input including an unprocessed audio sequence and a request to perform an audio signal processing effect on the unprocessed audio sequence. The one or more embodiments further include analyzing, by a deep encoder, the unprocessed audio sequence to determine parameters for processing the unprocessed audio sequence. The one or more embodiments further include sending the unprocessed audio sequence and the parameters to one or more audio signal processing effects plugins to perform the requested audio signal processing effect using the parameters and outputting a processed audio sequence after processing of the unprocessed audio sequence using the parameters of the one or more audio signal processing effects plugins.
Type: Grant
Filed: April 12, 2021
Date of Patent: February 13, 2024
Assignee: Adobe Inc.
Inventors: Marco Antonio Martinez Ramirez, Nicholas J. Bryan, Oliver Wang, Paris Smaragdis
-
Publication number: 20240046430
Abstract: One or more processing devices access a scene depicting a reference object that includes an annotation identifying a target region to be modified in one or more video frames. The one or more processing devices determine that a target pixel corresponds to a sub-region within the target region that includes hallucinated content. The one or more processing devices determine gradient constraints using gradient values of neighboring pixels in the hallucinated content, the neighboring pixels being adjacent to the target pixel and corresponding to four cardinal directions. The one or more processing devices update color data of the target pixel subject to the determined gradient constraints.
Type: Application
Filed: September 29, 2023
Publication date: February 8, 2024
Inventors: Oliver Wang, John Nelson, Geoffrey Oxholm, Elya Shechtman
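Updating a pixel "subject to gradient constraints from its four cardinal neighbors" is the classic Poisson-style relaxation step. The sketch below shows one such step under that assumption; the actual update rule in the application may differ.

```python
import numpy as np

def poisson_step(color, grads, y, x):
    """One Gauss-Seidel relaxation step at (y, x): pull the pixel toward
    the values implied by each neighbor plus the desired gradient toward
    it (N, S, W, E = the four cardinal directions)."""
    n = color[y-1, x] + grads['N']
    s = color[y+1, x] + grads['S']
    w = color[y, x-1] + grads['W']
    e = color[y, x+1] + grads['E']
    return (n + s + w + e) / 4.0

color = np.zeros((3, 3))
color[0, 1], color[2, 1], color[1, 0], color[1, 2] = 1.0, 1.0, 1.0, 1.0
# Zero desired gradients: the pixel should match its neighbors' average.
updated = poisson_step(color, {'N': 0.0, 'S': 0.0, 'W': 0.0, 'E': 0.0}, 1, 1)
```

With nonzero desired gradients, repeated sweeps of this step propagate the hallucinated content's texture into the target region while matching its boundary colors.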
-
Patent number: 11893763
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for generating a modified digital image from extracted spatial and global codes. For example, the disclosed systems can utilize a global and spatial autoencoder to extract spatial codes and global codes from digital images. The disclosed systems can further utilize the global and spatial autoencoder to generate a modified digital image by combining extracted spatial and global codes in various ways for various applications such as style swapping, style blending, and attribute editing.
Type: Grant
Filed: November 22, 2022
Date of Patent: February 6, 2024
Assignee: Adobe Inc.
Inventors: Taesung Park, Richard Zhang, Oliver Wang, Junyan Zhu, Jingwan Lu, Elya Shechtman, Alexei A Efros
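The style-swapping application reduces to: factor each image into a spatial code and a global code, then recombine codes across images. The "encoder" and "decoder" below are deliberately trivial statistics-based stand-ins for the learned autoencoder, just to make the recombination concrete.

```python
import numpy as np

def encode(image):
    spatial = image - image.mean()                       # toy layout code
    global_code = np.array([image.mean(), image.std()])  # toy style code
    return spatial, global_code

def decode(spatial, global_code):
    return global_code[0] + spatial                      # toy recombination

rng = np.random.default_rng(3)
a = rng.uniform(size=(4, 4))          # image A: one "style"
b = rng.uniform(size=(4, 4)) + 5.0    # image B: a much brighter "style"
spatial_a, _ = encode(a)
_, global_b = encode(b)
swapped = decode(spatial_a, global_b)  # A's layout, B's style
```

Style blending follows the same pattern by interpolating two global codes before decoding.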
-
Patent number: 11880766
Abstract: An improved system architecture uses a pipeline including a Generative Adversarial Network (GAN) including a generator neural network and a discriminator neural network to generate an image. An input image in a first domain and information about a target domain are obtained. The domains correspond to image styles. An initial latent space representation of the input image is produced by encoding the input image. An initial output image is generated by processing the initial latent space representation with the generator neural network. Using the discriminator neural network, a score is computed indicating whether the initial output image is in the target domain. A loss is computed based on the computed score. The loss is minimized to compute an updated latent space representation. The updated latent space representation is processed with the generator neural network to generate an output image in the target domain.
Type: Grant
Filed: July 23, 2021
Date of Patent: January 23, 2024
Assignee: Adobe Inc.
Inventors: Cameron Smith, Ratheesh Kalarot, Wei-An Lin, Richard Zhang, Niloy Mitra, Elya Shechtman, Shabnam Ghadar, Zhixin Shu, Yannick Hold-Geoffroy, Nathan Carr, Jingwan Lu, Oliver Wang, Jun-Yan Zhu
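The latent-optimization loop in the abstract (encode, generate, score with the discriminator, descend the loss on the latent) can be sketched with linear toys. The generator `G` and discriminator weights `w_target` below are random stand-ins, not trained networks; the gradient is written out by hand because the toy is linear.

```python
import numpy as np

rng = np.random.default_rng(4)
G = rng.normal(size=(8, 6))        # toy generator: 6-D latent -> 8-D "image"
w_target = rng.normal(size=8)      # toy discriminator for the target domain

def discriminator_score(image):
    return float(w_target @ image)  # higher = more "in the target domain"

def invert(z0, steps=100, lr=0.01):
    z = z0.copy()
    for _ in range(steps):
        loss = -discriminator_score(G @ z)   # minimize: push the score up
        grad = -(G.T @ w_target)             # d(loss)/dz for this linear toy
        z -= lr * grad
    return z

z0 = rng.normal(size=6)     # stands in for the encoded initial latent
z_opt = invert(z0)          # updated latent; decoding it gives the output
```

In the real pipeline the gradient flows through the generator and discriminator by backpropagation, but the control flow is the same: only the latent representation is updated, never the network weights.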
-
Patent number: 11871145
Abstract: Embodiments are disclosed for video image interpolation. In some embodiments, video image interpolation includes receiving a pair of input images from a digital video, determining, using a neural network, a plurality of spatially varying kernels each corresponding to a pixel of an output image, convolving a first set of spatially varying kernels with a first input image from the pair of input images and a second set of spatially varying kernels with a second input image from the pair of input images to generate filtered images, and generating the output image by performing kernel normalization on the filtered images.
Type: Grant
Filed: April 6, 2021
Date of Patent: January 9, 2024
Assignee: Adobe Inc.
Inventors: Simon Niklaus, Oliver Wang, Long Mai
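The per-pixel kernel idea is concrete enough to sketch: each output pixel gets one small kernel per input frame, the two filtered responses are summed, and the sum is divided by the kernels' total weight (the kernel normalization step). Random kernels stand in for the network's predictions here.

```python
import numpy as np

K = 3  # kernel size

def interpolate(frame1, frame2, k1, k2):
    """k1, k2: one KxK kernel per output pixel for each input frame."""
    h, w = frame1.shape
    pad = K // 2
    f1 = np.pad(frame1, pad, mode='edge')
    f2 = np.pad(frame2, pad, mode='edge')
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            p1 = f1[y:y+K, x:x+K]                        # local patch, frame 1
            p2 = f2[y:y+K, x:x+K]                        # local patch, frame 2
            num = np.sum(k1[y, x] * p1) + np.sum(k2[y, x] * p2)
            den = np.sum(k1[y, x]) + np.sum(k2[y, x])    # kernel normalization
            out[y, x] = num / den
    return out

rng = np.random.default_rng(5)
h, w = 4, 4
frame1, frame2 = np.full((h, w), 2.0), np.full((h, w), 6.0)
k1 = rng.uniform(size=(h, w, K, K))   # stand-in for predicted kernels
k2 = rng.uniform(size=(h, w, K, K))
middle = interpolate(frame1, frame2, k1, k2)
```

The normalization guarantees each output pixel is a convex combination of input samples, so brightness never drifts outside the range spanned by the two frames.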
-
Patent number: 11854206
Abstract: A Video Semantic Segmentation System (VSSS) is disclosed that performs accurate and fast semantic segmentation of videos using a set of temporally distributed neural networks. The VSSS receives as input a video signal comprising a contiguous sequence of temporally-related video frames. The VSSS extracts features from the video frames in the contiguous sequence and based upon the extracted features, selects, from a set of labels, a label to be associated with each pixel of each video frame in the video signal. In certain embodiments, a set of multiple neural networks are used to extract the features to be used for video segmentation and the extraction of features is distributed among the multiple neural networks in the set. A strong feature representation representing the entirety of the features is produced for each video frame in the sequence of video frames by aggregating the output features extracted by the multiple neural networks.
Type: Grant
Filed: May 3, 2022
Date of Patent: December 26, 2023
Assignee: Adobe Inc.
Inventors: Federico Perazzi, Zhe Lin, Ping Hu, Oliver Wang, Fabian David Caba Heilbron
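The temporal distribution scheme can be sketched as a round-robin: each of N lightweight sub-networks covers one slice of the feature space, one sub-network runs per frame, and the most recent N partial features are aggregated into the full representation. The linear "sub-networks" and the per-frame labeling below are toy stand-ins.

```python
import numpy as np

N, D_IN, D_SLICE = 4, 8, 4
rng = np.random.default_rng(6)
subnets = [rng.normal(size=(D_SLICE, D_IN)) for _ in range(N)]

def segment_video(frames):
    cache = [np.zeros(D_SLICE) for _ in range(N)]
    labels = []
    for t, frame in enumerate(frames):
        cache[t % N] = subnets[t % N] @ frame   # run ONE sub-network per frame
        features = np.concatenate(cache)        # aggregate recent N outputs
        labels.append(int(features.sum() > 0))  # toy per-frame "labeling"
    return labels

frames = [rng.normal(size=D_IN) for _ in range(10)]
labels = segment_video(frames)
```

The per-frame cost is roughly 1/N of running a full feature extractor, which is where the "fast" claim comes from; the cache of recent outputs restores the strong full representation.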
-
Patent number: 11823357
Abstract: Certain aspects involve video inpainting in which content is propagated from a user-provided reference video frame to other video frames depicting a scene. One example method includes one or more processing devices that perform operations that include accessing a scene depicting a reference object that includes an annotation identifying a target region to be modified in one or more video frames. The operations also include computing a target motion of a target pixel that is subject to a motion constraint. The motion constraint is based on a three-dimensional model of the reference object. Further, operations include determining color data of the target pixel to correspond to the target motion. The color data includes a color value and a gradient. Operations also include determining gradient constraints using gradient values of neighbor pixels. Additionally, the processing devices update the color data of the target pixel subject to the gradient constraints.
Type: Grant
Filed: March 9, 2021
Date of Patent: November 21, 2023
Assignee: Adobe Inc.
Inventors: Oliver Wang, John Nelson, Geoffrey Oxholm, Elya Shechtman
-
Patent number: 11798180
Abstract: This disclosure describes one or more implementations of a depth prediction system that generates accurate depth images from single input digital images. In one or more implementations, the depth prediction system enforces different sets of loss functions across mixed data sources to generate a multi-branch architecture depth prediction model. For instance, in one or more implementations, the depth prediction model utilizes different data sources having different granularities of ground truth depth data to robustly train a depth prediction model. Further, given the different ground truth depth data granularities from the different data sources, the depth prediction model enforces different combinations of loss functions including an image-level normalized regression loss function and/or a pair-wise normal loss among other loss functions.
Type: Grant
Filed: February 26, 2021
Date of Patent: October 24, 2023
Assignee: Adobe Inc.
Inventors: Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Mai Long, Su Chen
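One of the named loss terms, an image-level normalized regression loss, is easy to illustrate: both depth maps are normalized (here by median and mean absolute deviation, a common choice for mixed-source depth training) before comparison, so sources whose ground truth is only known up to scale and shift can still supervise the model. This is a sketch under that assumption; the patent's exact normalization may differ.

```python
import numpy as np

def normalize_depth(d):
    med = np.median(d)
    scale = np.mean(np.abs(d - med)) + 1e-8   # mean absolute deviation
    return (d - med) / scale

def normalized_regression_loss(pred, gt):
    return np.mean(np.abs(normalize_depth(pred) - normalize_depth(gt)))

rng = np.random.default_rng(7)
gt = rng.uniform(1.0, 10.0, size=(8, 8))
# Invariant to the scale/shift ambiguity of some ground-truth sources:
loss_same = normalized_regression_loss(3.0 * gt + 1.0, gt)
loss_diff = normalized_regression_loss(rng.uniform(1.0, 10.0, (8, 8)), gt)
```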
-
Publication number: 20230306637
Abstract: Systems and methods for image dense field based view calibration are provided. In one embodiment, an input image is applied to a dense field machine learning model that generates a vertical vector dense field (VVF) and a latitude dense field (LDF) from the input image. The VVF comprises a vertical vector of a projected vanishing point direction for each of the pixels of the input image. The LDF comprises a projected latitude value for each of the pixels of the input image. A dense field map for the input image comprising the VVF and the LDF can be used directly or indirectly for a variety of image processing manipulations. The VVF and LDF can optionally be used to derive traditional camera calibration parameters from uncontrolled images that have undergone undocumented or unknown manipulations.
Type: Application
Filed: March 28, 2022
Publication date: September 28, 2023
Inventors: Jianming Zhang, Linyi Jin, Kevin Matzen, Oliver Wang, Yannick Hold-Geoffroy