Patents by Inventor Michael Rubinstein

Michael Rubinstein has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Re-timing objects in video via layered neural rendering

Patent number: 12243145

Abstract: A computer-implemented method for decomposing videos into multiple layers (212, 213) that can be re-combined with modified relative timings includes obtaining video data including a plurality of image frames (201) depicting one or more objects. For each of the plurality of frames, the computer-implemented method includes generating one or more object maps descriptive of a respective location of at least one object of the one or more objects within the image frame. For each of the plurality of frames, the computer-implemented method includes inputting the image frame and the one or more object maps into a machine-learned layer renderer model (220). For each of the plurality of frames, the computer-implemented method includes receiving, as output from the machine-learned layer renderer model, a background layer illustrative of a background of the video data and one or more object layers respectively associated with one of the one or more object maps.

Type: Grant

Filed: May 22, 2020

Date of Patent: March 4, 2025

Assignee: GOOGLE LLC

Inventors: Forrester H. Cole, Erika Lu, Tali Dekel, William T. Freeman, David Henry Salesin, Michael Rubinstein
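
As a rough illustration of the re-combination step described in the abstract above, the sketch below composites a background layer and object layers whose timing has been shifted. It is not taken from the patent: the standard "over" blending operator, the single-value "pixels", and the names `over`, `recombine`, and `offsets` are all assumptions made for illustration.

```python
def over(fg_color, fg_alpha, bg_color):
    """Standard 'over' blend for one pixel value (an assumed operator)."""
    return fg_alpha * fg_color + (1.0 - fg_alpha) * bg_color


def recombine(background, object_layers, offsets, t):
    """Composite output frame t from a background and retimed object layers.

    background:    list of per-frame pixel values
    object_layers: list of layers, each a list of (color, alpha) per frame
    offsets:       per-layer frame shift (hypothetical retiming control)
    """
    pixel = background[t]
    for layer, shift in zip(object_layers, offsets):
        # Clamp the shifted index so it stays inside the clip.
        src = min(max(t + shift, 0), len(layer) - 1)
        color, alpha = layer[src]
        pixel = over(color, alpha, pixel)
    return pixel


# Toy clip: one object that is opaque only at frame 1.
background = [0.2, 0.2, 0.2, 0.2]
runner = [(1.0, 0.0), (1.0, 1.0), (1.0, 0.0), (1.0, 0.0)]

original = [recombine(background, [runner], [0], t) for t in range(4)]
delayed = [recombine(background, [runner], [-1], t) for t in range(4)]
print(original)  # object appears at frame 1
print(delayed)   # same object, delayed by one frame
```

Shifting only the object layer's index while leaving the background untouched is what changes the relative timing; the layer decomposition itself, which the abstract attributes to a machine-learned model, is not modeled here.
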
AUDIO-VISUAL HEARING AID

Publication number: 20240428816

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device, in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of the first speaker in the current view and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device, receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device, and in response generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.

Type: Application

Filed: August 7, 2024

Publication date: December 26, 2024

Inventors: Anatoly Efros, Noam Etzion-Rosenberg, Tal Remez, Oran Lang, Inbar Mosseri, Israel Or Weinstein, Benjamin Schlesinger, Michael Rubinstein, Ariel Ephrat, Yukun Zhu, Stella Laurenzo, Amit Pitaru, Yossi Matias
DIFFUSION-GUIDED THREE-DIMENSIONAL RECONSTRUCTION

Publication number: 20240412458

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for editing images based on decoder-based accumulative score sampling (DASS) losses.

Type: Application

Filed: June 12, 2024

Publication date: December 12, 2024

Inventors: Varun Jampani, Chun-Han Yao, Amit Raj, Wei-Chih Hung, Ming-Hsuan Yang, Michael Rubinstein, Yuanzhen Li
Optimizing Generative Machine-Learned Models for Subject-Driven Text-to-3D Generation

Publication number: 20240320912

Abstract: A fractional training process can be performed with training images to an instance of a machine-learned generative image model to obtain a partially trained instance of the model. A fractional optimization process can be performed with the partially trained instance to an instance of a machine-learned three-dimensional (3D) implicit representation model to obtain a partially optimized instance of the model. Based on the plurality of training images, pseudo multi-view subject images can be generated with the partially optimized instance of the 3D implicit representation model and a fully trained instance of the generative image model. The partially trained instance of the model can be trained with a set of training data. The partially optimized instance of the machine-learned 3D implicit representation model can be trained with the machine-learned multi-view image model.

Type: Application

Filed: March 20, 2024

Publication date: September 26, 2024

Inventors: Yuanzhen Li, Amit Raj, Varun Jampani, Benjamin Joseph Mildenhall, Benjamin Michael Poole, Jonathan Tilton Barron, Kfir Aberman, Michael Niemeyer, Michael Rubinstein, Nataniel Ruiz Gutierrez, Shiran Elyahu Zada, Srinivas Kaza
PERSONALIZED TEXT-TO-IMAGE DIFFUSION MODEL

Publication number: 20240296596

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a text-to-image model so that the text-to-image model generates images that each depict a variable instance of an object class when the object class without the unique identifier is provided as a text input, and that generates images that each depict a same subject instance of the object class when the unique identifier is provided as the text input.

Type: Application

Filed: August 23, 2023

Publication date: September 5, 2024

Inventors: Kfir Aberman, Nataniel Ruiz Gutierrez, Michael Rubinstein, Yuanzhen Li, Yael Pritch Knaan, Varun Jampani
Audio-visual hearing aid

Patent number: 12073844

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device, in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of the first speaker in the current view and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device, receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device, and in response generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.

Type: Grant

Filed: October 1, 2020

Date of Patent: August 27, 2024

Assignee: Google LLC

Inventors: Anatoly Efros, Noam Etzion-Rosenberg, Tal Remez, Oran Lang, Inbar Mosseri, Israel Or Weinstein, Benjamin Schlesinger, Michael Rubinstein, Ariel Ephrat, Yukun Zhu, Stella Laurenzo, Amit Pitaru, Yossi Matias
Systems and Methods for Identifying and Extracting Object-Related Effects in Videos

Publication number: 20240249523

Abstract: The present disclosure provides systems and methods for identifying and extracting object-related effects in videos. Given an ordinary video and a rough segmentation mask over time of one or more subjects of interest, example systems proposed herein can estimate an omnimatte for each subject: an alpha matte and color image that includes the subject along with all its related time-varying scene elements. Example implementations of the proposed models can be trained only on the input video in a self-supervised manner, without any manual labels, and are generic. For example, the models can produce omnimattes automatically for arbitrary objects and a variety of effects.

Type: Application

Filed: May 11, 2022

Publication date: July 25, 2024

Inventors: Forrester H. Cole, Andrew Zisserman, Tali Dekel, William Tafel Freeman, Erika Lu, Michael Rubinstein
CROSSLINKING COMPOUNDS AND CROSSLINKED ACRYLIC POLYMERIC MATERIALS

Publication number: 20240043369

Abstract: Disclosed herein are cyclobutane-based crosslinking compounds that, when incorporated into acrylate-based polymeric materials, can produce toughened acrylate polymer networks. Also disclosed herein are polymers comprising the crosslinkers, methods of preparing toughened polymer networks using the crosslinkers, and methods of using the polymer networks.

Type: Application

Filed: June 28, 2023

Publication date: February 8, 2024

Inventors: Stephen L. Craig, Jeremiah A. Johnson, Shu Wang, Michael Rubinstein, Abraham Herzog-Arbeitman
Audio-visual speech separation

Patent number: 11894014

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: obtaining, for each frame in a stream of frames from a video in which faces of one or more speakers have been detected, a respective per-frame face embedding of the face of each speaker; processing, for each speaker, the per-frame face embeddings of the face of the speaker to generate visual features for the face of the speaker; obtaining a spectrogram of an audio soundtrack for the video; processing the spectrogram to generate an audio embedding for the audio soundtrack; combining the visual features for the one or more speakers and the audio embedding for the audio soundtrack to generate an audio-visual embedding for the video; determining a respective spectrogram mask for each of the one or more speakers; and determining a respective isolated speech spectrogram for each speaker.

Type: Grant

Filed: September 22, 2022

Date of Patent: February 6, 2024

Assignee: Google LLC

Inventors: Inbar Mosseri, Michael Rubinstein, Ariel Ephrat, William Freeman, Oran Lang, Kevin William Wilson, Tali Dekel, Avinatan Hassidim
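
The final two steps in the abstract above (a spectrogram mask per speaker, then an isolated speech spectrogram per speaker) can be illustrated with a toy example. The abstract does not say how the mask is applied; element-wise multiplication with the mixture spectrogram is assumed here, and the function name `apply_mask` and all values are hypothetical.

```python
def apply_mask(mixture, mask):
    """Element-wise product of a mask and a mixture spectrogram.

    Both arguments are lists of rows (time frames), each row a list of
    magnitudes per frequency bin. Multiplicative masking is an assumption.
    """
    return [
        [m * x for m, x in zip(mask_row, mix_row)]
        for mask_row, mix_row in zip(mask, mixture)
    ]


# Toy 2x2 mixture spectrogram: rows are time frames, columns are frequency bins.
mixture = [[4.0, 2.0], [1.0, 8.0]]

# Hypothetical masks for two speakers; here they are chosen to sum to one.
mask_a = [[1.0, 0.5], [0.0, 0.25]]
mask_b = [[1.0 - m for m in row] for row in mask_a]

speaker_a = apply_mask(mixture, mask_a)
speaker_b = apply_mask(mixture, mask_b)
print(speaker_a)
print(speaker_b)
```

In this toy setup the two isolated spectrograms add back up to the mixture, which is a property of the complementary masks chosen here and not something the abstract states.
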
Heart Rate and Respiratory Rate Measurements from Imagery

Publication number: 20230277069

Abstract: Generally, the present disclosure is directed to systems and methods for measuring heart rate and respiratory rate using a camera such as, for example, a smartphone camera or other consumer-grade camera. Specifically, the present disclosure presents and validates two algorithms that make use of smartphone cameras (or the like) for measuring heart rate (HR) and respiratory rate (RR) for consumer wellness use. As an example, HR can be measured by placing the finger of a subject over the rear-facing camera. As another example, RR can be measured via a video of the subject sitting still in front of the front-facing camera.

Type: Application

Filed: March 3, 2022

Publication date: September 7, 2023

Inventors: Jiening Zhan, Sean Kyungmok Bae, Silviu Borac, Yunus Emre, Jonathan Wesor Wang, Jiang Wu, Mehr Kashyap, Ming Jack Po, Liwen Chen, Melissa Chung, John Cannon, Eric Steven Teasley, James Alexander Taylor, Jr., Michael Vincent McConnell, Alejandra Maciel, Allen KC Chai, Shwetak Patel, Gregory Sean Corrado, Si-Hyuck Kang, Yun Liu, Michael Rubinstein, Michael Spencer Krainin, Neal Wadhwa
AUDIO-VISUAL HEARING AID

Publication number: 20230267942

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: receiving, by a user device, a first indication of one or more first speakers visible in a current view recorded by a camera of the user device, in response, generating a respective isolated speech signal for each of the one or more first speakers that isolates speech of the first speaker in the current view and sending the isolated speech signals for each of the one or more first speakers to a listening device operatively coupled to the user device, receiving, by the user device, a second indication of one or more second speakers visible in the current view recorded by the camera of the user device, and in response generating and sending a respective isolated speech signal for each of the one or more second speakers to the listening device.

Type: Application

Filed: October 1, 2020

Publication date: August 24, 2023

Inventors: Anatoly Efros, Noam Etzion-Rosenberg, Tal Remez, Oran Lang, Inbar Mosseri, Israel Or Weinstein, Benjamin Schlesinger, Michael Rubinstein, Ariel Ephrat, Yukun Zhu, Stella Laurenzo, Amit Pitaru, Yossi Matias
Re-Timing Objects in Video Via Layered Neural Rendering

Publication number: 20230206955

Abstract: A computer-implemented method for decomposing videos into multiple layers (212, 213) that can be re-combined with modified relative timings includes obtaining video data including a plurality of image frames (201) depicting one or more objects. For each of the plurality of frames, the computer-implemented method includes generating one or more object maps descriptive of a respective location of at least one object of the one or more objects within the image frame. For each of the plurality of frames, the computer-implemented method includes inputting the image frame and the one or more object maps into a machine-learned layer renderer model (220). For each of the plurality of frames, the computer-implemented method includes receiving, as output from the machine-learned layer renderer model, a background layer illustrative of a background of the video data and one or more object layers respectively associated with one of the one or more object maps.

Type: Application

Filed: May 22, 2020

Publication date: June 29, 2023

Inventors: Forrester H. Cole, Erika Lu, Tali Dekel, William T. Freeman, David Henry Salesin, Michael Rubinstein
AUDIO-VISUAL SPEECH SEPARATION

Publication number: 20230122905

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: obtaining, for each frame in a stream of frames from a video in which faces of one or more speakers have been detected, a respective per-frame face embedding of the face of each speaker; processing, for each speaker, the per-frame face embeddings of the face of the speaker to generate visual features for the face of the speaker; obtaining a spectrogram of an audio soundtrack for the video; processing the spectrogram to generate an audio embedding for the audio soundtrack; combining the visual features for the one or more speakers and the audio embedding for the audio soundtrack to generate an audio-visual embedding for the video; determining a respective spectrogram mask for each of the one or more speakers; and determining a respective isolated speech spectrogram for each speaker.

Type: Application

Filed: September 22, 2022

Publication date: April 20, 2023

Inventors: Inbar Mosseri, Michael Rubinstein, Ariel Ephrat, William Freeman, Oran Lang, Kevin William Wilson, Tali Dekel, Avinatan Hassidim
Deep Saliency Prior

Publication number: 20230015117

Abstract: Techniques for tuning an image editing operator for reducing a distractor in raw image data are presented herein. The image editing operator can access the raw image data and a mask. The mask can indicate a region of interest associated with the raw image data. The image editing operator can process the raw image data and the mask to generate processed image data. Additionally, a trained saliency model can process at least the processed image data within the region of interest to generate a saliency map that provides saliency values. Moreover, a saliency loss function can compare the saliency values provided by the saliency map for the processed image data within the region of interest to one or more target saliency values. Subsequently, the one or more parameter values of the image editing operator can be modified based at least in part on the saliency loss function.

Type: Application

Filed: July 1, 2022

Publication date: January 19, 2023

Inventors: Kfir Aberman, David Edward Jacobs, Kai Jochen Kohlhoff, Michael Rubinstein, Yossi Gandelsman, Junfeng He, Inbar Mosseri, Yael Pritch Knaan
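
The saliency loss described in the abstract above compares saliency values inside the region of interest with target values. A minimal sketch follows, assuming a mean squared error and a single scalar target; the abstract specifies neither, and the name `saliency_loss` and the sample values are hypothetical.

```python
def saliency_loss(saliency_map, roi_mask, target):
    """Mean squared difference between in-region saliency and a target.

    saliency_map: 2D list of saliency values for the processed image
    roi_mask:     2D list of 0/1 flags marking the region of interest
    target:       desired saliency inside the region (assumed scalar)
    """
    values = [
        s
        for sal_row, mask_row in zip(saliency_map, roi_mask)
        for s, inside in zip(sal_row, mask_row)
        if inside
    ]
    return sum((v - target) ** 2 for v in values) / len(values)


saliency_map = [[0.5, 0.1], [1.0, 0.2]]
roi_mask = [[1, 0], [1, 0]]  # left column is the distractor region

loss = saliency_loss(saliency_map, roi_mask, target=0.0)
print(loss)  # 0.625
```

A target of zero corresponds to driving saliency down inside the masked region; in the abstract, this loss value is what drives the update of the image editing operator's parameters, a step not shown here.
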
Analysis and visualization of subtle motions in videos

Patent number: 11526996

Abstract: Example embodiments allow for fast, efficient motion-magnification of video streams by decomposing image frames of the video stream into local phase information at multiple spatial scales and/or orientations. The phase information for each image frame is then scaled to magnify local motion and the scaled phase information is transformed back into image frames to generate a motion-magnified video stream. Scaling of the phase information can include temporal filtering of the phase information across image frames, for example, to magnify motion at a particular frequency. In some embodiments, temporal filtering of phase information at a frequency of breathing, cardiovascular pulse, or some other process of interest allows for motion-magnification of motions within the video stream corresponding to the breathing or the other particular process of interest. The phase information can also be used to determine time-varying motion signals corresponding to motions of interest within the video stream.

Type: Grant

Filed: June 20, 2019

Date of Patent: December 13, 2022

Assignee: Google LLC

Inventors: Michael Rubinstein, Derek Debusschere, Mike Krainin, Ce Liu
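
The phase-scaling idea in the abstract above can be sketched for a single complex coefficient tracked across frames. This is an illustration under stated assumptions, not the patented method: the phase change is measured relative to the first frame, the multiplier `gain` is hypothetical, and the multi-scale decomposition and temporal filtering mentioned in the abstract are omitted.

```python
import cmath


def magnify_phase(coeffs, gain):
    """Scale each frame's phase change relative to the first frame.

    coeffs: one complex coefficient per frame (a stand-in for local phase
            information at one scale, orientation, and location)
    gain:   hypothetical multiplier applied to the phase change
    """
    ref_phase = cmath.phase(coeffs[0])
    result = []
    for c in coeffs:
        delta = cmath.phase(c) - ref_phase
        # Wrap the phase change into (-pi, pi] before scaling.
        delta = cmath.phase(cmath.exp(1j * delta))
        result.append(abs(c) * cmath.exp(1j * (ref_phase + gain * delta)))
    return result


# A small phase shift of 0.05 rad between two frames, amplitude 2.0.
frames = [cmath.rect(2.0, 0.10), cmath.rect(2.0, 0.15)]
magnified = magnify_phase(frames, gain=4.0)

print(cmath.phase(magnified[0]))  # reference frame keeps its phase (about 0.10)
print(cmath.phase(magnified[1]))  # phase change scaled by 4 (about 0.30)
```

Because a local phase shift corresponds to a local displacement, scaling the phase change while keeping the amplitude fixed exaggerates the motion once the coefficients are transformed back into image frames.
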
Adaptive glare removal and/or color correction

Patent number: 11483463

Abstract: Some implementations relate to determining whether glare is present in captured image(s) of an object (e.g., a photo) and/or to determining one or more attributes of any present glare. Some of those implementations further relate to adapting one or more parameters for a glare removal process based on whether the glare is determined to be present and/or based on one or more of the determined attributes of any glare determined to be present. Some additional and/or alternative implementations disclosed herein relate to correcting color of a flash image of an object (e.g., a photo). The flash image is based on one or more images captured by a camera of a client device with a flash component of the client device activated. In various implementations, correcting the color of the flash image is based on a determined color space of an ambient image of the object.

Type: Grant

Filed: May 26, 2020

Date of Patent: October 25, 2022

Assignee: Google LLC

Inventors: Julia Winn, Abraham Stephens, Daniel Pettigrew, Aaron Maschinot, Ce Liu, Michael Krainin, Michael Rubinstein, Jingyu Cui
Audio-visual speech separation

Patent number: 11456005

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for audio-visual speech separation. A method includes: obtaining, for each frame in a stream of frames from a video in which faces of one or more speakers have been detected, a respective per-frame face embedding of the face of each speaker; processing, for each speaker, the per-frame face embeddings of the face of the speaker to generate visual features for the face of the speaker; obtaining a spectrogram of an audio soundtrack for the video; processing the spectrogram to generate an audio embedding for the audio soundtrack; combining the visual features for the one or more speakers and the audio embedding for the audio soundtrack to generate an audio-visual embedding for the video; determining a respective spectrogram mask for each of the one or more speakers; and determining a respective isolated speech spectrogram for each speaker.

Type: Grant

Filed: November 21, 2018

Date of Patent: September 27, 2022

Assignee: Google LLC

Inventors: Inbar Mosseri, Michael Rubinstein, Ariel Ephrat, William Freeman, Oran Lang, Kevin William Wilson, Tali Dekel, Avinatan Hassidim
Taking photos through visual obstructions

Patent number: 11050948

Abstract: The present disclosure relates to systems and methods for image capture. Namely, an image capture system may include a camera configured to capture images of a field of view, a display, and a controller. An initial image of the field of view from an initial camera pose may be captured. An obstruction may be determined to be observable in the field of view. Based on the obstruction, at least one desired camera pose may be determined. The at least one desired camera pose includes at least one desired position of the camera. A capture interface may be displayed, which may include instructions for moving the camera to the at least one desired camera pose. At least one further image of the field of view from the at least one desired camera pose may be captured. Captured images may be processed to remove the obstruction from a background image.

Type: Grant

Filed: October 21, 2019

Date of Patent: June 29, 2021

Assignee: Google LLC

Inventors: Michael Rubinstein, William Freeman, Ce Liu
Devices and systems with fluidic nanofunnels for processing single molecules

Patent number: 10996212

Abstract: Methods of forming a chip with fluidic channels include forming (e.g., milling) at least one nanofunnel with a wide end and a narrow end into a planar substrate, the nanofunnel having a length, with width and depth dimensions that both vary over its length and forming (e.g., milling) at least one nanochannel into the planar substrate at an interface adjacent the narrow end of the nanofunnel.

Type: Grant

Filed: May 2, 2018

Date of Patent: May 4, 2021

Assignee: The University of North Carolina at Chapel Hill

Inventors: John Michael Ramsey, Laurent Menard, Jinsheng Zhou, Michael Rubinstein, Sergey Panyukov
ANALYSIS AND VISUALIZATION OF SUBTLE MOTIONS IN VIDEOS

Publication number: 20210110549

Abstract: Example embodiments allow for fast, efficient motion-magnification of video streams by decomposing image frames of the video stream into local phase information at multiple spatial scales and/or orientations. The phase information for each image frame is then scaled to magnify local motion and the scaled phase information is transformed back into image frames to generate a motion-magnified video stream. Scaling of the phase information can include temporal filtering of the phase information across image frames, for example, to magnify motion at a particular frequency. In some embodiments, temporal filtering of phase information at a frequency of breathing, cardiovascular pulse, or some other process of interest allows for motion-magnification of motions within the video stream corresponding to the breathing or the other particular process of interest. The phase information can also be used to determine time-varying motion signals corresponding to motions of interest within the video stream.

Type: Application

Filed: June 20, 2019

Publication date: April 15, 2021

Inventors: Michael RUBINSTEIN, Derek DEBUSSCHERE, Mike KRAININ, Ce LIU
