Patents by Inventor Peyman Milanfar

Peyman Milanfar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240331091
    Abstract: The technology provides an image resizer that is jointly trainable with neural network classification (recognition) models and is designed to improve classification performance. Systems and methods include applying an input image to a baseline resizer to obtain a default resized image, and applying the input image to a plurality of filters. Each respective filter in the plurality is configured to perform sub-band filtering on the input image to obtain a sub-band filtered result. Each sub-band filtered result is applied to the baseline resizer to obtain a respective resized result, to which a scaling parameter, a bias parameter, and a nonlinear function are applied to obtain a respective filtered image. The process then combines the default resized image and the respective filtered images to generate a combined resized image.
    Type: Application
    Filed: March 29, 2024
    Publication date: October 3, 2024
    Inventors: Hossein Talebi, Zhengzhong Tu, Peyman Milanfar
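    A minimal sketch of this structure, assuming a nearest-neighbor baseline resizer, fixed convolution kernels standing in for the learned sub-band filters, and tanh as the nonlinearity; the names and parameter choices here are illustrative, not from the patent:
    ```python
    import numpy as np

    def baseline_resize(img, out_h, out_w):
        """Nearest-neighbor baseline resizer (stand-in for any fixed resizer)."""
        h, w = img.shape
        rows = np.arange(out_h) * h // out_h
        cols = np.arange(out_w) * w // out_w
        return img[rows][:, cols]

    def subband_filter(img, kernel):
        """'Same'-padded 2D convolution implementing one sub-band filter."""
        kh, kw = kernel.shape
        p = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
        out = np.zeros(img.shape)
        for i in range(img.shape[0]):
            for j in range(img.shape[1]):
                out[i, j] = np.sum(p[i:i + kh, j:j + kw] * kernel)
        return out

    def combined_resize(img, out_h, out_w, kernels, scales, biases):
        """Default resized image plus scaled, biased, squashed sub-band branches."""
        combined = baseline_resize(img, out_h, out_w).astype(float)  # default branch
        for k, s, b in zip(kernels, scales, biases):
            band = subband_filter(img, k)                   # sub-band filtering
            resized = baseline_resize(band, out_h, out_w)   # resize the filtered band
            combined += np.tanh(s * resized + b)            # scale, bias, nonlinearity
        return combined                                     # combined resized image
    ```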
  • Patent number: 12010440
    Abstract: The present disclosure describes systems and techniques directed to optical image stabilization movement to create a super-resolution image of a scene. The systems and techniques include a user device (102) introducing (502), through an optical image stabilization system (114), movement to one or more components of a camera system (112) of the user device (102). The user device (102) then captures (504) multiple frames (306) of an image of a scene, where the multiple frames (306) have respective, sub-pixel offsets of the image across the frames as a result of the introduced movement to the one or more components of the camera system (112). The user device (102) performs (506), based on the respective, sub-pixel offsets across the multiple frames (306), super-resolution computations and creates (508) the super-resolution image of the scene based on those computations.
    Type: Grant
    Filed: March 15, 2023
    Date of Patent: June 11, 2024
    Assignee: Google LLC
    Inventors: Yi Hung Chen, Chia-Kai Liang, Bartlomiej Maciej Wronski, Peyman Milanfar, Ignacio Garcia Dorado
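    The central idea reduces to a toy sketch: frames captured under known OIS-induced sub-pixel shifts can be scattered onto a finer grid and averaged. The 2x factor, known offsets, and plain averaging are assumptions for illustration; the patented pipeline is considerably more involved:
    ```python
    import numpy as np

    def merge_shifted_frames(frames, offsets, factor=2):
        """Scatter low-res frames onto a grid `factor`x finer and average.

        frames  -- list of (H, W) arrays captured under OIS-induced movement
        offsets -- per-frame (dy, dx) sub-pixel shifts, in low-res pixels
        """
        h, w = frames[0].shape
        acc = np.zeros((h * factor, w * factor))
        cnt = np.zeros_like(acc)
        for frame, (dy, dx) in zip(frames, offsets):
            ry = int(round(dy * factor)) % factor   # nearest high-res grid phase
            rx = int(round(dx * factor)) % factor
            acc[ry::factor, rx::factor] += frame
            cnt[ry::factor, rx::factor] += 1.0
        return acc / np.maximum(cnt, 1.0)           # unsampled cells remain zero
    ```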
  • Publication number: 20240187715
    Abstract: An example embodiment may involve capturing a sequence of images, wherein there are 4 or more images in the sequence, and wherein each image in the sequence has an exposure length of 4-100 seconds; applying a sliding window over the sequence of images as downsampled, wherein at least 4 images are encompassed within the sliding window, and wherein for each position of the sliding window the applying involves: (i) aligning a set of images within the sliding window, and (ii) merging the set of images as aligned into a video frame; combining video frames generated by way of the sliding window into a video file; and storing, by the mobile device, the video file in memory of the mobile device.
    Type: Application
    Filed: May 19, 2021
    Publication date: June 6, 2024
    Inventors: Ignacio Garcia Dorado, Shambhavi Punja, Peyman Milanfar, Kiran Murthy, Janne Kontkanen, Isaac Reynolds, Damien Kelly, Alexander Schiffhauer
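    The sliding-window loop is simple enough to sketch; alignment and merging are placeholders here (identity and mean), and the window size of 4 matches the minimum stated in the abstract:
    ```python
    import numpy as np

    def align(images):
        """Placeholder alignment: assumes the exposures are already registered."""
        return images

    def merge(images):
        """Merge an aligned window of images into one video frame by averaging."""
        return np.mean(images, axis=0)

    def images_to_video_frames(images, window=4):
        """Slide a window over the (downsampled) sequence; one frame per position."""
        frames = []
        for start in range(len(images) - window + 1):
            aligned = align(images[start:start + window])
            frames.append(merge(aligned))
        return frames  # these frames would then be encoded into a video file
    ```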
  • Publication number: 20240169498
    Abstract: Systems and methods for real-time image deblur and stabilization can utilize sensor data for estimating motion blur without the high computational cost of image analysis techniques. The estimated motion blur can then be utilized to generate a motion blur kernel for image correction. The systems and methods can further refine the correction by processing the motion blur kernel with a polynomial filter to generate a sharpening kernel. The systems and methods can provide for real-time correction even with minimal to no stabilization masking.
    Type: Application
    Filed: July 22, 2021
    Publication date: May 23, 2024
    Inventors: Fuhao Shi, Mauricio Delbracio, Chia-Kai Liang, Damien Martin Kelly, Peyman Milanfar
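    One plausible reading of "processing the motion blur kernel with a polynomial filter" is an approximate inverse via a truncated Neumann series, K⁻¹ ≈ 3I − 3K + K²; the sketch below builds such a sharpening kernel under that assumption (the patent's actual coefficients and sensor-based kernel estimation are not shown):
    ```python
    import numpy as np

    def convolve_kernels(a, b):
        """Full 2D convolution of two small kernels."""
        ha, wa = a.shape
        hb, wb = b.shape
        out = np.zeros((ha + hb - 1, wa + wb - 1))
        for i in range(ha):
            for j in range(wa):
                out[i:i + hb, j:j + wb] += a[i, j] * b
        return out

    def identity_kernel(shape):
        """Delta kernel of a given (odd, odd) shape."""
        k = np.zeros(shape)
        k[shape[0] // 2, shape[1] // 2] = 1.0
        return k

    def sharpening_kernel(blur, coeffs=(3.0, -3.0, 1.0)):
        """Build p(K) = c0*I + c1*K + c2*(K*K) from an estimated blur kernel K."""
        k2 = convolve_kernels(blur, blur)               # K * K (kernel squared)
        pad_y = (k2.shape[0] - blur.shape[0]) // 2
        pad_x = (k2.shape[1] - blur.shape[1]) // 2
        k1 = np.pad(blur, ((pad_y, pad_y), (pad_x, pad_x)))
        return coeffs[0] * identity_kernel(k2.shape) + coeffs[1] * k1 + coeffs[2] * k2
    ```
    Convolving the blurry frame once with the resulting kernel approximates deblurring in a single pass, which is what makes a real-time correction path plausible.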
  • Publication number: 20240119560
    Abstract: Described examples relate to an apparatus comprising one or more image sensors coupled to a vehicle and at least one processor. The at least one processor may be configured to capture, in a burst sequence using the one or more image sensors, multiple frames of an image of a scene, the multiple frames having respective, relative offsets of the image across the frames, and to perform super-resolution computations using the captured frames. The at least one processor may also be configured to accumulate, based on the super-resolution computations, color planes, and to combine the accumulated color planes to create a super-resolution image of the scene.
    Type: Application
    Filed: December 6, 2023
    Publication date: April 11, 2024
    Inventors: Ignacio Garcia Dorado, Damien Kelly, Xiaoying He, Jia Feng, Bartlomiej Wronski, Peyman Milanfar, Lucian Ion
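    The accumulate-then-combine step is the same multi-frame merge idea applied per color plane. This sketch assumes RGB frames and known phases on the output grid; a real pipeline starts from raw color planes and aligns the burst first:
    ```python
    import numpy as np

    def accumulate_and_combine(frames, offsets, factor=2):
        """frames: list of (H, W, 3) burst captures; offsets: (dy, dx) per frame."""
        h, w, _ = frames[0].shape
        planes = np.zeros((h * factor, w * factor, 3))   # accumulated color planes
        counts = np.zeros_like(planes)
        for frame, (dy, dx) in zip(frames, offsets):
            ry = int(round(dy * factor)) % factor
            rx = int(round(dx * factor)) % factor
            planes[ry::factor, rx::factor] += frame      # accumulate color planes
            counts[ry::factor, rx::factor] += 1.0
        return planes / np.maximum(counts, 1.0)          # combine into one image
    ```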
  • Publication number: 20240119555
    Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints on fixed input image size and predicts quality effectively on a native-resolution image. A native-resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multi-scale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.
    Type: Application
    Filed: December 4, 2023
    Publication date: April 11, 2024
    Inventors: Junjie Ke, Feng Yang, Qifei Wang, Yilin Wang, Peyman Milanfar
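    The token construction can be sketched directly from the abstract: patch features from each scale receive a spatial embedding hashed to a shared fixed grid plus a per-scale embedding, and a learnable classification token is prepended before self-attention. The grid size, embedding width, and use of three scales are assumptions here:
    ```python
    import numpy as np

    G, D = 10, 16                              # fixed grid size, embedding width
    rng = np.random.default_rng(0)
    spatial_emb = rng.normal(size=(G, G, D))   # one embedding per grid cell
    scale_emb = rng.normal(size=(3, D))        # one embedding per scale
    cls_token = rng.normal(size=(1, D))        # learnable classification token

    def tokens_for_scale(patch_feats, n_rows, n_cols, scale_idx):
        """patch_feats: (n_rows * n_cols, D) patch features from one scale."""
        toks = []
        for idx, feat in enumerate(patch_feats):
            r, c = divmod(idx, n_cols)
            gr = r * G // n_rows               # hash patch location to the fixed grid
            gc = c * G // n_cols
            toks.append(feat + spatial_emb[gr, gc] + scale_emb[scale_idx])
        return np.stack(toks)

    def build_input_tokens(per_scale_patches):
        """per_scale_patches: list of (patch_feats, n_rows, n_cols), one per scale."""
        all_toks = [tokens_for_scale(p, r, c, s)
                    for s, (p, r, c) in enumerate(per_scale_patches)]
        return np.concatenate([cls_token] + all_toks)  # self-attention runs on these
    ```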
  • Patent number: 11887270
    Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints on fixed input image size and predicts quality effectively on a native-resolution image. A native-resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multi-scale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.
    Type: Grant
    Filed: July 1, 2021
    Date of Patent: January 30, 2024
    Assignee: Google LLC
    Inventors: Junjie Ke, Feng Yang, Qifei Wang, Yilin Wang, Peyman Milanfar
  • Patent number: 11880902
    Abstract: Described examples relate to an apparatus comprising one or more image sensors coupled to a vehicle and at least one processor. The at least one processor may be configured to capture, in a burst sequence using the one or more image sensors, multiple frames of an image of a scene, the multiple frames having respective, relative offsets of the image across the frames, and to perform super-resolution computations using the captured frames. The at least one processor may also be configured to accumulate, based on the super-resolution computations, color planes, and to combine the accumulated color planes to create a super-resolution image of the scene.
    Type: Grant
    Filed: December 30, 2020
    Date of Patent: January 23, 2024
    Assignee: Waymo LLC
    Inventors: Ignacio Garcia-Dorado, Damien Kelly, Xiaoying He, Jia Feng, Bartlomiej Wronski, Peyman Milanfar, Lucian Ion
  • Publication number: 20240022760
    Abstract: Example aspects of the present disclosure are directed to systems and methods that feature a machine-learned video super-resolution (VSR) model trained using a bi-directional training approach. In particular, the present disclosure provides a compression-informed (e.g., compression-aware) super-resolution model that can perform well on real-world videos with different levels of compression. Specifically, example models described herein can include three modules to robustly restore the missing information caused by video compression. First, a bi-directional recurrent module can be used to reduce the warping error accumulated from the random locations of intra-frames in compressed video. Second, a detail-aware flow estimation module can be added to enable recovery of high-resolution (HR) flow from compressed low-resolution (LR) frames. Finally, a Laplacian enhancement module can add high-frequency information to the warped HR frames washed out by video encoding.
    Type: Application
    Filed: August 5, 2021
    Publication date: January 18, 2024
    Inventors: Yinxiao Li, Peyman Milanfar, Feng Yang, Ce Liu, Ming-Hsuan Yang, Pengchong Jin
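    Of the three modules, the Laplacian enhancement is the easiest to illustrate: re-inject the high-frequency band that video encoding washed out of a warped HR frame. The 3x3 box blur and fixed gain below are assumptions; the patent's module is learned:
    ```python
    import numpy as np

    def box_blur(img):
        """Cheap 3x3 box blur with edge padding (stand-in for a Gaussian)."""
        p = np.pad(img, 1, mode="edge")
        return sum(p[i:i + img.shape[0], j:j + img.shape[1]]
                   for i in range(3) for j in range(3)) / 9.0

    def laplacian_enhance(warped_hr, gain=0.5):
        """Add back high-frequency detail washed out by video encoding."""
        laplacian = warped_hr - box_blur(warped_hr)   # high-frequency band
        return warped_hr + gain * laplacian
    ```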
  • Publication number: 20240020788
    Abstract: Systems and methods of the present disclosure are directed to a computing system. The computing system can obtain a message vector and video data comprising a plurality of video frames. The computing system can process the video data with a transformation portion of a machine-learned watermark encoding model to obtain a three-dimensional feature encoding of the video. The computing system can process the three-dimensional feature encoding and the message vector with an embedding portion of the machine-learned watermark encoding model to obtain spatial-temporal watermark encoding data descriptive of the message vector. The computing system can generate encoded video data comprising a plurality of encoded video frames, wherein at least one of the encoded video frames includes the spatial-temporal watermark encoding data.
    Type: Application
    Filed: March 24, 2021
    Publication date: January 18, 2024
    Inventors: Xiyang Luo, Feng Yang, Ce Liu, Huiwen Chang, Peyman Milanfar, Yinxiao Li
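    A schematic of the two-stage encoder described above, with linear placeholders for the learned transformation and embedding portions; the carrier construction and strength value are assumptions for the sketch:
    ```python
    import numpy as np

    def transform_portion(video):
        """Placeholder 3D feature encoding of a (T, H, W) video (stand-in for a CNN)."""
        return video - video.mean()

    def embed_portion(features, message, strength=0.01):
        """Map features + message vector to a spatial-temporal watermark residual."""
        t, h, w = features.shape
        rng = np.random.default_rng(42)                     # carriers shared with decoder
        carrier = rng.normal(size=(message.size, t, h, w))  # one 3D pattern per bit
        signs = 2.0 * message - 1.0                         # {0,1} -> {-1,+1}
        watermark = np.tensordot(signs, carrier, axes=1)    # sum of signed carriers
        return strength * watermark * (1.0 + np.abs(features))

    def encode_video(video, message):
        feats = transform_portion(video)                    # 3D feature encoding
        return video + embed_portion(feats, message)        # encoded video frames
    ```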
  • Publication number: 20240013350
    Abstract: Systems, apparatus, and methods are presented for deblurring images. One method includes receiving an image and estimating blur for the image. The method also includes applying a deblurring filter to the image and reducing halo from the image.
    Type: Application
    Filed: November 15, 2021
    Publication date: January 11, 2024
    Inventors: Mauricio DELBRACIO, Peyman MILANFAR
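    The three steps named in the abstract fit in a short sketch: estimate blur, apply a deblurring filter, then suppress halos. The estimator, unsharp-mask filter, and local-range clamp below are simplifications chosen for illustration:
    ```python
    import numpy as np

    def estimate_blur(img):
        """Crude blur estimate: weaker mean gradient magnitude -> more blur."""
        gy, gx = np.gradient(img)
        return 1.0 / (np.mean(np.hypot(gx, gy)) + 1e-6)

    def deblur_filter(img, amount=0.8):
        """Simple deblurring filter: unsharp masking with a 3x3 box blur."""
        p = np.pad(img, 1, mode="edge")
        blur = sum(p[i:i + img.shape[0], j:j + img.shape[1]]
                   for i in range(3) for j in range(3)) / 9.0
        return img + amount * (img - blur)

    def reduce_halo(deblurred, original):
        """Clamp each pixel to the local min/max of the input, removing overshoot."""
        p = np.pad(original, 1, mode="edge")
        shifts = [p[i:i + original.shape[0], j:j + original.shape[1]]
                  for i in range(3) for j in range(3)]
        lo, hi = np.minimum.reduce(shifts), np.maximum.reduce(shifts)
        return np.clip(deblurred, lo, hi)
    ```
    In a real pipeline the blur estimate would drive the filter strength; the pieces are shown independently here.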
  • Publication number: 20230319327
    Abstract: Methods, systems, and media for determining perceptual quality indicators of video content items are provided.
    Type: Application
    Filed: June 8, 2022
    Publication date: October 5, 2023
    Inventors: Yilin Wang, Balineedu Adsumilli, Junjie Ke, Hossein Talebi, Joong Yim, Neil Birkbeck, Peyman Milanfar, Feng Yang
  • Publication number: 20230267307
    Abstract: Systems and methods of the present disclosure are directed to a method for generating a machine-learned multitask model configured to perform multiple tasks. The method can include obtaining a machine-learned multitask search model comprising candidate nodes. The method can include obtaining tasks and machine-learned task controller models associated with the tasks. As an example, for a task, the method can include using the associated task controller model to route a subset of the candidate nodes into a machine-learned task submodel for that task. The method can include inputting task input data to the task submodel to obtain a task output. The method can include generating, using the task output, a feedback value based on an objective function. The method can include adjusting parameters of the task controller model based on the feedback value.
    Type: Application
    Filed: July 23, 2020
    Publication date: August 24, 2023
    Inventors: Qifei Wang, Junjie Ke, Grace Chu, Gabriel Mintzer Bender, Luciano Sbaiz, Feng Yang, Andrew Gerald Howard, Alec Michael Go, Jeffrey M. Gilbert, Peyman Milanfar, Joshua William Charles Greaves
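    A toy version of the routing loop: each task controller samples a subset of candidate nodes, the routed submodel is evaluated, and the controller's parameters are adjusted using the feedback value. The REINFORCE-style update and toy objective are assumptions, not the patent's method:
    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    N_NODES = 8
    candidate_nodes = [lambda x, w=rng.normal(): np.tanh(w * x) for _ in range(N_NODES)]

    class TaskController:
        def __init__(self):
            self.logits = np.zeros(N_NODES)                 # routing parameters

        def sample_route(self):
            probs = 1.0 / (1.0 + np.exp(-self.logits))      # per-node keep probability
            return rng.random(N_NODES) < probs, probs

        def update(self, route, probs, feedback, lr=0.1):
            grad = route.astype(float) - probs              # d log p / d logits
            self.logits += lr * feedback * grad             # ascend expected feedback

    def run_submodel(route, x):
        """Apply only the routed subset of candidate nodes, in order."""
        for keep, node in zip(route, candidate_nodes):
            if keep:
                x = node(x)
        return x

    controller = TaskController()                           # one controller per task
    for step in range(100):
        route, probs = controller.sample_route()            # route a node subset
        output = run_submodel(route, x=1.0)                 # task submodel output
        feedback = -(output - 0.5) ** 2                     # toy objective function
        controller.update(route, probs, feedback)           # adjust the controller
    ```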
  • Publication number: 20230224596
    Abstract: The present disclosure describes systems and techniques directed to optical image stabilization movement to create a super-resolution image of a scene. The systems and techniques include a user device (102) introducing (502), through an optical image stabilization system (114), movement to one or more components of a camera system (112) of the user device (102). The user device (102) then captures (504) multiple frames (306) of an image of a scene, where the multiple frames (306) have respective, sub-pixel offsets of the image across the frames as a result of the introduced movement to the one or more components of the camera system (112). The user device (102) performs (506), based on the respective, sub-pixel offsets across the multiple frames (306), super-resolution computations and creates (508) the super-resolution image of the scene based on those computations.
    Type: Application
    Filed: March 15, 2023
    Publication date: July 13, 2023
    Inventors: Yi Hung Chen, Chia-Kai Liang, Bartlomiej Maciej Wronski, Peyman Milanfar, Ignacio Garcia Dorado
  • Publication number: 20230222623
    Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints on fixed input image size and predicts quality effectively on a native-resolution image. A native-resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multi-scale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.
    Type: Application
    Filed: July 1, 2021
    Publication date: July 13, 2023
    Inventors: Junjie Ke, Feng Yang, Qifei Wang, Yilin Wang, Peyman Milanfar
  • Publication number: 20230111326
    Abstract: Methods, systems, and computer programs encoded on a computer storage medium, that relate to extracting digital watermarks from images, irrespective of distortions introduced into these images. Methods can include inputting a first data item into a channel encoder that can generate a first encoded data item that is greater in length than the first data item and that (1) includes the input data item and (2) new data that is redundant of the input data item. Based on the first encoded data item and a first image, an encoder model can generate a first encoded image into which the first encoded data item is embedded as a digital watermark. A decoder model can decode the first encoded image to generate a second data item, which can be decoded by the channel decoder to generate data that is predicted to be the first data item.
    Type: Application
    Filed: January 13, 2020
    Publication date: April 13, 2023
    Inventors: Ruohan Zhan, Feng Yang, Xiyang Luo, Peyman Milanfar, Huiwen Chang, Ce Liu
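    The channel encoder/decoder pair can be illustrated with the simplest possible redundancy, a 3x repetition code: the encoded item is longer than the input, and the redundant bits let majority voting correct flips introduced by image distortion. A real system would use a stronger code:
    ```python
    import numpy as np

    def channel_encode(bits, repeat=3):
        """Repeat each bit; output is `repeat` times longer than the input."""
        return np.repeat(bits, repeat)

    def channel_decode(noisy_bits, repeat=3):
        """Majority vote over each group of repeated bits."""
        groups = noisy_bits.reshape(-1, repeat)
        return (groups.sum(axis=1) > repeat / 2).astype(int)

    message = np.array([1, 0, 1, 1])
    encoded = channel_encode(message)      # this is what the watermark embeds
    encoded[2] ^= 1                        # one bit corrupted by image distortion
    assert np.array_equal(channel_decode(encoded), message)
    ```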
  • Patent number: 11611697
    Abstract: The present disclosure describes systems and techniques directed to optical image stabilization movement to create a super-resolution image of a scene. The systems and techniques include a user device (102) introducing (502), through an optical image stabilization system (114), movement to one or more components of a camera system (112) of the user device (102). The user device (102) then captures (504) multiple frames (306) of an image of a scene, where the multiple frames (306) have respective, sub-pixel offsets of the image across the frames as a result of the introduced movement to the one or more components of the camera system (112). The user device (102) performs (506), based on the respective, sub-pixel offsets across the multiple frames (306), super-resolution computations and creates (508) the super-resolution image of the scene based on those computations.
    Type: Grant
    Filed: August 6, 2019
    Date of Patent: March 21, 2023
    Assignee: Google LLC
    Inventors: Yi Hung Chen, Chia-Kai Liang, Bartlomiej Maciej Wronski, Peyman Milanfar, Ignacio Garcia Dorado
  • Publication number: 20220415039
    Abstract: A trained model is retrained for video quality assessment and used to identify sets of adaptive compression parameters for transcoding user-generated video content. Using transfer learning, the model, which is initially trained for image object detection, is retrained for technical content assessment and then retrained again for video quality assessment. The model is then deployed into a transcoding pipeline and used for transcoding an input video stream of user-generated content. The transcoding pipeline may be structured in one of several ways. In one example, a secondary pathway for video content analysis using the model is introduced into the pipeline; this pathway does not interfere with the ultimate output of the transcoding should there be a network or other issue. In another example, the model is introduced as a library within the existing pipeline, which maintains a single pathway and is not expected to introduce significant latency.
    Type: Application
    Filed: November 26, 2019
    Publication date: December 29, 2022
    Inventors: Yilin Wang, Hossein Talebi, Peyman Milanfar, Feng Yang, Balineedu Adsumilli
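    The "secondary pathway" deployment option can be sketched as quality analysis running off the critical path, so a model or network failure never blocks the transcoded output; the transcode and scoring functions are placeholders:
    ```python
    import concurrent.futures

    def transcode(stream: bytes) -> bytes:
        """Primary pathway: produce the transcoded output (placeholder)."""
        return stream[::-1]

    def assess_quality(stream: bytes) -> float:
        """Secondary pathway: the retrained VQA model scores content (placeholder)."""
        return (len(stream) % 100) / 100.0

    def transcode_with_analysis(stream: bytes) -> bytes:
        with concurrent.futures.ThreadPoolExecutor() as pool:
            score = pool.submit(assess_quality, stream)   # off the critical path
            output = transcode(stream)                    # always completes
            try:
                print("quality score:", score.result(timeout=1.0))
            except Exception:
                pass  # a network or model issue must not block the transcode
        return output
    ```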
  • Publication number: 20220207652
    Abstract: Described examples relate to an apparatus comprising one or more image sensors coupled to a vehicle and at least one processor. The at least one processor may be configured to capture, in a burst sequence using the one or more image sensors, multiple frames of an image of a scene, the multiple frames having respective, relative offsets of the image across the frames, and to perform super-resolution computations using the captured frames. The at least one processor may also be configured to accumulate, based on the super-resolution computations, color planes, and to combine the accumulated color planes to create a super-resolution image of the scene.
    Type: Application
    Filed: December 30, 2020
    Publication date: June 30, 2022
    Inventors: Ignacio Garcia-Dorado, Damien Kelly, Xiaoying He, Jia Feng, Bartlomiej Wronski, Peyman Milanfar, Lucian Ion
  • Publication number: 20210374909
    Abstract: The present disclosure describes systems and techniques directed to optical image stabilization movement to create a super-resolution image of a scene. The systems and techniques include a user device (102) introducing (502), through an optical image stabilization system (114), movement to one or more components of a camera system (112) of the user device (102). The user device (102) then captures (504) multiple frames (306) of an image of a scene, where the multiple frames (306) have respective, sub-pixel offsets of the image across the frames as a result of the introduced movement to the one or more components of the camera system (112). The user device (102) performs (506), based on the respective, sub-pixel offsets across the multiple frames (306), super-resolution computations and creates (508) the super-resolution image of the scene based on those computations.
    Type: Application
    Filed: August 6, 2019
    Publication date: December 2, 2021
    Applicant: Google LLC
    Inventors: Yi Hung Chen, Chia-Kai Liang, Bartlomiej Maciej Wronski, Peyman Milanfar, Ignacio Garcia Dorado