Patents by Inventor Peyman Milanfar
Peyman Milanfar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240331091
Abstract: The technology provides an image resizer that is jointly trainable with neural network classification (recognition) models, and is designed to improve classification performance. Systems and methods include applying an input image to a baseline resizer to obtain a default resized image, and applying the input image to a plurality of filters. Each respective filter in the plurality is configured to perform sub-band filtering on the input image to obtain a sub-band filtered result. This includes applying the sub-band filtered result to the baseline resizer to obtain a respective resized result, and also includes applying to the respective resized result a scaling parameter, a bias parameter, and a nonlinear function to obtain a respective filtered image. The process then combines the default resized image and the respective filtered images to generate a combined resized image.
Type: Application
Filed: March 29, 2024
Publication date: October 3, 2024
Inventors: Hossein Talebi, Zhengzhong Tu, Peyman Milanfar
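The combining pipeline in this abstract maps naturally to code. Below is a minimal Python sketch of the idea; the difference-of-Gaussians sub-band filters and the fixed scale/bias values are illustrative stand-ins for the learned components, which the patent trains jointly with the classifier.

import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def subband_filters(img, sigmas=(1.0, 2.0, 4.0)):
    # Difference-of-Gaussians band-pass filters (stand-ins for the learned filters).
    bands, prev = [], img
    for s in sigmas:
        blurred = gaussian_filter(img, s)
        bands.append(prev - blurred)
        prev = blurred
    return bands

def trainable_resizer(img, out_scale=0.5, scales=None, biases=None):
    # Baseline resizer: plain bilinear zoom yields the default resized image.
    default = zoom(img, out_scale, order=1)
    bands = subband_filters(img)
    scales = scales if scales is not None else [0.1] * len(bands)  # learnable in the patent
    biases = biases if biases is not None else [0.0] * len(bands)  # learnable in the patent
    out = default.copy()
    for band, a, b in zip(bands, scales, biases):
        resized = zoom(band, out_scale, order=1)   # resize each sub-band result
        out += np.tanh(a * resized + b)            # scale, bias, nonlinearity, then combine
    return out

img = np.random.rand(64, 64)
print(trainable_resizer(img).shape)  # (32, 32)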
-
Patent number: 12010440
Abstract: The present disclosure describes systems and techniques directed to optical image stabilization movement to create a super-resolution image of a scene. The systems and techniques include a user device (102) introducing (502), through an optical image stabilization system (114), movement to one or more components of a camera system (112) of the user device (102). The user device (102) then captures (504) respective and multiple frames (306) of an image of a scene, where the respective and multiple frames (306) of the image of the scene have respective, sub-pixel offsets of the image of the scene across the multiple frames (306) as a result of the introduced movement to the one or more components of the camera system (112). The user device (102) performs (506), based on the respective, sub-pixel offsets of the image of the scene across the respective, multiple frames (306), super-resolution computations and creates (508) the super-resolution image of the scene based on the super-resolution computations.
Type: Grant
Filed: March 15, 2023
Date of Patent: June 11, 2024
Assignee: Google LLC
Inventors: Yi Hung Chen, Chia-Kai Liang, Bartlomiej Maciej Wronski, Peyman Milanfar, Ignacio Garcia Dorado
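For intuition, here is a hedged Python sketch of the shift-and-add idea at the heart of multi-frame super-resolution: frames with known sub-pixel offsets (assumed given here, where the patent obtains them from deliberate OIS movement) are accumulated onto a finer grid. The actual pipeline's alignment and reconstruction are considerably more sophisticated.

import numpy as np

def shift_and_add(frames, offsets, factor=2):
    # frames: list of HxW arrays; offsets: per-frame (dy, dx) sub-pixel shifts.
    h, w = frames[0].shape
    acc = np.zeros((h * factor, w * factor))
    wgt = np.zeros_like(acc)
    ys, xs = np.mgrid[0:h, 0:w]
    for frame, (dy, dx) in zip(frames, offsets):
        # Map each low-res sample to its nearest cell on the high-res grid.
        hy = np.clip(np.round((ys + dy) * factor).astype(int), 0, h * factor - 1)
        hx = np.clip(np.round((xs + dx) * factor).astype(int), 0, w * factor - 1)
        np.add.at(acc, (hy, hx), frame)
        np.add.at(wgt, (hy, hx), 1.0)
    return acc / np.maximum(wgt, 1e-8)

frames = [np.random.rand(16, 16) for _ in range(4)]
offsets = [(0, 0), (0, 0.5), (0.5, 0), (0.5, 0.5)]
print(shift_and_add(frames, offsets).shape)  # (32, 32)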
-
Publication number: 20240187715
Abstract: An example embodiment may involve capturing a sequence of images, wherein there are 4 or more images in the sequence of images, and wherein each of the sequence of images has an exposure length of 4-100 seconds; applying a sliding window over the sequence of images as downsampled, wherein at least 4 images are encompassed within the sliding window, and wherein for each position of the sliding window the applying involves: (i) aligning a set of images within the sliding window, and (ii) merging the set of images as aligned into a video frame; combining video frames generated by way of the sliding window into a video file; and storing, by the mobile device, the video file in memory of the mobile device.
Type: Application
Filed: May 19, 2021
Publication date: June 6, 2024
Inventors: Ignacio Garcia Dorado, Shambhavi Punja, Peyman Milanfar, Kiran Murthy, Janne Kontkanen, Isaac Reynolds, Damien Kelly, Alexander Schiffhauer
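A minimal Python sketch of the sliding-window merge described above, assuming the alignment step (i) has already been applied; simple averaging stands in for the patent's merge operation.

import numpy as np

def sliding_window_merge(frames, window=4):
    # frames: list of pre-aligned HxW arrays (alignment itself is omitted here).
    merged = []
    for i in range(len(frames) - window + 1):
        stack = np.stack(frames[i:i + window])
        merged.append(stack.mean(axis=0))   # merge the aligned set into one video frame
    return np.stack(merged)                 # the frames combined into the output video

frames = [np.random.rand(32, 32) for _ in range(10)]
video = sliding_window_merge(frames)
print(video.shape)  # (7, 32, 32)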
-
Publication number: 20240169498
Abstract: Systems and methods for real-time image deblur and stabilization can utilize sensor data for estimating motion blur without the high computational cost of image analysis techniques. The estimated motion blur can then be utilized to generate a motion blur kernel for image correction. The systems and methods can further refine the correction by processing the motion blur kernel with a polynomial filter to generate a sharpening kernel. The systems and methods can provide for real-time correction even with minimal to no stabilization masking.
Type: Application
Filed: July 22, 2021
Publication date: May 23, 2024
Inventors: Fuhao Shi, Mauricio Delbracio, Chia-Kai Liang, Damien Martin Kelly, Peyman Milanfar
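The sketch below builds a linear motion-blur kernel from an assumed sensor-estimated blur length and angle, then sharpens with a polynomial of that kernel: a truncated Neumann-series approximation of the kernel's inverse, K^-1 ≈ 3I - 3K + K^2 (from 1/x ≈ 3 - 3x + x^2 near x = 1). The specific polynomial is an assumption; the abstract only says a polynomial filter is applied to the blur kernel.

import numpy as np
from scipy.ndimage import convolve

def linear_blur_kernel(length_px, angle_rad, size=15):
    # Rasterize a line through the kernel center from the estimated blur vector.
    k = np.zeros((size, size))
    c = size // 2
    n = max(int(np.ceil(length_px)), 1)
    for t in np.linspace(-length_px / 2, length_px / 2, 2 * n + 1):
        y = int(round(c + t * np.sin(angle_rad)))
        x = int(round(c + t * np.cos(angle_rad)))
        if 0 <= y < size and 0 <= x < size:
            k[y, x] += 1
    return k / k.sum()

def polynomial_sharpen(img, k):
    # Apply p(K) = 3I - 3K + K^2, an approximate inverse of the blur operator.
    b1 = convolve(img, k)
    b2 = convolve(b1, k)
    return 3 * img - 3 * b1 + b2

img = np.random.rand(64, 64)
k = linear_blur_kernel(length_px=5, angle_rad=0.3)
print(polynomial_sharpen(convolve(img, k), k).shape)  # (64, 64)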
-
Publication number: 20240119560
Abstract: Described examples relate to an apparatus comprising one or more image sensors coupled to a vehicle and at least one processor. The at least one processor may be configured to capture, in a burst sequence using the one or more image sensors, multiple frames of an image of a scene, the multiple frames having respective, relative offsets of the image across the multiple frames and perform super-resolution computations using the captured, multiple frames of the image of the scene. The at least one processor may also be configured to accumulate, based on the super-resolution computations, color planes and combine, using the one or more processors, the accumulated color planes to create a super-resolution image of the scene.
Type: Application
Filed: December 6, 2023
Publication date: April 11, 2024
Inventors: Ignacio Garcia Dorado, Damien Kelly, Xiaoying He, Jia Feng, Bartlomiej Wronski, Peyman Milanfar, Lucian Ion
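A brief Python sketch of the color-plane accumulation step in isolation: an RGGB Bayer mosaic is split into its color planes, which are accumulated across the burst and averaged. Per-frame offsets, robustness weighting, and demosaicing are deliberately omitted, so this only illustrates the accumulate-then-combine structure.

import numpy as np

def bayer_planes(raw):
    # Split an RGGB mosaic into its four color planes (R, G1, G2, B).
    return raw[0::2, 0::2], raw[0::2, 1::2], raw[1::2, 0::2], raw[1::2, 1::2]

def accumulate_planes(raw_frames):
    # Accumulate each color plane across the burst, then average.
    acc = None
    for raw in raw_frames:
        planes = np.stack(bayer_planes(raw))
        acc = planes if acc is None else acc + planes
    return acc / len(raw_frames)

frames = [np.random.rand(64, 64) for _ in range(8)]
r, g1, g2, b = accumulate_planes(frames)
print(r.shape)  # (32, 32)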
-
Publication number: 20240119555
Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints on image fixed input size and predicts the quality effectively on a native resolution image. A native resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multiscale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.
Type: Application
Filed: December 4, 2023
Publication date: April 11, 2024
Inventors: Junjie Ke, Feng Yang, Qifei Wang, Yilin Wang, Peyman Milanfar
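To make the token construction concrete, here is a hedged Python sketch: patches from each scale are projected, given a spatial embedding hashed to a shared fixed grid plus a per-scale embedding, and a classification token is prepended. Random matrices stand in for the learned projections and embedding tables; self-attention itself is not shown.

import numpy as np

def extract_patches(img, p=8):
    h, w = img.shape
    hp, wp = h // p, w // p
    patches, positions = [], []
    for i in range(hp):
        for j in range(wp):
            patches.append(img[i*p:(i+1)*p, j*p:(j+1)*p].ravel())
            positions.append((i / hp, j / wp))   # normalized patch location
    return np.array(patches), np.array(positions)

def multiscale_tokens(img, scales=(1.0, 0.5), grid=10, d=64):
    rng = np.random.default_rng(0)
    proj = rng.normal(size=(8 * 8, d)) * 0.02        # shared patch projection
    pos_table = rng.normal(size=(grid, grid, d))     # fixed spatial grid for hashing
    scale_table = rng.normal(size=(len(scales), d))  # one embedding per scale
    tokens = [rng.normal(size=d)]                    # "learnable" classification token
    for s_idx, s in enumerate(scales):
        small = img[::int(1 / s), ::int(1 / s)] if s < 1 else img
        patches, positions = extract_patches(small)
        for patch, (py, px) in zip(patches, positions):
            gy, gx = int(py * grid), int(px * grid)  # hash position to the same grid
            tokens.append(patch @ proj + pos_table[gy, gx] + scale_table[s_idx])
    return np.stack(tokens)

img = np.random.rand(64, 64)
print(multiscale_tokens(img).shape)  # (81, 64): CLS + 64 fine + 16 coarse tokens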
-
Patent number: 11887270
Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints on image fixed input size and predicts the quality effectively on a native resolution image. A native resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multiscale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.
Type: Grant
Filed: July 1, 2021
Date of Patent: January 30, 2024
Assignee: Google LLC
Inventors: Junjie Ke, Feng Yang, Qifei Wang, Yilin Wang, Peyman Milanfar
-
Patent number: 11880902
Abstract: Described examples relate to an apparatus comprising one or more image sensors coupled to a vehicle and at least one processor. The at least one processor may be configured to capture, in a burst sequence using the one or more image sensors, multiple frames of an image of a scene, the multiple frames having respective, relative offsets of the image across the multiple frames and perform super-resolution computations using the captured, multiple frames of the image of the scene. The at least one processor may also be configured to accumulate, based on the super-resolution computations, color planes and combine, using the one or more processors, the accumulated color planes to create a super-resolution image of the scene.
Type: Grant
Filed: December 30, 2020
Date of Patent: January 23, 2024
Assignee: Waymo LLC
Inventors: Ignacio Garcia-Dorado, Damien Kelly, Xiaoying He, Jia Feng, Bartlomiej Wronski, Peyman Milanfar, Lucian Ion
-
Publication number: 20240022760
Abstract: Example aspects of the present disclosure are directed to systems and methods which feature a machine-learned video super-resolution (VSR) model which has been trained using a bi-directional training approach. In particular, the present disclosure provides a compression-informed (e.g., compression-aware) super-resolution model that can perform well on real-world videos with different levels of compression. Specifically, example models described herein can include three modules to robustly restore the missing information caused by video compression. First, a bi-directional recurrent module can be used to reduce the accumulated warping error from the random locations of the intra-frame from compressed video frames. Second, a detail-aware flow estimation module can be added to enable recovery of high resolution (HR) flow from compressed low resolution (LR) frames. Finally, a Laplacian enhancement module can add high-frequency information to the warped HR frames washed out by video encoding.
Type: Application
Filed: August 5, 2021
Publication date: January 18, 2024
Inventors: Yinxiao Li, Peyman Milanfar, Feng Yang, Ce Liu, Ming-Hsuan Yang, Pengchong Jin
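Of the three modules, the Laplacian enhancement idea is the simplest to illustrate: re-inject a scaled high-frequency residual into a warped HR frame. The patent's module is learned; this unsharp-mask-style Python sketch is only an analogy to the underlying signal-processing idea.

import numpy as np
from scipy.ndimage import gaussian_filter

def laplacian_enhance(frame, alpha=0.5, sigma=1.5):
    # High-pass residual = frame - Gaussian(frame); add a scaled copy back
    # to restore high-frequency detail washed out by video encoding.
    low = gaussian_filter(frame, sigma)
    return frame + alpha * (frame - low)

frame = np.random.rand(64, 64)
print(laplacian_enhance(frame).shape)  # (64, 64)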
-
Publication number: 20240020788
Abstract: Systems and methods of the present disclosure are directed to a computing system. The computing system can obtain a message vector and video data comprising a plurality of video frames. The computing system can process the video data with a transformation portion of a machine-learned watermark encoding model to obtain a three-dimensional feature encoding of the video data. The computing system can process the three-dimensional feature encoding of the video data and the message vector with an embedding portion of the machine-learned watermark encoding model to obtain spatial-temporal watermark encoding data descriptive of the message vector. The computing system can generate encoded video data comprising a plurality of encoded video frames, wherein at least one of the plurality of encoded video frames includes the spatial-temporal watermark encoding data.
Type: Application
Filed: March 24, 2021
Publication date: January 18, 2024
Inventors: Xiyang Luo, Feng Yang, Ce Liu, Huiwen Chang, Peyman Milanfar, Yinxiao Li
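As a non-learned stand-in for the learned encoder described above, the Python sketch below embeds a message vector with spread-spectrum modulation: each bit modulates a pseudo-random spatio-temporal carrier added across the frames, and decoding correlates the residual against the same carriers. Unlike the patent's system, this toy decoder is non-blind (it uses the original video).

import numpy as np

def embed_message(video, bits, strength=0.02, seed=0):
    # video: (T, H, W) array; bits: message vector of +/-1 values.
    rng = np.random.default_rng(seed)
    out = video.copy()
    for b in bits:
        pattern = rng.standard_normal(video.shape)  # spatio-temporal carrier
        out += strength * b * pattern
    return out

def decode_message(watermarked, original, n_bits, seed=0):
    rng = np.random.default_rng(seed)               # regenerate the same carriers
    residual = watermarked - original
    return [1 if np.sum(residual * rng.standard_normal(watermarked.shape)) > 0 else -1
            for _ in range(n_bits)]

video = np.random.rand(8, 32, 32)
msg = [1, -1, 1, -1]
wm = embed_message(video, msg)
print(decode_message(wm, video, 4))  # [1, -1, 1, -1]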
-
Publication number: 20240013350
Abstract: Systems, apparatus, and methods are presented for deblurring images. One method includes receiving an image and estimating blur for the image. The method also includes applying a deblurring filter to the image and reducing halo from the image.
Type: Application
Filed: November 15, 2021
Publication date: January 11, 2024
Inventors: Mauricio DELBRACIO, Peyman MILANFAR
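The abstract does not say how halo is reduced; one common approach, shown below purely as an assumption, clamps the deblurred result to the local intensity range of the input, suppressing the edge over/undershoot that reads as halos.

import numpy as np
from scipy.ndimage import minimum_filter, maximum_filter

def reduce_halo(original, deblurred, win=3):
    # Clamp the deblurred image to the local [min, max] range of the input,
    # so sharpening cannot create new extrema (halos) along edges.
    lo = minimum_filter(original, size=win)
    hi = maximum_filter(original, size=win)
    return np.clip(deblurred, lo, hi)

img = np.random.rand(64, 64)
sharpened = img + 0.5 * (img - minimum_filter(img, size=3))  # crude over-sharpening
print(reduce_halo(img, sharpened).shape)  # (64, 64)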
-
Publication number: 20230319327
Abstract: Methods, systems, and media for determining perceptual quality indicators of video content items are provided.
Type: Application
Filed: June 8, 2022
Publication date: October 5, 2023
Inventors: Yilin Wang, Balineedu Adsumilli, Junjie Ke, Hossein Talebi, Joong Yim, Neil Birkbeck, Peyman Milanfar, Feng Yang
-
Publication number: 20230267307
Abstract: Systems and methods of the present disclosure are directed to a method for generating a machine-learned multitask model configured to perform tasks. The method can include obtaining a machine-learned multitask search model comprising candidate nodes. The method can include obtaining tasks and machine-learned task controller models associated with the tasks. As an example, for a task, the method can include using the task controller model to route a subset of the candidate nodes in a machine-learned task submodel for the corresponding task. The method can include inputting task input data to the task submodel to obtain a task output. The method can include generating, using the task output, a feedback value based on an objective function. The method can include adjusting parameters of the task controller model based on the feedback value.
Type: Application
Filed: July 23, 2020
Publication date: August 24, 2023
Inventors: Qifei Wang, Junjie Ke, Grace Chu, Gabriel Mintzer Bender, Luciano Sbaiz, Feng Yang, Andrew Gerald Howard, Alec Michael Go, Jeffrey M. Gilbert, Peyman Milanfar, Joshua William Charles Greaves
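A toy Python sketch of the controller loop, assuming a REINFORCE-style update (the abstract only says parameters are adjusted from a feedback value): the controller samples which candidate nodes to route into the task submodel, receives a reward, and shifts its logits accordingly. The reward function here is hypothetical, standing in for the objective evaluated on the task output.

import numpy as np

rng = np.random.default_rng(0)
n_nodes = 8
logits = np.zeros(n_nodes)            # task controller parameters for one task

def sample_route(logits):
    # Independently keep/skip each candidate node to form the task submodel.
    p = 1 / (1 + np.exp(-logits))
    return rng.random(n_nodes) < p, p

def update(logits, route, p, reward, lr=0.1):
    # REINFORCE: grad of Bernoulli log-prob is (route - p), scaled by reward.
    return logits + lr * (route.astype(float) - p) * reward

for step in range(2000):
    route, p = sample_route(logits)
    # Hypothetical feedback value: reward routes using exactly nodes 0-2.
    reward = float(np.all(route[:3]) and not np.any(route[3:]))
    logits = update(logits, route, p, reward)

print((1 / (1 + np.exp(-logits))).round(2))  # high p for nodes 0-2, low elsewhere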
-
Publication number: 20230224596
Abstract: The present disclosure describes systems and techniques directed to optical image stabilization movement to create a super-resolution image of a scene. The systems and techniques include a user device (102) introducing (502), through an optical image stabilization system (114), movement to one or more components of a camera system (112) of the user device (102). The user device (102) then captures (504) respective and multiple frames (306) of an image of a scene, where the respective and multiple frames (306) of the image of the scene have respective, sub-pixel offsets of the image of the scene across the multiple frames (306) as a result of the introduced movement to the one or more components of the camera system (112). The user device (102) performs (506), based on the respective, sub-pixel offsets of the image of the scene across the respective, multiple frames (306), super-resolution computations and creates (508) the super-resolution image of the scene based on the super-resolution computations.
Type: Application
Filed: March 15, 2023
Publication date: July 13, 2023
Inventors: Yi Hung Chen, Chia-Kai Liang, Bartlomiej Maciej Wronski, Peyman Milanfar, Ignacio Garcia Dorado
-
Publication number: 20230222623
Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints on image fixed input size and predicts the quality effectively on a native resolution image. A native resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multiscale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.
Type: Application
Filed: July 1, 2021
Publication date: July 13, 2023
Inventors: Junjie Ke, Feng Yang, Qifei Wang, Yilin Wang, Peyman Milanfar
-
Publication number: 20230111326
Abstract: Methods, systems, and computer programs encoded on a computer storage medium that relate to extracting digital watermarks from images, irrespective of distortions introduced into these images. Methods can include inputting a first data item into a channel encoder that can generate a first encoded data item that is greater in length than the first data item and that (1) includes the input data item and (2) new data that is redundant of the input data item. Based on the first encoded data item and a first image, an encoder model can generate a first encoded image into which the first encoded data item is embedded as a digital watermark. A decoder model can decode the first encoded data item to generate a second data item, which can be decoded by the channel decoder to generate data that is predicted to be the first data item.
Type: Application
Filed: January 13, 2020
Publication date: April 13, 2023
Inventors: Ruohan Zhan, Feng Yang, Xiyang Luo, Peyman Milanfar, Huiwen Chang, Ce Liu
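The channel-coding layer can be illustrated with the simplest possible code, a repetition code: the encoded item is longer than the input and carries redundant data, so a bit corrupted by image distortion is still recovered by majority vote. The patent wraps this between learned image encoder/decoder models, which are omitted from this Python sketch.

import numpy as np

def channel_encode(bits, r=3):
    # The encoded item contains the input bits plus redundant copies.
    return np.repeat(bits, r)

def channel_decode(coded, r=3):
    # Majority vote over each group of r repeated bits.
    return (coded.reshape(-1, r).mean(axis=1) > 0.5).astype(int)

bits = np.array([1, 0, 1, 1])
coded = channel_encode(bits)
noisy = coded.copy()
noisy[2] ^= 1                      # a distortion flips one embedded bit
print(channel_decode(noisy))       # [1 0 1 1], recovered despite the corruption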
-
Patent number: 11611697
Abstract: The present disclosure describes systems and techniques directed to optical image stabilization movement to create a super-resolution image of a scene. The systems and techniques include a user device (102) introducing (502), through an optical image stabilization system (114), movement to one or more components of a camera system (112) of the user device (102). The user device (102) then captures (504) respective and multiple frames (306) of an image of a scene, where the respective and multiple frames (306) of the image of the scene have respective, sub-pixel offsets of the image of the scene across the multiple frames (306) as a result of the introduced movement to the one or more components of the camera system (112). The user device (102) performs (506), based on the respective, sub-pixel offsets of the image of the scene across the respective, multiple frames (306), super-resolution computations and creates (508) the super-resolution image of the scene based on the super-resolution computations.
Type: Grant
Filed: August 6, 2019
Date of Patent: March 21, 2023
Assignee: Google LLC
Inventors: Yi Hung Chen, Chia-Kai Liang, Bartlomiej Maciej Wronski, Peyman Milanfar, Ignacio Garcia Dorado
-
Publication number: 20220415039
Abstract: A trained model is retrained for video quality assessment and used to identify sets of adaptive compression parameters for transcoding user generated video content. Using transfer learning, the model, which is initially trained for image object detection, is retrained for technical content assessment and then again retrained for video quality assessment. The model is then deployed into a transcoding pipeline and used for transcoding an input video stream of user generated content. The transcoding pipeline may be structured in one of several ways. In one example, a secondary pathway for video content analysis using the model is introduced into the pipeline, which does not interfere with the ultimate output of the transcoding should there be a network or other issue. In another example, the model is introduced as a library within the existing pipeline, which would maintain a single pathway, but ultimately is not expected to introduce significant latency.
Type: Application
Filed: November 26, 2019
Publication date: December 29, 2022
Inventors: Yilin Wang, Hossein Talebi, Peyman Milanfar, Feng Yang, Balineedu Adsumilli
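A generic transfer-learning sketch in Python/PyTorch of the retraining pattern described above: freeze a backbone originally trained for another vision task and train a new head for quality regression. The model choice, weights, and training details here are assumptions for illustration, not the patent's pipeline.

import torch
import torch.nn as nn
from torchvision.models import resnet50

# Start from a backbone trained for a different vision task; in practice
# pretrained weights would be loaded rather than weights=None.
backbone = resnet50(weights=None)
for p in backbone.parameters():
    p.requires_grad = False          # freeze the generic visual features

# Replace the classification head with a quality-regression head and
# train only its parameters on the new objective.
backbone.fc = nn.Linear(backbone.fc.in_features, 1)
optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-4)

frames = torch.randn(4, 3, 224, 224)         # a batch of video frames
scores = backbone(frames).squeeze(1)         # per-frame quality estimates
loss = nn.functional.mse_loss(scores, torch.rand(4))  # dummy quality labels
loss.backward()
optimizer.step()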
-
Publication number: 20220207652
Abstract: Described examples relate to an apparatus comprising one or more image sensors coupled to a vehicle and at least one processor. The at least one processor may be configured to capture, in a burst sequence using the one or more image sensors, multiple frames of an image of a scene, the multiple frames having respective, relative offsets of the image across the multiple frames and perform super-resolution computations using the captured, multiple frames of the image of the scene. The at least one processor may also be configured to accumulate, based on the super-resolution computations, color planes and combine, using the one or more processors, the accumulated color planes to create a super-resolution image of the scene.
Type: Application
Filed: December 30, 2020
Publication date: June 30, 2022
Inventors: Ignacio Garcia-Dorado, Damien Kelly, Xiaoying He, Jia Feng, Bartlomiej Wronski, Peyman Milanfar, Lucian Ion
-
Publication number: 20210374909
Abstract: The present disclosure describes systems and techniques directed to optical image stabilization movement to create a super-resolution image of a scene. The systems and techniques include a user device (102) introducing (502), through an optical image stabilization system (114), movement to one or more components of a camera system (112) of the user device (102). The user device (102) then captures (504) respective and multiple frames (306) of an image of a scene, where the respective and multiple frames (306) of the image of the scene have respective, sub-pixel offsets of the image of the scene across the multiple frames (306) as a result of the introduced movement to the one or more components of the camera system (112). The user device (102) performs (506), based on the respective, sub-pixel offsets of the image of the scene across the respective, multiple frames (306), super-resolution computations and creates (508) the super-resolution image of the scene based on the super-resolution computations.
Type: Application
Filed: August 6, 2019
Publication date: December 2, 2021
Applicant: Google LLC
Inventors: Yi Hung Chen, Chia-Kai Liang, Bartlomiej Maciej Wronski, Peyman Milanfar, Ignacio Garcia Dorado