Patents by Inventor Yilin Wang
Yilin Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250124537
Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints of a fixed input image size and predicts quality effectively on a native-resolution image. A native-resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multi-scale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.
Type: Application
Filed: December 23, 2024
Publication date: April 17, 2025
Inventors: Junjie Ke, Feng Yang, Qifei Wang, Yilin Wang, Peyman Milanfar
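The hashed spatial embedding described in this abstract can be illustrated with a small sketch. The grid size, function name, and index arithmetic below are illustrative assumptions, not details taken from the patent: the point is only that patch grids of any resolution map onto one fixed embedding table.

```python
def hash_patch_positions(num_rows, num_cols, grid_size=10):
    """Map each patch's (row, col) position at one scale onto a fixed
    grid_size x grid_size grid, so patches from images or scales of any
    resolution share a single spatial-embedding table."""
    positions = {}
    for r in range(num_rows):
        for c in range(num_cols):
            # Scale the fractional position into the fixed grid.
            gr = min(int(r * grid_size / num_rows), grid_size - 1)
            gc = min(int(c * grid_size / num_cols), grid_size - 1)
            positions[(r, c)] = gr * grid_size + gc  # embedding index
    return positions

# A 4x6 patch grid and a 16x24 patch grid both index the same 10x10 table,
# so corresponding locations at different scales hash near each other.
coarse = hash_patch_positions(4, 6)
fine = hash_patch_positions(16, 24)
```

Under this scheme every embedding index stays in `[0, grid_size**2)` regardless of the input resolution, which is what removes the fixed-input-size constraint.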
-
Patent number: 12273521
Abstract: A training dataset that includes a first dataset and a second dataset is received. The first dataset includes a first subset of first videos corresponding to a first context and respective first ground truth quality scores of the first videos, and the second dataset includes a second subset of second videos corresponding to a second context and respective second ground truth quality scores of the second videos. A machine learning model is trained to predict the respective first ground truth quality scores and the respective second ground truth quality scores. Training the model includes training it to obtain a global quality score for one of the videos and training it to map the global quality score to context-dependent predicted quality scores. The context-dependent predicted quality scores include a first context-dependent predicted quality score corresponding to the first context and a second context-dependent predicted quality score corresponding to the second context.
Type: Grant
Filed: July 12, 2022
Date of Patent: April 8, 2025
Assignee: Google LLC
Inventors: Yilin Wang, Balineedu Adsumilli
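The mapping from one global quality score to context-dependent scores can be sketched minimally. The per-context affine heads, context names, and parameter values below are illustrative assumptions standing in for the learned mapping, not the patented training procedure.

```python
def context_scores(global_score, context_params):
    """Map one global quality score to per-context predicted scores via a
    simple affine head per context (a stand-in for a learned mapping)."""
    return {ctx: a * global_score + b for ctx, (a, b) in context_params.items()}

# Hypothetical contexts with hypothetical learned (slope, offset) parameters.
params = {"ugc": (0.9, 0.05), "gaming": (1.1, -0.1)}
scores = context_scores(0.8, params)
```

The design point is that a single shared backbone produces the global score, while only the small context heads differ per viewing context.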
-
Patent number: 12260557
Abstract: An image processing system generates an image mask from an image. The image is processed by an object detector to identify a region having an object, and the region is classified based on an object type of the object. A masking pipeline is selected from a number of masking pipelines based on the classification of the region. The region is processed using the masking pipeline to generate a region mask. An image mask for the image is generated using the region mask.
Type: Grant
Filed: June 13, 2022
Date of Patent: March 25, 2025
Assignee: Adobe Inc.
Inventors: Zijun Wei, Yilin Wang, Jianming Zhang, He Zhang
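The class-based routing of regions to masking pipelines can be sketched as follows. The pipeline names, the toy box-fill pipeline, and the union rule for combining region masks are illustrative assumptions, not details from the patent.

```python
def build_image_mask(regions, pipelines):
    """Route each detected region to a masking pipeline chosen by the
    region's object class, then combine the per-region masks into one
    image mask (here, by set union of mask pixels)."""
    image_mask = set()
    for region in regions:
        pipeline = pipelines.get(region["class"], pipelines["default"])
        image_mask |= pipeline(region)
    return image_mask

def box_fill(region):
    """Toy pipeline: return all (row, col) pixels inside the region box."""
    r0, c0, r1, c1 = region["box"]
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}

pipelines = {"person": box_fill, "default": box_fill}
regions = [{"class": "person", "box": (0, 0, 2, 2)},
           {"class": "car", "box": (1, 1, 3, 3)}]
mask = build_image_mask(regions, pipelines)
```

In practice each class would map to a genuinely different pipeline (e.g. one tuned for hair or fur boundaries versus hard edges); the dictionary dispatch is the part the abstract describes.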
-
Patent number: 12250383
Abstract: Video streams uploaded to a video hosting platform are transcoded using quality-normalized transcoding parameters dynamically selected using a learning model. Video frames of a video stream are processed using the learning model to determine bitrate and quality score pairs for some or all possible transcoding resolutions. The listing of bitrate and quality score pairs determined for each resolution is processed to determine a set of transcoding parameters for transcoding the video stream into each resolution. The bitrate and quality score pairs of a given listing may be processed using one or more predefined thresholds, which may, in some cases, refer to a weighted distribution of resolutions according to watch times of videos of the video hosting platform. The video stream is then transcoded into the various resolutions using the set of transcoding parameters selected for each resolution.
Type: Grant
Filed: May 19, 2020
Date of Patent: March 11, 2025
Assignee: Google LLC
Inventors: Yilin Wang, Balineedu Adsumilli
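Selecting transcoding parameters from per-resolution bitrate/quality pairs can be sketched with a simple threshold rule. The function name, the single quality threshold, and the fallback rule are illustrative assumptions; the patent describes a richer scheme (e.g. watch-time-weighted thresholds).

```python
def select_transcoding_bitrates(rd_points, quality_threshold):
    """Given per-resolution lists of (bitrate_kbps, quality_score) pairs
    predicted by a model, pick for each resolution the lowest bitrate whose
    predicted quality meets the threshold, falling back to the highest
    available quality when none does."""
    selected = {}
    for resolution, pairs in rd_points.items():
        feasible = [(b, q) for b, q in pairs if q >= quality_threshold]
        if feasible:
            selected[resolution] = min(feasible)[0]  # lowest feasible bitrate
        else:
            selected[resolution] = max(pairs, key=lambda p: p[1])[0]
    return selected

# Hypothetical model outputs for two resolutions.
rd = {
    "1080p": [(1500, 0.78), (3000, 0.90), (6000, 0.96)],
    "720p":  [(800, 0.70), (1600, 0.86), (3200, 0.93)],
}
targets = select_transcoding_bitrates(rd, 0.85)  # {'1080p': 3000, '720p': 1600}
```

This is the "quality-normalized" idea in miniature: every output rendition targets a common perceptual quality level rather than a fixed bitrate ladder.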
-
Publication number: 20250071299
Abstract: Encoding using media compression and processing for machine-learning-based quality metrics includes generating encoded frame data by encoding a current frame from an input video using a neural-network-based video quality model. This includes identifying optimal encoding parameters for encoding a current block, wherein the optimal encoding parameters minimize a rate-distortion optimization cost function using a gradient value for the current block obtained from a gradient map that the neural-network-based video quality model generates for the current frame. A restoration-filtered reconstructed frame is then obtained by restoration filtering a reconstructed frame, obtained by decoding the encoded frame data, using the gradient map generated for the reconstructed frame.
Type: Application
Filed: August 24, 2023
Publication date: February 27, 2025
Inventors: Yao-Chung Lin, Jingning Han, Yilin Wang, Yeping Su
-
Patent number: 12230024
Abstract: A trained model is retrained for video quality assessment and used to identify sets of adaptive compression parameters for transcoding user generated video content. Using transfer learning, the model, which is initially trained for image object detection, is retrained for technical content assessment and then again retrained for video quality assessment. The model is then deployed into a transcoding pipeline and used for transcoding an input video stream of user generated content. The transcoding pipeline may be structured in one of several ways. In one example, a secondary pathway for video content analysis using the model is introduced into the pipeline, which does not interfere with the ultimate output of the transcoding should there be a network or other issue. In another example, the model is introduced as a library within the existing pipeline, which would maintain a single pathway, but ultimately is not expected to introduce significant latency.
Type: Grant
Filed: November 26, 2019
Date of Patent: February 18, 2025
Assignee: Google LLC
Inventors: Yilin Wang, Hossein Talebi, Peyman Milanfar, Feng Yang, Balineedu Adsumilli
-
Patent number: 12223439
Abstract: Systems and methods for multi-modal representation learning are described. One or more embodiments provide a visual representation learning system trained using machine learning techniques. For example, some embodiments of the visual representation learning system are trained using cross-modal training tasks including a combination of intra-modal and inter-modal similarity preservation objectives. In some examples, the training tasks are based on contrastive learning techniques.
Type: Grant
Filed: March 3, 2021
Date of Patent: February 11, 2025
Assignee: Adobe Inc.
Inventors: Xin Yuan, Zhe Lin, Jason Wen Yong Kuen, Jianming Zhang, Yilin Wang, Ajinkya Kale, Baldo Faieta
-
Patent number: 12217382
Abstract: The technology employs a patch-based multi-scale Transformer (300) that is usable with various imaging applications. This avoids constraints of a fixed input image size and predicts quality effectively on a native-resolution image. A native-resolution image (304) is transformed into a multi-scale representation (302), enabling the Transformer's self-attention mechanism to capture information on both fine-grained detailed patches and coarse-grained global patches. Spatial embedding (316) is employed to map patch positions to a fixed grid, in which patch locations at each scale are hashed to the same grid. A separate scale embedding (318) is employed to distinguish patches coming from different scales in the multi-scale representation. Self-attention (508) is performed to create a final image representation. In some instances, prior to performing self-attention, the system may prepend a learnable classification token (322) to the set of input tokens.
Type: Grant
Filed: December 4, 2023
Date of Patent: February 4, 2025
Assignee: Google LLC
Inventors: Junjie Ke, Feng Yang, Qifei Wang, Yilin Wang, Peyman Milanfar
-
Patent number: 12206914
Abstract: Methods, systems, and media for determining perceptual quality indicators of video content items are provided.
Type: Grant
Filed: June 8, 2022
Date of Patent: January 21, 2025
Assignee: Google LLC
Inventors: Yilin Wang, Balineedu Adsumilli, Junjie Ke, Hossein Talebi, Joong Yim, Neil Birkbeck, Peyman Milanfar, Feng Yang
-
Publication number: 20240422369
Abstract: A method for generating, for a video stream of a first spatial resolution and a first temporal resolution, a first reduced-quality stream of a second spatial resolution and a second reduced-quality stream of a second temporal resolution. A first subset of STPs is sampled from the first reduced-quality stream, and a second subset of STPs is sampled from the second reduced-quality stream. Using a machine learning model (MLM), the STPs are processed to identify a quality score for each of the quality-representative STPs, which are representative of a quality of the video stream. One or more quality-improving actions for the video stream are identified using the quality scores of the quality-representative STPs.
Type: Application
Filed: June 16, 2023
Publication date: December 19, 2024
Inventors: Yilin Wang, Miao Yin, Qifei Wang, Boqing Gong, Neil Aylon Charles Birkbeck, Balineedu Chowdary Adsumilli
-
Publication number: 20240413702
Abstract: The present disclosure relates to a connector for a motor, a motor, and a vehicular compressor. The connector has a substrate and a plurality of stator terminals disposed on the substrate. The connector is further disposed with an insulation adhesive receiving part having a body detachably secured on the substrate; an opening for at least two stator terminals of the plurality of stator terminals to pass through; and a panel extending outwardly from a side of the body away from the substrate and circumferentially disposed along the opening for enclosing a space to receive the insulation adhesive. A first sealing portion is disposed on a side of the body proximate to the substrate, and a second sealing portion is disposed at a corresponding location of the substrate. The first sealing portion and the second sealing portion are sealed, press-fit and circumferentially disposed along the opening.
Type: Application
Filed: June 6, 2024
Publication date: December 12, 2024
Inventors: Guofu He, Remind Wan, Yilin Wang, Carsten Vollmer
-
Patent number: 12165292
Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for utilizing a plurality of neural networks in a multi-branch pipeline to generate image masks for digital images. Specifically, the disclosed system can classify a digital image as a portrait or a non-portrait image. Based on classifying a portrait image, the disclosed system can utilize separate neural networks to generate a first mask portion for a portion of the digital image including a defined boundary region and a second mask portion for a portion of the digital image including a blended boundary region. The disclosed system can generate the mask portion for the blended boundary region by utilizing a trimap generation neural network to automatically generate a trimap segmentation including the blended boundary region. The disclosed system can then merge the first mask portion and the second mask portion to generate an image mask for the digital image.
Type: Grant
Filed: May 15, 2023
Date of Patent: December 10, 2024
Assignee: Adobe Inc.
Inventors: He Zhang, Seyed Morteza Safdarnejad, Yilin Wang, Zijun Wei, Jianming Zhang, Salil Tambe, Brian Price
-
Publication number: 20240404188
Abstract: In accordance with the described techniques, a portrait relighting system receives user input defining one or more markings drawn on a portrait image. Using one or more machine learning models, the portrait relighting system generates an albedo representation of the portrait image by removing lighting effects from the portrait image. Further, the portrait relighting system generates a shading map of the portrait image using the one or more machine learning models by designating the one or more markings as a lighting condition, and applying the lighting condition to a geometric representation of the portrait image. The one or more machine learning models are further employed to generate a relit portrait image based on the albedo representation and the shading map.
Type: Application
Filed: June 2, 2023
Publication date: December 5, 2024
Applicant: Adobe Inc.
Inventors: He Zhang, Zijun Wei, Zhixin Shu, Yiqun Mei, Yilin Wang, Xuaner Zhang, Shi Yan, Jianming Zhang
-
Patent number: 12148074
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for accurately and flexibly generating harmonized digital images utilizing an object-to-object harmonization neural network. For example, the disclosed systems implement, and learn parameters for, an object-to-object harmonization neural network to combine a style code from a reference object with features extracted from a target object. Indeed, the disclosed systems extract a style code from a reference object utilizing a style encoder neural network. In addition, the disclosed systems generate a harmonized target object by applying the style code of the reference object to a target object utilizing an object-to-object harmonization neural network.
Type: Grant
Filed: October 18, 2021
Date of Patent: November 19, 2024
Assignee: Adobe Inc.
Inventors: He Zhang, Jeya Maria Jose Valanarasu, Jianming Zhang, Jose Ignacio Echevarria Vallespi, Kalyan Sunkavalli, Yilin Wang, Yinglan Ma, Zhe Lin, Zijun Wei
-
Patent number: 12079725
Abstract: In some embodiments, an application receives a request to execute a convolutional neural network model. The application determines the computational complexity requirement for the neural network based on the computing resources available on the device. The application further determines the architecture of the convolutional neural network model by determining the locations of down-sampling layers within the model based on the computational complexity requirement. The application reconfigures the architecture of the convolutional neural network model by moving the down-sampling layers to the determined locations and executes the model to generate output results.
Type: Grant
Filed: January 24, 2020
Date of Patent: September 3, 2024
Assignee: Adobe Inc.
Inventors: Zhe Lin, Yilin Wang, Siyuan Qiao, Jianming Zhang
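Why moving down-sampling layers changes computational cost can be shown with a back-of-the-envelope sketch. The FLOP formula, layer count, and sizes below are illustrative assumptions, not the patented reconfiguration procedure.

```python
def conv_flops(height, width, channels, kernel=3):
    """Approximate multiply-adds for one same-channel 3x3 conv layer."""
    return height * width * channels * channels * kernel * kernel

def cost_for_downsample_positions(positions, num_layers=8, size=64, channels=32):
    """Total approximate cost of a conv stack when spatial resolution is
    halved just before each layer index in `positions`. Moving the
    down-sampling earlier shrinks every subsequent layer's cost."""
    h = w = size
    total = 0
    for i in range(num_layers):
        if i in positions:
            h //= 2
            w //= 2
        total += conv_flops(h, w, channels)
    return total

# Down-sampling early yields a much cheaper network than down-sampling late,
# which is the lever the abstract describes for matching device resources.
early = cost_for_downsample_positions({1, 2})
late = cost_for_downsample_positions({6, 7})
```

An application could therefore search over `positions` until the estimated cost fits the device's budget, trading accuracy (later down-sampling) against speed (earlier down-sampling).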
-
Publication number: 20240213844
Abstract: An electrical motor stator and a compressor include a stator assembly and a contact plate. The stator assembly includes an iron core; a winding having wires; a first insulating seat; and a plurality of first electrical contacts fixed to the first insulating seat. The contact plate includes a body and a plurality of sleeve seats having a first opening, a second opening, and a cavity extending between the first opening and the second opening. The sleeve seats are positioned to correspond to the first electrical contacts and, through the first opening, at least partially accommodate the first electrical contacts within the cavity. A plurality of second electrical contacts includes a first connecting portion and a second connecting portion, with the first connecting portion disposed within the cavity and in contact with the first electrical contacts, establishing an electrical connection. The second connecting portion is exposed on a side of the body.
Type: Application
Filed: December 22, 2023
Publication date: June 27, 2024
Inventors: Guofu He, Zhengmao Wan, Yilin Wang, Fan Cheng, Jean-Marc Ritt, Stephan Kohler
-
Patent number: 12020400
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that upsample and refine segmentation masks. Indeed, in one or more implementations, a segmentation mask refinement and upsampling system upsamples a preliminary segmentation mask utilizing a patch-based refinement process to generate a patch-based refined segmentation mask. The segmentation mask refinement and upsampling system then fuses the patch-based refined segmentation mask with an upsampled version of the preliminary segmentation mask. By fusing the patch-based refined segmentation mask with the upsampled preliminary segmentation mask, the system maintains a global perspective and helps avoid artifacts due to the local patch-based refinement process.
Type: Grant
Filed: January 26, 2022
Date of Patent: June 25, 2024
Assignee: Adobe Inc.
Inventors: Chih-Yao Hsieh, Yilin Wang
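The fusion step described above can be sketched with a simple per-pixel rule. The uncertainty heuristic and the band threshold are illustrative assumptions, not the patented fusion method; masks are modeled as flat lists of probabilities for brevity.

```python
def fuse_masks(refined, upsampled, boundary_band=0.25):
    """Fuse a locally refined segmentation mask with a globally upsampled
    one: trust the patch-based refinement at uncertain (boundary-like)
    pixels and keep the upsampled mask's global prediction elsewhere."""
    fused = []
    for r, u in zip(refined, upsampled):
        uncertainty = min(u, 1.0 - u)  # far from 0/1 => boundary-like pixel
        fused.append(r if uncertainty > boundary_band else u)
    return fused

# Interior/background pixels keep the global mask; the ambiguous middle
# pixel takes the patch-refined value.
fused = fuse_masks([0.0, 0.8, 1.0], [0.1, 0.5, 0.95])
```

This captures the abstract's rationale: local refinement sharpens boundaries, while deferring to the upsampled mask elsewhere preserves the global perspective and suppresses patch artifacts.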
-
Publication number: 20240185393
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for accurately, efficiently, and flexibly generating harmonized digital images utilizing a self-supervised image harmonization neural network. In particular, the disclosed systems can implement, and learn parameters for, a self-supervised image harmonization neural network to extract content from one digital image (disentangled from its appearance) and appearance from another digital image (disentangled from its content). For example, the disclosed systems can utilize a dual data augmentation method to generate diverse triplets for parameter learning (including input digital images, reference digital images, and pseudo ground truth digital images), via cropping a digital image with perturbations using three-dimensional color lookup tables ("LUTs").
Type: Application
Filed: February 13, 2024
Publication date: June 6, 2024
Inventors: He Zhang, Yifan Jiang, Yilin Wang, Jianming Zhang, Kalyan Sunkavalli, Sarah Kong, Su Chen, Sohrab Amirghodsi, Zhe Lin
-
Publication number: 20240169541
Abstract: Systems and methods for instance segmentation are described. Embodiments include identifying an input image comprising an object that includes a visible region and an occluded region that is concealed in the input image. A mask network generates an instance mask for the input image that indicates the visible region of the object. A diffusion model then generates a segmentation mask for the input image based on the instance mask. The segmentation mask indicates a completed region of the object that includes the visible region and the occluded region.
Type: Application
Filed: November 18, 2022
Publication date: May 23, 2024
Inventors: Jianming Zhang, Qing Liu, Yilin Wang, Zhe Lin, Bowen Zhang
-
Publication number: 20240161478
Abstract: Disclosed are a multimodal weakly-supervised three-dimensional (3D) object detection method and system, and a device. The method includes: shooting multiple two-dimensional (2D) red, green and blue (RGB) images with a camera, acquiring ground points with a vehicle LiDAR sensor, and generating a 3D frustum based on 2D box labels on each of the 2D RGB images; filtering ground points in the 3D frustum and selecting the region with the most 3D points; generating a 3D pseudo-labeling bounding box of an object according to that region; training a multimodal superpixel dual-branch network with the 3D pseudo-labeling bounding boxes as labels and the 2D RGB image and the 3D point cloud as inputs; and inputting a 2D RGB image of a current frame and a 3D point cloud of a current scenario to the trained multimodal superpixel dual-branch network to generate an overall 3D point cloud.
Type: Application
Filed: April 3, 2023
Publication date: May 16, 2024
Inventors: Huimin MA, Haizhuang LIU, Yilin WANG, Rongquan WANG