Patents by Inventor Scott Labrozzi
Scott Labrozzi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12634488
Abstract: A system includes a tunable neural network-based video encoder configured to receive a video sequence including multiple video frames, generate a frame-specific embedding of a first video frame of the multiple video frames, and identify one or more group-of-pictures (GOP) features of a subset of the multiple video frames, the subset including the first video frame. The tunable neural network-based video encoder is further configured to combine the frame-specific embedding of the first video frame and the one or more GOP features of the subset of the multiple video frames to provide a latent feature corresponding to a compressed version of the first video frame.
Type: Grant
Filed: October 18, 2024
Date of Patent: May 19, 2026
Assignees: Disney Enterprises, Inc., ETH ZÜRICH (EIDGENÖSSISCHE TECHNISCHE HOCHSCHULE ZÜRICH)
Inventors: Roberto Gerson de Albuquerque Azevedo, Christopher Richard Schroers, Scott Labrozzi, Jens Eirik Saethre, Yuanyi Xue
-
Patent number: 12621502
Abstract: A system processing hardware executes a machine learning (ML) model-based video compression encoder to receive uncompressed video content and corresponding motion compensated video content, compare the uncompressed and motion compensated video content to identify an image space residual, transform the image space residual to a latent space representation of the uncompressed video content, and transform, using a trained image compression ML model, the motion compensated video content to a latent space representation of the motion compensated video content.
Type: Grant
Filed: September 13, 2024
Date of Patent: May 5, 2026
Assignees: Disney Enterprises, Inc., ETH Zürich (EIDGENÖSSISCHE TECHNISCHE HOCHSCHULE ZÜRICH)
Inventors: Abdelaziz Djelouah, Leonhard Markus Helminger, Roberto Gerson de Albuquerque Azevedo, Scott Labrozzi, Christopher Richard Schroers, Yuanyi Xue
-
Patent number: 12621506
Abstract: In some embodiments, a method analyzes flagged locations from a plurality of locations in an encoding of a video to form a cluster of locations. Draft micro-chunk boundaries for the cluster are determined based on searching for a first start location and a first end location in the encoding. The method searches in a first search range before the first start location and a second search range after the first end location for a second start location in the first search range and a second end location in the second search range. The second start location and the second end location form a micro-chunk. An encoding parameter set is determined for the micro-chunk formed by the second start location and the second end location based on content characteristics of the micro-chunk. The method uses the encoding parameter set to encode the micro-chunk for insertion in the encoding of the video.
Type: Grant
Filed: September 25, 2023
Date of Patent: May 5, 2026
Assignees: Disney Enterprises, Inc., Beijing YoJaJa Software Technology Development Co., Ltd.
Inventors: Yuanyi Xue, Roberto Gerson De Albuquerque Azevedo, Christopher Richard Schroers, Scott Labrozzi, Wenhao Zhang
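The clustering and boundary search described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the gap threshold `max_gap`, the list of candidate `cut_points` (for example, scene cuts), and both function names are hypothetical assumptions introduced only for illustration.

```python
def cluster_locations(flagged, max_gap):
    """Group flagged frame indices; a new cluster starts when the gap
    to the previous flagged frame exceeds max_gap (assumed rule)."""
    clusters = []
    for loc in sorted(flagged):
        if clusters and loc - clusters[-1][-1] <= max_gap:
            clusters[-1].append(loc)
        else:
            clusters.append([loc])
    return clusters


def refine_boundaries(cluster, cut_points, search_range):
    """Take the cluster's first and last frames as draft boundaries, then
    look for a candidate cut point within search_range before the draft
    start and after the draft end. Falls back to the draft boundary when
    no candidate is found."""
    draft_start, draft_end = cluster[0], cluster[-1]
    before = [c for c in cut_points
              if draft_start - search_range <= c <= draft_start]
    after = [c for c in cut_points
             if draft_end <= c <= draft_end + search_range]
    start = max(before) if before else draft_start  # nearest cut before
    end = min(after) if after else draft_end        # nearest cut after
    return start, end


clusters = cluster_locations([100, 104, 110, 400, 405], max_gap=20)
chunks = [refine_boundaries(c, [90, 130, 390, 420], search_range=30)
          for c in clusters]
```

Each resulting `(start, end)` pair would then receive its own encoding parameter set; how that set is chosen is outside the scope of this sketch.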
-
Publication number: 20260095599
Abstract: In some embodiments, a method receives an image and encodes the image into a latent representation in a latent space. A quantization process is performed on the latent representation to generate a quantized latent representation. The quantization process is based on a uniform noise. The method transmits the quantized latent representation to a receiver. An inverse quantization process is performed to generate a reconstructed latent representation via a diffusion model that performs a denoising process for a number of iterations based on a time step t to remove noise from the reconstructed latent representation. The diffusion model is trained to perform denoising using the uniform noise.
Type: Application
Filed: February 18, 2025
Publication date: April 2, 2026
Applicants: Disney Enterprises, Inc., ETH Zürich (Eidgenössische Technische Hochschule Zürich)
Inventors: Lucas Relic, Roberto Gerson De Albuquerque Azevedo, Yang Zhang, Christopher Richard Schroers, Yuanyi Xue, Scott Labrozzi
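As a rough illustration of why uniform noise is a natural stand-in for quantization, the sketch below compares rounding to a grid with adding noise drawn from the same half-step range. It covers only the quantization side and does not implement the diffusion model; the step size and the function names are assumptions, not details taken from the application.

```python
import random


def quantize(latent, step=1.0):
    """Round each latent value to the nearest multiple of step."""
    return [step * round(v / step) for v in latent]


def add_uniform_noise(latent, step=1.0, rng=random):
    """Perturb each value with noise in [-step/2, step/2], the same
    range the rounding error of quantize() falls in."""
    return [v + rng.uniform(-step / 2, step / 2) for v in latent]


latent = [0.2, 1.7, -0.6]
quantized = quantize(latent)
noisy = add_uniform_noise(latent, rng=random.Random(0))
rounding_error = [q - v for q, v in zip(quantized, latent)]
```

Because the rounding error and the injected noise share the same bounds, a denoiser trained on uniformly perturbed latents sees inputs that resemble dequantized ones.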
-
Patent number: 12587654
Abstract: A method receives a video. The method analyzes information for a pixel of a frame in the video to determine a first value and a second value for the pixel. The first value is based on an image structure formed by the pixel in the frame and the second value is based on interframe motion of the image structure at the pixel. A third value is determined for an amount of judder based on the first value and the second value. The method outputs the third value to evaluate the video.
Type: Grant
Filed: February 5, 2024
Date of Patent: March 24, 2026
Assignee: Disney Enterprises, Inc.
Inventors: Christopher Richard Schroers, Blake Sloan, Mitchel Jacobs, Scott Labrozzi, Shinobu Hattori, Felix Klose
-
Publication number: 20260039898
Abstract: A system includes a hardware processor and a memory storing a video/audio (V/A) synchronizer including video and audio encoders. The hardware processor executes the V/A synchronizer to receive raw video and audio extracted from media content, partition the raw video into video frame patches, partition the raw audio into audio samples, and pre-process the video frame patches and the audio samples for encoding. The hardware processor further executes the V/A synchronizer to encode, using the video encoder, the pre-processed video frame patches to provide pre-processed and encoded video frame patches used to provide a latent representation of the raw video, encode, using the audio encoder, the pre-processed audio samples to provide pre-processed and encoded audio samples used to provide a latent representation of the raw audio, and synchronize, using the latent representations of the raw video and the raw audio, the raw audio with the raw video.
Type: Application
Filed: September 23, 2025
Publication date: February 5, 2026
Inventors: Clara Fernandez Labrador, Cafer Mertcan Akcay, Christopher Richard Schroers, Joan Massich Vall, Scott Labrozzi, Mitchel Jacobs, Katherine Hinsen, Eitan Abecassis
-
Publication number: 20250386026
Abstract: In some embodiments, a method determines an instance of content and a metric to evaluate a quality of an encoding of the instance of content. A set of features is extracted. The method performs an optimized search process to evaluate different combinations of encoding parameter values that are used to encode the content to generate instances of encoded content. The instances of encoded content are compared to the metric to determine a next combination of encoding parameter values to use. An optimal combination of encoding parameter values is selected that is associated with one of the instances of encoded content. Predicted encoding parameter values are output from a model using model parameters based on an input of the set of features. The model is trained using the optimal combination of encoding parameter values and the predicted encoding parameter values, wherein the model parameters are adjusted in the training.
Type: Application
Filed: September 16, 2024
Publication date: December 18, 2025
Applicants: Disney Enterprises, Inc., Beijing YoJaJa Software Technology Development Co., Ltd.
Inventors: Roberto Gerson De Albuquerque Azevedo, Yuanyi Xue, Scott Labrozzi, Christopher Richard Schroers, Yang Zhang, Wenhao Zhang
-
Patent number: 12469262
Abstract: In some embodiments, a method sends information for a sample of content, a first question, and a second question for output on an interface. The first question receives, from a subject, a first response for a sample level rating for an artifact that is perceived to be visible in the sample and the second question receives, from the subject, a second response for regions in the sample that are perceived to contain the artifact. The method receives the first response for the sample level rating and the second response for regions that are perceived to contain the artifact. First responses are combined from multiple subjects to generate an opinion score for the sample and second responses are combined to generate region scores for regions. The method generates training data from the opinion score and the region scores to train a process to perform an action based on the artifacts.
Type: Grant
Filed: April 11, 2024
Date of Patent: November 11, 2025
Assignees: Disney Enterprises, Inc., Beijing YoJaJa Software Technology Development Co., Ltd.
Inventors: Yuanyi Xue, Scott Labrozzi, Wenhao Zhang, Christopher Richard Schroers, Roberto Gerson De Albuquerque Azevedo, Xuchang Huangfu, Lemei Huang, Yang Zhang
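One simple way to combine per-subject responses into a sample-level opinion score and per-region scores is sketched below. The abstract does not say how responses are combined; averaging the ratings and using the fraction of subjects who marked each region are assumptions, as are the function name and the data layout.

```python
from statistics import mean


def aggregate_responses(responses, num_regions):
    """responses: one (rating, marked_regions) pair per subject, where
    marked_regions is a set of region indices the subject flagged.
    Returns (opinion_score, region_scores)."""
    # Assumed rule: the opinion score is the mean sample-level rating.
    opinion_score = mean(rating for rating, _ in responses)
    # Assumed rule: a region's score is the share of subjects who marked it.
    region_scores = [
        sum(1 for _, regions in responses if i in regions) / len(responses)
        for i in range(num_regions)
    ]
    return opinion_score, region_scores


responses = [(4, {0, 1}), (2, {1}), (3, set())]
opinion_score, region_scores = aggregate_responses(responses, num_regions=3)
```

The two outputs together form one labeled training example for the sample.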
-
Publication number: 20250337967
Abstract: A system includes a computing platform having processing hardware, and a memory storing software code. The software code is executed to receive digital content indexed to a timeline, receive insertion data identifying a timecode of the timeline, and encode the digital content using the insertion data to provide segmented content having a segment boundary at the timecode, and first and second segments adjoining the segment boundary, wherein the first segment precedes, and the second segment succeeds, the segment boundary. The software code also re-processes the first and second segments to apply a fade-out within or to the first segment and a fade-in within or to the second segment, wherein re-processing the first and second segments provides encoded segments having the segment boundary configured as an insertion point for supplemental content.
Type: Application
Filed: July 9, 2025
Publication date: October 30, 2025
Inventor: Scott Labrozzi
-
Patent number: 12452477
Abstract: A system includes a hardware processor and a memory storing a video/audio (V/A) synchronizer including video and audio encoders. The hardware processor executes the V/A synchronizer to receive raw video and audio extracted from media content, partition the raw video into video frame patches, partition the raw audio into audio samples, and pre-process the video frame patches and the audio samples for encoding. The hardware processor further executes the V/A synchronizer to encode, using the video encoder, the pre-processed video frame patches to provide pre-processed and encoded video frame patches used to provide a latent representation of the raw video, encode, using the audio encoder, the pre-processed audio samples to provide pre-processed and encoded audio samples used to provide a latent representation of the raw audio, and synchronize, using the latent representations of the raw video and the raw audio, the raw audio with the raw video.
Type: Grant
Filed: May 24, 2024
Date of Patent: October 21, 2025
Assignee: Disney Enterprises, Inc.
Inventors: Clara Fernandez Labrador, Cafer Mertcan Akcay, Christopher Richard Schroers, Joan Massich Vall, Scott Labrozzi, Mitchel Jacobs, Katherine Hinsen, Eitan Abecassis
-
Publication number: 20250280239
Abstract: In some embodiments, a method analyzes a first sample of a first audio signal to determine a first representation in a space. A plurality of second samples for a second audio signal is analyzed to determine a plurality of second representations in the space. The method compares the first representation and the plurality of second representations in the space to select a second representation. An offset is determined between the first sample and a second sample that is associated with the second representation. The offset is output.
Type: Application
Filed: July 18, 2024
Publication date: September 4, 2025
Applicants: Disney Enterprises, Inc., ETH Zürich (Eidgenössische Technische Hochschule Zürich)
Inventors: Eitan Abecassis, David Meyer, Clara Fernandez Labrador, Christopher Richard Schroers, Scott Labrozzi
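The compare-and-select step in this abstract can be sketched as a nearest-neighbor search over window representations. The `embed` function here is a hypothetical stand-in (plain L2 normalization of the raw window) for whatever learned representation the application uses, and cosine similarity is an assumed comparison.

```python
import math


def embed(window):
    """Hypothetical stand-in embedding: L2-normalize the raw window."""
    norm = math.sqrt(sum(s * s for s in window)) or 1.0
    return [s / norm for s in window]


def similarity(a, b):
    """Dot product of two unit vectors, i.e. cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def find_offset(first, first_start, second, window):
    """Embed one window of the first signal, compare it with every
    window of the second signal, and return the offset (in samples)
    of the best match relative to first_start."""
    ref = embed(first[first_start:first_start + window])
    best_start, best_sim = 0, -2.0
    for start in range(len(second) - window + 1):
        sim = similarity(ref, embed(second[start:start + window]))
        if sim > best_sim:
            best_start, best_sim = start, sim
    return best_start - first_start


first = [0, 0, 1, 3, 2, 0, 0, 0]
second = [0, 0, 0, 0, 1, 3, 2, 0, 0, 0]  # same content, delayed 2 samples
offset = find_offset(first, first_start=2, second=second, window=3)
```

A positive offset means the matching content appears later in the second signal than in the first.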
-
Patent number: 12382112
Abstract: A system includes a computing platform having processing hardware, and a memory storing software code. The software code is executed to receive digital content indexed to a timeline, receive insertion data identifying a timecode of the timeline, and encode the digital content using the insertion data to provide segmented content having a segment boundary at the timecode, and first and second segments adjoining the segment boundary, wherein the first segment precedes, and the second segment succeeds, the segment boundary. The software code also re-processes the first and second segments to apply a fade-out within or to the first segment and a fade-in within or to the second segment, wherein re-processing the first and second segments provides encoded segments having the segment boundary configured as an insertion point for supplemental content.
Type: Grant
Filed: July 26, 2022
Date of Patent: August 5, 2025
Assignee: Disney Enterprises, Inc.
Inventor: Scott Labrozzi
-
Patent number: 12382069
Abstract: A system includes a machine learning (ML) model-based video encoder configured to receive an uncompressed video sequence including multiple video frames, determine, from among the multiple video frames, a first video frame subset and a second video frame subset, encode the first video frame subset to produce a first compressed video frame subset, and identify a first decompression data for the first compressed video frame subset. The ML model-based video encoder is further configured to encode the second video frame subset to produce a second compressed video frame subset, and identify a second decompression data for the second compressed video frame subset. The first decompression data is specific to decoding the first compressed video frame subset but not the second compressed video frame subset, and the second decompression data is specific to decoding the second compressed video frame subset but not the first compressed video frame subset.
Type: Grant
Filed: May 2, 2024
Date of Patent: August 5, 2025
Assignees: Disney Enterprises, Inc., ETH ZÜRICH (EIDGENÖSSISCHE TECHNISCHE HOCHSCHULE ZÜRICH)
Inventors: Abdelaziz Djelouah, Leonhard Markus Helminger, Roberto Gerson De Albuquerque Azevedo, Christopher Richard Schroers, Scott Labrozzi, Yuanyi Xue
-
Publication number: 20250247511
Abstract: In some embodiments, a method determines a disparity value from a plurality of disparity values in a current frame of a stereoscopic video. The disparity value is based on a difference of a value for a pixel between a first video and a second video of the stereoscopic video. A location is determined in the current frame that includes the disparity value. The method analyzes first frames prior to the current frame to adjust disparity values in the first frames to generate one or more adjusted first disparity values. Also, the method analyzes second frames after the current frame to adjust disparity values in the second frames to generate one or more adjusted second disparity values. The one or more adjusted first disparity values and the one or more adjusted second disparity values are output for use in displaying captions in the first video or the second video.
Type: Application
Filed: July 6, 2024
Publication date: July 31, 2025
Applicant: Disney Enterprises, Inc.
Inventors: Yuanyi Xue, Scott Labrozzi, Eitan M. Abecassis, Chetan Mathur, Michael J. Bracco
-
Publication number: 20250220197
Abstract: In some embodiments, a method receives source content. A pre-processor pre-processes the source content to output pre-processed source content. The pre-processor includes a first parameter that is trained based on a differentiable proxy codec, and a calculated adjustment to a second parameter of the differentiable proxy codec is used to train the first parameter of the pre-processor. The method encodes the pre-processed source content into compressed pre-processed source content. The compressed pre-processed source content is output.
Type: Application
Filed: March 21, 2025
Publication date: July 3, 2025
Applicants: Disney Enterprises, Inc., ETH Zürich (Eidgenössische Technische Hochschule Zürich)
Inventors: Yang Zhang, Mingyang Song, Christopher Richard Schroers, Tunc Ozan Aydin, Yuanyi Xue, Scott Labrozzi
-
Publication number: 20250211758
Abstract: A system includes a machine learning (ML) model-based video downsampler configured to receive an input video sequence having a first display resolution, and to map the input video sequence to a lower resolution video sequence having a second display resolution lower than the first display resolution. The system also includes a neural network-based (NN-based) proxy video codec configured to transform the lower resolution video sequence into a decoded proxy bitstream. In addition, the system includes an upsampler configured to produce an output video sequence using the decoded proxy bitstream.
Type: Application
Filed: March 12, 2025
Publication date: June 26, 2025
Inventors: Christopher Richard Schroers, Roberto Gerson de Albuquerque Azevedo, Nicholas David Gregory, Yuanyi Xue, Scott Labrozzi, Abdelaziz Djelouah
-
Publication number: 20250168367
Abstract: A system includes a tunable neural network-based video encoder configured to receive a video sequence including multiple video frames, generate a frame-specific embedding of a first video frame of the multiple video frames, and identify one or more group-of-pictures (GOP) features of a subset of the multiple video frames, the subset including the first video frame. The tunable neural network-based video encoder is further configured to combine the frame-specific embedding of the first video frame and the one or more GOP features of the subset of the multiple video frames to provide a latent feature corresponding to a compressed version of the first video frame.
Type: Application
Filed: October 18, 2024
Publication date: May 22, 2025
Inventors: Roberto Gerson de Albuquerque Azevedo, Christopher Richard Schroers, Scott Labrozzi, Jens Eirik Saethre, Yuanyi Xue
-
Publication number: 20250157087
Abstract: In some embodiments, a method receives a quantized latent representation of an image in a latent space. The image is encoded into a representation in the latent space and quantized to generate the quantized latent representation. A time step parameter is received that is generated based on the representation. The method performs an inverse quantization process to generate a reconstructed representation. A diffusion model performs a denoising process for a number of iterations based on the time step parameter to remove noise from the reconstructed representation to generate a denoised reconstructed representation. The denoised reconstructed representation is decoded into a reconstructed image.
Type: Application
Filed: October 18, 2024
Publication date: May 15, 2025
Applicants: Disney Enterprises, Inc., ETH Zürich (Eidgenössische Technische Hochschule Zürich)
Inventors: Lucas Relic, Roberto Gerson De Albuquerque Azevedo, Christopher Richard Schroers, Yuanyi Xue, Scott Labrozzi
-
Patent number: 12284360
Abstract: In some embodiments, a method trains a first parameter of a differentiable proxy codec to encode source content based on a first loss between first compressed source content and second compressed source content that is output by a target codec. A pre-processor pre-processes a source image to output a pre-processed source image, the pre-processing being based on a second parameter. The differentiable proxy codec encodes the pre-processed source image into a compressed pre-processed source image based on the first parameter. The method determines a second loss between the source image and the compressed pre-processed source image and determines an adjustment to the first parameter based on the second loss. The adjustment is used to adjust the second parameter of the pre-processor based on the second loss.
Type: Grant
Filed: October 19, 2023
Date of Patent: April 22, 2025
Assignees: DISNEY ENTERPRISES, INC., ETH ZÜRICH (EIDGENÖSSISCHE TECHNISCHE HOCHSCHULE ZÜRICH)
Inventors: Yang Zhang, Mingyang Song, Christopher Richard Schroers, Tunc Ozan Aydin, Yuanyi Xue, Scott Labrozzi
-
Publication number: 20250126309
Abstract: In some embodiments, a method generates a first representation of a first relationship between bitrate and quality based on first features of a first portion of a video. The first representation is analyzed to determine a first list of potential bitrates for the first portion of video. The method analyzes potential bitrates and quality associated with the respective potential bitrates to refine the first list of potential bitrates to a second list of bitrates. The second list of bitrates includes a different list of bitrates than the first list of potential bitrates. The method outputs the second list of bitrates for encoding the first portion of video.
Type: Application
Filed: December 20, 2024
Publication date: April 17, 2025
Applicants: Disney Enterprises, Inc., Beijing YoJaJa Software Technology Development Co., Ltd.
Inventors: Chen Liu, Wenhao Zhang, Scott Labrozzi, Yuanyi Xue, Xuchang Huangfu, Xiaobo Liu
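The refinement of a candidate bitrate list can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the logarithmic rate-quality curve, its coefficients `a` and `c`, the quality cap, and the rule of dropping a bitrate whose quality gain over the last kept bitrate is below `min_gain`. The application does not state which model or refinement rule it uses.

```python
import math


def predicted_quality(bitrate_kbps, a, c, max_quality=100.0):
    """Assumed rate-quality curve: quality grows with log(bitrate)
    and saturates at max_quality."""
    return min(a + c * math.log(bitrate_kbps), max_quality)


def refine_bitrates(candidates, a, c, min_gain):
    """Keep a candidate only if it improves predicted quality by at
    least min_gain over the previously kept candidate."""
    kept, last_quality = [], None
    for bitrate in sorted(candidates):
        q = predicted_quality(bitrate, a, c)
        if last_quality is None or q - last_quality >= min_gain:
            kept.append(bitrate)
            last_quality = q
    return kept


first_list = [500, 1000, 1500, 2000, 4000, 8000]
second_list = refine_bitrates(first_list, a=-20.0, c=15.0, min_gain=5.0)
```

With these illustrative coefficients, the 2000 kbps and 8000 kbps candidates are dropped because they add too little predicted quality over the bitrates already kept.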