Patents by Inventor Scott Labrozzi

Scott Labrozzi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12634488
    Abstract: A system includes a tunable neural network-based video encoder configured to receive a video sequence including multiple video frames, generate a frame-specific embedding of a first video frame of the multiple video frames, and identify one or more group-of-pictures (GOP) features of a subset of the multiple video frames, the subset including the first video frame. The tunable neural network-based video encoder is further configured to combine the frame-specific embedding of the first video frame and the one or more GOP features of the subset of the multiple video frames to provide a latent feature corresponding to a compressed version of the first video frame.
    Type: Grant
    Filed: October 18, 2024
    Date of Patent: May 19, 2026
    Assignees: Disney Enterprises, Inc., ETH Zürich (Eidgenössische Technische Hochschule Zürich)
    Inventors: Roberto Gerson de Albuquerque Azevedo, Christopher Richard Schroers, Scott Labrozzi, Jens Eirik Saethre, Yuanyi Xue
  • Patent number: 12621502
    Abstract: Processing hardware of a system executes a machine learning (ML) model-based video compression encoder to receive uncompressed video content and corresponding motion compensated video content, compare the uncompressed and motion compensated video content to identify an image space residual, transform the image space residual to a latent space representation of the uncompressed video content, and transform, using a trained image compression ML model, the motion compensated video content to a latent space representation of the motion compensated video content.
    Type: Grant
    Filed: September 13, 2024
    Date of Patent: May 5, 2026
    Assignees: Disney Enterprises, Inc., ETH Zürich (Eidgenössische Technische Hochschule Zürich)
    Inventors: Abdelaziz Djelouah, Leonhard Markus Helminger, Roberto Gerson de Albuquerque Azevedo, Scott Labrozzi, Christopher Richard Schroers, Yuanyi Xue
  • Patent number: 12621506
    Abstract: In some embodiments, a method analyzes flagged locations from a plurality of locations in an encoding of a video to form a cluster of locations. Draft micro-chunk boundaries for the cluster are determined based on searching for a first start location and a first end location in the encoding. The method searches in a first search range before the first start location and a second search range after the first end location for a second start location in the first search range and a second end location in the second search range. The second start location and the second end location form a micro-chunk. An encoding parameter set is determined for the micro-chunk formed by the second start location and the second end location based on content characteristics of the micro-chunk. The method uses the encoding parameter set to encode the micro-chunk for insertion in the encoding of the video.
    Type: Grant
    Filed: September 25, 2023
    Date of Patent: May 5, 2026
    Assignees: Disney Enterprises, Inc., Beijing YoJaJa Software Technology Development Co., Ltd.
    Inventors: Yuanyi Xue, Roberto Gerson De Albuquerque Azevedo, Christopher Richard Schroers, Scott Labrozzi, Wenhao Zhang
  • Publication number: 20260095599
    Abstract: In some embodiments, a method receives an image and encodes the image into a latent representation in a latent space. A quantization process is performed on the latent representation to generate a quantized latent representation. The quantization process is based on a uniform noise. The method transmits the quantized latent representation to a receiver. An inverse quantization process is performed to generate a reconstructed latent representation via a diffusion model that performs a denoising process for a number of iterations based on a time step t to remove noise from the reconstructed latent representation. The diffusion model is trained to perform denoising using the uniform noise.
    Type: Application
    Filed: February 18, 2025
    Publication date: April 2, 2026
    Applicants: Disney Enterprises, Inc., ETH Zürich (Eidgenössische Technische Hochschule Zürich)
    Inventors: Lucas Relic, Roberto Gerson De Albuquerque Azevedo, Yang Zhang, Christopher Richard Schroers, Yuanyi Xue, Scott Labrozzi
  • Patent number: 12587654
    Abstract: A method receives a video. The method analyzes information for a pixel of a frame in the video to determine a first value and a second value for the pixel. The first value is based on an image structure formed by the pixel in the frame and the second value is based on interframe motion of the image structure at the pixel. A third value is determined for an amount of judder based on the first value and the second value. The method outputs the third value to evaluate the video.
    Type: Grant
    Filed: February 5, 2024
    Date of Patent: March 24, 2026
    Assignee: Disney Enterprises, Inc.
    Inventors: Christopher Richard Schroers, Blake Sloan, Mitchel Jacobs, Scott Labrozzi, Shinobu Hattori, Felix Klose
  • Publication number: 20260039898
    Abstract: A system includes a hardware processor and a memory storing a video/audio (V/A) synchronizer including video and audio encoders. The hardware processor executes the V/A synchronizer to receive raw video and audio extracted from media content, partition the raw video into video frame patches, partition the raw audio into audio samples, and pre-process the video frame patches and the audio samples for encoding. The hardware processor further executes the V/A synchronizer to encode, using the video encoder, the pre-processed video frame patches to provide pre-processed and encoded video frame patches used to provide a latent representation of the raw video, encode, using the audio encoder, the pre-processed audio samples to provide pre-processed and encoded audio samples used to provide a latent representation of the raw audio, and synchronize, using the latent representations of the raw video and the raw audio, the raw audio with the raw video.
    Type: Application
    Filed: September 23, 2025
    Publication date: February 5, 2026
    Inventors: Clara Fernandez Labrador, Cafer Mertcan Akcay, Christopher Richard Schroers, Joan Massich Vall, Scott Labrozzi, Mitchel Jacobs, Katherine Hinsen, Eitan Abecassis
  • Publication number: 20250386026
    Abstract: In some embodiments, a method determines an instance of content and a metric to evaluate a quality of an encoding of the instance of content. A set of features is extracted. The method performs an optimized search process to evaluate different combinations of encoding parameter values that are used to encode the content to generate instances of encoded content. The instances of encoded content are compared to the metric to determine a next combination of encoding parameter values to use. An optimal combination of encoding parameter values is selected that is associated with one of the instances of encoded content. Predicted encoding parameter values are output from a model using model parameters based on an input of the set of features. The model is trained using the optimal combination of encoding parameter values and the predicted encoding parameter values, wherein the model parameters are adjusted in the training.
    Type: Application
    Filed: September 16, 2024
    Publication date: December 18, 2025
    Applicants: Disney Enterprises, Inc., Beijing YoJaJa Software Technology Development Co., Ltd.
    Inventors: Roberto Gerson De Albuquerque Azevedo, Yuanyi Xue, Scott Labrozzi, Christopher Richard Schroers, Yang Zhang, Wenhao Zhang
  • Patent number: 12469262
    Abstract: In some embodiments, a method sends information for a sample of content, a first question, and a second question for output on an interface. The first question receives, from a subject, a first response for a sample level rating for an artifact that is perceived to be visible in the sample and the second question receives, from the subject, a second response for regions in the sample that are perceived to contain the artifact. The method receives the first response for the sample level rating and the second response for regions that are perceived to contain the artifact. First responses are combined from multiple subjects to generate an opinion score for the sample and second responses are combined to generate region scores for regions. The method generates training data from the opinion score and the region scores to train a process to perform an action based on the artifacts.
    Type: Grant
    Filed: April 11, 2024
    Date of Patent: November 11, 2025
    Assignees: Disney Enterprises, Inc., Beijing YoJaJa Software Technology Development Co., Ltd.
    Inventors: Yuanyi Xue, Scott Labrozzi, Wenhao Zhang, Christopher Richard Schroers, Roberto Gerson De Albuquerque Azevedo, Xuchang Huangfu, Lemei Huang, Yang Zhang
  • Publication number: 20250337967
    Abstract: A system includes a computing platform having processing hardware, and a memory storing software code. The software code is executed to receive digital content indexed to a timeline, receive insertion data identifying a timecode of the timeline, and encode the digital content using the insertion data to provide segmented content having a segment boundary at the timecode, and first and second segments adjoining the segment boundary, wherein the first segment precedes, and the second segment succeeds, the segment boundary. The software code also re-processes the first and second segments to apply a fade-out within or to the first segment and a fade-in within or to the second segment, wherein re-processing the first and second segments provides encoded segments having the segment boundary configured as an insertion point for supplemental content.
    Type: Application
    Filed: July 9, 2025
    Publication date: October 30, 2025
    Inventor: Scott Labrozzi
  • Patent number: 12452477
    Abstract: A system includes a hardware processor and a memory storing a video/audio (V/A) synchronizer including video and audio encoders. The hardware processor executes the V/A synchronizer to receive raw video and audio extracted from media content, partition the raw video into video frame patches, partition the raw audio into audio samples, and pre-process the video frame patches and the audio samples for encoding. The hardware processor further executes the V/A synchronizer to encode, using the video encoder, the pre-processed video frame patches to provide pre-processed and encoded video frame patches used to provide a latent representation of the raw video, encode, using the audio encoder, the pre-processed audio samples to provide pre-processed and encoded audio samples used to provide a latent representation of the raw audio, and synchronize, using the latent representations of the raw video and the raw audio, the raw audio with the raw video.
    Type: Grant
    Filed: May 24, 2024
    Date of Patent: October 21, 2025
    Assignee: Disney Enterprises, Inc.
    Inventors: Clara Fernandez Labrador, Cafer Mertcan Akcay, Christopher Richard Schroers, Joan Massich Vall, Scott Labrozzi, Mitchel Jacobs, Katherine Hinsen, Eitan Abecassis
  • Publication number: 20250280239
    Abstract: In some embodiments, a method analyzes a first sample of a first audio signal to determine a first representation in a space. A plurality of second samples for a second audio signal is analyzed to determine a plurality of second representations in the space. The method compares the first representation and the plurality of second representations in the space to select a second representation. An offset is determined between the first sample and a second sample that is associated with the second representation. The offset is output.
    Type: Application
    Filed: July 18, 2024
    Publication date: September 4, 2025
    Applicants: Disney Enterprises, Inc., ETH Zürich (Eidgenössische Technische Hochschule Zürich)
    Inventors: Eitan Abecassis, David Meyer, Clara Fernandez Labrador, Christopher Richard Schroers, Scott Labrozzi
  • Patent number: 12382112
    Abstract: A system includes a computing platform having processing hardware, and a memory storing software code. The software code is executed to receive digital content indexed to a timeline, receive insertion data identifying a timecode of the timeline, and encode the digital content using the insertion data to provide segmented content having a segment boundary at the timecode, and first and second segments adjoining the segment boundary, wherein the first segment precedes, and the second segment succeeds, the segment boundary. The software code also re-processes the first and second segments to apply a fade-out within or to the first segment and a fade-in within or to the second segment, wherein re-processing the first and second segments provides encoded segments having the segment boundary configured as an insertion point for supplemental content.
    Type: Grant
    Filed: July 26, 2022
    Date of Patent: August 5, 2025
    Assignee: Disney Enterprises, Inc.
    Inventor: Scott Labrozzi
  • Patent number: 12382069
    Abstract: A system includes a machine learning (ML) model-based video encoder configured to receive an uncompressed video sequence including multiple video frames, determine, from among the multiple video frames, a first video frame subset and a second video frame subset, encode the first video frame subset to produce a first compressed video frame subset, and identify a first decompression data for the first compressed video frame subset. The ML model-based video encoder is further configured to encode the second video frame subset to produce a second compressed video frame subset, and identify a second decompression data for the second compressed video frame subset. The first decompression data is specific to decoding the first compressed video frame subset but not the second compressed video frame subset, and the second decompression data is specific to decoding the second compressed video frame subset but not the first compressed video frame subset.
    Type: Grant
    Filed: May 2, 2024
    Date of Patent: August 5, 2025
    Assignees: Disney Enterprises, Inc., ETH Zürich (Eidgenössische Technische Hochschule Zürich)
    Inventors: Abdelaziz Djelouah, Leonhard Markus Helminger, Roberto Gerson De Albuquerque Azevedo, Christopher Richard Schroers, Scott Labrozzi, Yuanyi Xue
  • Publication number: 20250247511
    Abstract: In some embodiments, a method determines a disparity value from a plurality of disparity values in a current frame of a stereoscopic video. The disparity value is based on a difference of a value for a pixel between a first video and a second video of the stereoscopic video. A location is determined in the current frame that includes the disparity value. The method analyzes first frames prior to the current frame to adjust disparity values in the first frames to generate one or more adjusted first disparity values. Also, the method analyzes second frames after the current frame to adjust disparity values in the second frames to generate one or more adjusted second disparity values. The one or more adjusted first disparity values and the one or more adjusted second disparity values are output for use in displaying captions in the first video or the second video.
    Type: Application
    Filed: July 6, 2024
    Publication date: July 31, 2025
    Applicant: Disney Enterprises, Inc.
    Inventors: Yuanyi Xue, Scott Labrozzi, Eitan M. Abecassis, Chetan Mathur, Michael J. Bracco
  • Publication number: 20250220197
    Abstract: In some embodiments, a method receives source content. A pre-processor pre-processes the source content to output pre-processed source content. The pre-processor includes a first parameter that is trained based on a differentiable proxy codec, and a calculated adjustment to a second parameter of the differentiable proxy codec is used to train the first parameter of the pre-processor. The method encodes the pre-processed source content into compressed pre-processed source content. The compressed pre-processed source content is output.
    Type: Application
    Filed: March 21, 2025
    Publication date: July 3, 2025
    Applicants: Disney Enterprises, Inc., ETH Zürich (Eidgenössische Technische Hochschule Zürich)
    Inventors: Yang Zhang, Mingyang Song, Christopher Richard Schroers, Tunc Ozan Aydin, Yuanyi Xue, Scott Labrozzi
  • Publication number: 20250211758
    Abstract: A system includes a machine learning (ML) model-based video downsampler configured to receive an input video sequence having a first display resolution, and to map the input video sequence to a lower resolution video sequence having a second display resolution lower than the first display resolution. The system also includes a neural network-based (NN-based) proxy video codec configured to transform the lower resolution video sequence into a decoded proxy bitstream. In addition, the system includes an upsampler configured to produce an output video sequence using the decoded proxy bitstream.
    Type: Application
    Filed: March 12, 2025
    Publication date: June 26, 2025
    Inventors: Christopher Richard Schroers, Roberto Gerson de Albuquerque Azevedo, Nicholas David Gregory, Yuanyi Xue, Scott Labrozzi, Abdelaziz Djelouah
  • Publication number: 20250168367
    Abstract: A system includes a tunable neural network-based video encoder configured to receive a video sequence including multiple video frames, generate a frame-specific embedding of a first video frame of the multiple video frames, and identify one or more group-of-pictures (GOP) features of a subset of the multiple video frames, the subset including the first video frame. The tunable neural network-based video encoder is further configured to combine the frame-specific embedding of the first video frame and the one or more GOP features of the subset of the multiple video frames to provide a latent feature corresponding to a compressed version of the first video frame.
    Type: Application
    Filed: October 18, 2024
    Publication date: May 22, 2025
    Inventors: Roberto Gerson de Albuquerque Azevedo, Christopher Richard Schroers, Scott Labrozzi, Jens Eirik Saethre, Yuanyi Xue
  • Publication number: 20250157087
    Abstract: In some embodiments, a method receives a quantized latent representation of an image in a latent space. The image is encoded into a representation in the latent space and quantized to generate the quantized latent representation. A time step parameter is received that is generated based on the representation. The method performs an inverse quantization process to generate a reconstructed representation. A diffusion model performs a denoising process for a number of iterations based on the time step parameter to remove noise from the reconstructed representation to generate a denoised reconstructed representation. The denoised reconstructed representation is decoded into a reconstructed image.
    Type: Application
    Filed: October 18, 2024
    Publication date: May 15, 2025
    Applicants: Disney Enterprises, Inc., ETH Zürich (Eidgenössische Technische Hochschule Zürich)
    Inventors: Lucas Relic, Roberto Gerson De Albuquerque Azevedo, Christopher Richard Schroers, Yuanyi Xue, Scott Labrozzi
  • Patent number: 12284360
    Abstract: In some embodiments, a method trains a first parameter of a differentiable proxy codec to encode source content based on a first loss between first compressed source content and second compressed source content that is output by a target codec. A pre-processor pre-processes a source image to output a pre-processed source image, the pre-processing being based on a second parameter. The differentiable proxy codec encodes the pre-processed source image into a compressed pre-processed source image based on the first parameter. The method determines a second loss between the source image and the compressed pre-processed source image and determines an adjustment to the first parameter based on the second loss. The adjustment is used to adjust the second parameter of the pre-processor based on the second loss.
    Type: Grant
    Filed: October 19, 2023
    Date of Patent: April 22, 2025
    Assignees: Disney Enterprises, Inc., ETH Zürich (Eidgenössische Technische Hochschule Zürich)
    Inventors: Yang Zhang, Mingyang Song, Christopher Richard Schroers, Tunc Ozan Aydin, Yuanyi Xue, Scott Labrozzi
  • Publication number: 20250126309
    Abstract: In some embodiments, a method generates a first representation of a first relationship between bitrate and quality based on first features of a first portion of a video. The first representation is analyzed to determine a first list of potential bitrates for the first portion of video. The method analyzes potential bitrates and quality associated with the respective potential bitrates to refine the first list of potential bitrates to a second list of bitrates. The second list of bitrates includes a different list of bitrates than the first list of potential bitrates. The method outputs the second list of bitrates for encoding the first portion of video.
    Type: Application
    Filed: December 20, 2024
    Publication date: April 17, 2025
    Applicants: Disney Enterprises, Inc., Beijing YoJaJa Software Technology Development Co., Ltd.
    Inventors: Chen Liu, Wenhao Zhang, Scott Labrozzi, Yuanyi Xue, Xuchang Huangfu, Xiaobo Liu