Patents by Inventor Gaurav Mittal

Gaurav Mittal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Image generation using adversarial attacks for imbalanced datasets

Patent number: 12299082

Abstract: A method of balancing a dataset for a machine learning model includes identifying confusing classes of few-shot classes for a machine learning model during validation. One of the confusing classes and an image from one of the few-shot classes are selected. An image perturbation is computed such that the selected image is classified as the selected confusing class. The selected image is modified with the computed perturbation. The modified selected image is added to a batch for training the machine learning model.

Type: Grant

Filed: March 7, 2024

Date of Patent: May 13, 2025

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Gaurav Mittal, Nikolaos Karianakis, Victor Manuel Fragoso Rojas, Mei Chen, Jedrzej Jakub Kozerawski
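Illustrative note: the steps in the abstract above (pick a confusing class, perturb a few-shot image until it is classified as that class, add the result to a training batch) can be sketched roughly as follows. This is a minimal toy under stated assumptions, not the patented method: the linear classifier, the sign-of-gradient update, the step size, the made-up data, and the choice to keep the original few-shot label on the modified image are all hypothetical.

```python
from typing import List, Sequence, Tuple

Weights = Sequence[Sequence[float]]


def predict(weights: Weights, x: Sequence[float]) -> int:
    """Return the index of the class with the highest linear score."""
    scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def sign(value: float) -> int:
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def perturb_toward(weights: Weights, x: Sequence[float], target: int,
                   step: float = 0.05, max_iters: int = 200) -> List[float]:
    """Nudge x in small steps until the toy classifier labels it as target."""
    x_adv = list(x)
    for _ in range(max_iters):
        current = predict(weights, x_adv)
        if current == target:
            break
        # For a linear model, the gradient of (target score - current score)
        # with respect to x is the difference of the two weight rows.
        grad = [t - c for t, c in zip(weights[target], weights[current])]
        x_adv = [v + step * sign(g) for v, g in zip(x_adv, grad)]
    return x_adv


# Toy data: three classes over two "pixel" features (made-up numbers).
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
few_shot_image, few_shot_label = [1.0, 0.2], 0
confusing_class = 1  # assumed to have been identified during validation

modified = perturb_toward(weights, few_shot_image, confusing_class)
# Assumption of this sketch: the modified image keeps its few-shot label.
batch: List[Tuple[List[float], int]] = [(modified, few_shot_label)]
```

In this toy setup the perturbed image crosses the decision boundary into the confusing class while staying close to the original.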
Adaptive content delivery network

Patent number: 12284405

Abstract: A content delivery network (100) for streaming digital video content across a data network. The content delivery network (100) is configured to receive digital video content. The content delivery network is configured to store the digital video content in a storage format comprising a base layer (B) and an enhancement layer (E), wherein the base layer (B) is decodable to present the digital video content at a base level of video reproduction quality, and the enhancement layer (E) is decodable with the base layer to present the digital video content at an enhanced level of video reproduction quality which is higher than the base level of reproduction quality. The content delivery network (100) is configured to determine, based on a target quality which is to be provided to a client device, which layers to use in order to achieve the target quality; and to use the determined layers (B, E) to provide the client device with the digital content at the target level of quality.

Type: Grant

Filed: March 10, 2021

Date of Patent: April 22, 2025

Assignee: V-NOVA INTERNATIONAL LIMITED

Inventors: Gaurav Mittal, Simone Ferrara, Guido Meardi
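Illustrative note: the layer-selection step described in the abstract above might look like the sketch below. It assumes that quality can be compared as a single number (for example vertical resolution) and uses a hypothetical function name; the abstract itself does not specify how quality is measured.

```python
from typing import List


def select_layers(target_quality: int, base_quality: int) -> List[str]:
    """Pick the stored layers needed to reach the target quality.

    "B" is the base layer and "E" the enhancement layer. Quality is
    modeled as one comparable number, which is an assumption.
    """
    if target_quality <= base_quality:
        return ["B"]        # the base layer alone is sufficient
    return ["B", "E"]       # the enhancement layer is decoded with the base


# Example: base layer stored at 540 lines, enhanced level above that.
layers_for_phone = select_layers(target_quality=540, base_quality=540)
layers_for_tv = select_layers(target_quality=1080, base_quality=540)
```

Because the enhancement layer is only decodable together with the base layer, the sketch never returns "E" on its own.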
Apparatuses, methods, computer programs and computer-readable media

Patent number: 12273543

Abstract: A set of reconstruction elements useable to reconstruct a representation of a signal at a relatively high level of quality using data based on a representation of the signal at a relatively low level of quality is obtained. The representation at the relatively high level of quality is arranged as an array comprising at least first and second rows of signal elements. A reconstruction element is associated with a respective signal element in the set. A set of data elements is derived based on the set of reconstruction elements. At least one of the data elements is derived from at least two reconstruction elements associated with signal elements from the first row and a different number of reconstruction elements associated with signal elements from the second row.

Type: Grant

Filed: December 19, 2023

Date of Patent: April 8, 2025

Assignee: V-NOVA INTERNATIONAL LIMITED

Inventors: Ivan Damnjanovic, Gaurav Mittal
Decoder devices, methods and computer programs

Patent number: 12219160

Abstract: A medical telepresence system comprising: an interface to receive a plurality of data feeds from a live medical procedure, at least one data feed comprising a video signal capturing the live medical procedure; a hierarchical encoder to encode the plurality of data feeds using a first tier-based hierarchical data coding scheme, wherein encoded data from the hierarchical encoder is decodable by a first set of computing devices for viewing, the first set of computing devices being communicatively coupled to the hierarchical encoder using a first network connection; a transcoder to convert from the first tier-based hierarchical data coding scheme to a second tier-based hierarchical data coding scheme, wherein encoded data from the transcoder is receivable by a second set of computing devices for viewing, the second set of computing devices being communicatively coupled to the transcoder using a second network connection, the second network connection being of a lower quality than the first network connection; and

Type: Grant

Filed: January 30, 2023

Date of Patent: February 4, 2025

Assignee: V-NOVA INTERNATIONAL LIMITED

Inventors: Guido Meardi, Simone Ferrara, Gaurav Mittal
Adaptive video quality

Patent number: 12219159

Abstract: A method for encoding a first stream of video data comprising a plurality of frames of video, the method, for one or more of the plurality of frames of video, comprising the steps of: encoding in a hierarchical arrangement a frame of the video data, the hierarchical arrangement comprising a base layer of video data and a first enhancement layer of video data, said first enhancement layer of video data comprising a plurality of sub-layers of enhancement data, such that when encoded: the base layer of video data comprises data which when decoded renders the frame at a first, base, level of quality; and each sub-layer of enhancement data comprises data which, when decoded with the base layer, render the frame at a higher level of quality than the base level of quality; and wherein the steps of encoding the sub-layers of enhancement data comprises: quantizing the enhancement data at a determined initial level of quantization thereby creating a set of quantized enhancement data; associating to each of the pluralit

Type: Grant

Filed: January 20, 2023

Date of Patent: February 4, 2025

Assignee: V-NOVA INTERNATIONAL LIMITED

Inventor: Gaurav Mittal
Video frame action detection using gated history

Patent number: 12192543

Abstract: Example solutions for video frame action detection use a gated history and include: receiving a video stream comprising a plurality of video frames; grouping the plurality of video frames into a set of present video frames and a set of historical video frames, the set of present video frames comprising a current video frame; determining a set of attention weights for the set of historical video frames, the set of attention weights indicating how informative a video frame is for predicting action in the current video frame; weighting the set of historical video frames with the set of attention weights to produce a set of weighted historical video frames; and based on at least the set of weighted historical video frames and the set of present video frames, generating an action prediction for the current video frame.

Type: Grant

Filed: December 21, 2023

Date of Patent: January 7, 2025

Assignee: Microsoft Technology Licensing, LLC

Inventors: Gaurav Mittal, Ye Yu, Mei Chen, Junwen Chen
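Illustrative note: the weighting of historical frames described in the abstract above can be sketched as follows. Frames are represented as small feature vectors, the attention weights come from a softmax over per-frame scores, and the final prediction is a simple pooled argmax. All three choices are assumptions made for illustration; the abstract does not say how the weights or the prediction are computed.

```python
import math
from typing import List, Sequence

Frame = Sequence[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weight_history(history: Sequence[Frame],
                   scores: Sequence[float]) -> List[List[float]]:
    """Scale each historical frame by its attention weight."""
    weights = softmax(scores)
    return [[w * v for v in frame] for w, frame in zip(weights, history)]


def predict_action(weighted_history: Sequence[Frame],
                   present: Sequence[Frame]) -> int:
    """Hypothetical prediction head: sum all frames, return the top index."""
    frames = list(weighted_history) + list(present)
    dims = len(frames[0])
    pooled = [sum(f[d] for f in frames) for d in range(dims)]
    return max(range(dims), key=pooled.__getitem__)


# Toy features: two historical frames, one present (current) frame.
history = [[1.0, 0.0], [0.0, 1.0]]
informativeness = [0.0, 0.0]   # equal scores give equal weights
present = [[0.0, 1.0]]

weighted = weight_history(history, informativeness)
action = predict_action(weighted, present)
```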
PRIOR-DRIVEN SUPERVISION FOR WEAKLY-SUPERVISED TEMPORAL ACTION LOCALIZATION

Publication number: 20240404279

Abstract: A classifier model is trained for temporal action localization of video clips. A training video clip that includes actions of interest for identification is ingested into the classifier model. Action characteristics within frames of the video clip are identified. The actions correspond to known action classes. An actionness score is determined for each of the frames based upon the action characteristics identified within each of the frames. Class activation sequence (CAS) scores are determined for sequences of the frames based upon a presence or an absence of the action characteristics identified within each of the frames. Base confidence predictions of temporal locations of actions of interest within the video clip are produced by correlating each of the actionness scores with corresponding class activation scores for each of the frames in the sequences of frames.

Type: Application

Filed: May 30, 2023

Publication date: December 5, 2024

Inventors: Gaurav MITTAL, Ye YU, Matthew Brigham HALL, Sandra SAJEEV, Mei CHEN, Mamshad Nayeem RIZVE
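Illustrative note: one way to read the final step of the abstract above is that a per-frame actionness score is combined with per-class activation scores to give a per-frame, per-class confidence, from which temporal spans can be read off. The sketch below assumes that "correlating" means element-wise multiplication and that spans are found by thresholding; both are assumptions, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def base_confidence(actionness: Sequence[float],
                    cas: Sequence[Sequence[float]]) -> List[List[float]]:
    """Multiply each frame's actionness by its class activation scores."""
    return [[a * s for s in frame_scores]
            for a, frame_scores in zip(actionness, cas)]


def segments(conf: Sequence[Sequence[float]], cls: int,
             threshold: float) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) frame spans where class cls is active."""
    spans: List[Tuple[int, int]] = []
    start = None
    for i, frame in enumerate(conf):
        active = frame[cls] >= threshold
        if active and start is None:
            start = i
        elif not active and start is not None:
            spans.append((start, i - 1))
            start = None
    if start is not None:
        spans.append((start, len(conf) - 1))
    return spans


# Toy clip: four frames, one action class (made-up scores).
actionness = [0.0, 1.0, 1.0, 0.0]
cas = [[0.5], [0.5], [1.0], [1.0]]

conf = base_confidence(actionness, cas)
spans = segments(conf, cls=0, threshold=0.25)
```

In the toy clip the last frame has a high class activation but zero actionness, so it is excluded from the predicted span.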
APPARATUSES, METHODS, COMPUTER PROGRAMS AND COMPUTER-READABLE MEDIA

Publication number: 20240397068

Abstract: A set of reconstruction elements useable to reconstruct a representation of a signal at a relatively high level of quality using data based on a representation of the signal at a relatively low level of quality is obtained. The representation at the relatively high level of quality is arranged as an array comprising at least first and second rows of signal elements. A reconstruction element is associated with a respective signal element in the set. A set of data elements is derived based on the set of reconstruction elements. At least one of the data elements is derived from at least two reconstruction elements associated with signal elements from the first row and a different number of reconstruction elements associated with signal elements from the second row.

Type: Application

Filed: December 19, 2023

Publication date: November 28, 2024

Inventors: Ivan DAMNJANOVIC, Gaurav MITTAL
Leveraging unsupervised meta-learning to boost few-shot action recognition

Patent number: 12087043

Abstract: The disclosure herein describes preparing and using a cross-attention model for action recognition using pre-trained encoders and novel class fine-tuning. Training video data is transformed into augmented training video segments, which are used to train an appearance encoder and an action encoder. The appearance encoder is trained to encode video segments based on spatial semantics and the action encoder is trained to encode video segments based on spatio-temporal semantics. A set of hard-mined training episodes are generated using the trained encoders. The cross-attention module is then trained for action-appearance aligned classification using the hard-mined training episodes. Then, support video segments are obtained, wherein each support video segment is associated with video classes. The cross-attention module is fine-tuned using the obtained support video segments and the associated video classes.

Type: Grant

Filed: November 24, 2021

Date of Patent: September 10, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Gaurav Mittal, Ye Yu, Mei Chen, Jay Sanjay Patravali
TRAINING AND USING A MODEL FOR CONTENT MODERATION OF MULTIMODAL MEDIA

Publication number: 20240290081

Abstract: A computerized method trains and uses a multimodal fusion transformer (MFT) model for content moderation. Language modality data and vision modality data associated with a multimodal media source is received. Language embeddings are generated from the language modality data and vision embeddings are generated from the vision modality data. Both kinds of embeddings are generated using operations and/or processes that are specific to the associated modalities. The language embeddings and vision embeddings are combined into combined embeddings and the MFT model is used with those combined embeddings to generate a language semantic output token, a vision semantic output token, and a combined semantic output token. Contrastive loss data is generated using the three semantic output tokens and the MFT model is adjusted using that contrastive loss data. After the MFT model is trained sufficiently, it is configured to perform content moderation operations using semantic output tokens.

Type: Application

Filed: February 28, 2023

Publication date: August 29, 2024

Inventors: Ye YU, Gaurav MITTAL, Matthew Brigham HALL, Sandra SAJEEV, Mei CHEN, Jialin YUAN
VIDEO FRAME ACTION DETECTION USING GATED HISTORY

Publication number: 20240244279

Abstract: Example solutions for video frame action detection use a gated history and include: receiving a video stream comprising a plurality of video frames; grouping the plurality of video frames into a set of present video frames and a set of historical video frames, the set of present video frames comprising a current video frame; determining a set of attention weights for the set of historical video frames, the set of attention weights indicating how informative a video frame is for predicting action in the current video frame; weighting the set of historical video frames with the set of attention weights to produce a set of weighted historical video frames; and based on at least the set of weighted historical video frames and the set of present video frames, generating an action prediction for the current video frame.

Type: Application

Filed: December 21, 2023

Publication date: July 18, 2024

Inventors: Gaurav MITTAL, Ye YU, Mei CHEN, Junwen CHEN
IMAGE GENERATION USING ADVERSARIAL ATTACKS FOR IMBALANCED DATASETS

Publication number: 20240211547

Abstract: A method of balancing a dataset for a machine learning model includes identifying confusing classes of few-shot classes for a machine learning model during validation. One of the confusing classes and an image from one of the few-shot classes are selected. An image perturbation is computed such that the selected image is classified as the selected confusing class. The selected image is modified with the computed perturbation. The modified selected image is added to a batch for training the machine learning model.

Type: Application

Filed: March 7, 2024

Publication date: June 27, 2024

Inventors: Gaurav MITTAL, Nikolaos KARIANAKIS, Victor Manuel FRAGOSO ROJAS, Mei CHEN, Jedrzej Jakub KOZERAWSKI
Image generation using adversarial attacks for imbalanced datasets

Patent number: 11960574

Abstract: A method of balancing a dataset for a machine learning model includes identifying confusing classes of few-shot classes for a machine learning model during validation. One of the confusing classes and an image from one of the few-shot classes are selected. An image perturbation is computed such that the selected image is classified as the selected confusing class. The selected image is modified with the computed perturbation. The modified selected image is added to a batch for training the machine learning model.

Type: Grant

Filed: June 28, 2021

Date of Patent: April 16, 2024

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Gaurav Mittal, Nikolaos Karianakis, Victor Manuel Fragoso Rojas, Mei Chen, Jedrzej Jakub Kozerawski
Video frame action detection using gated history

Patent number: 11895343

Abstract: Example solutions for video frame action detection use a gated history and include: receiving a video stream comprising a plurality of video frames; grouping the plurality of video frames into a set of present video frames and a set of historical video frames, the set of present video frames comprising a current video frame; determining a set of attention weights for the set of historical video frames, the set of attention weights indicating how informative a video frame is for predicting action in the current video frame; weighting the set of historical video frames with the set of attention weights to produce a set of weighted historical video frames; and based on at least the set of weighted historical video frames and the set of present video frames, generating an action prediction for the current video frame.

Type: Grant

Filed: June 28, 2022

Date of Patent: February 6, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Gaurav Mittal, Ye Yu, Mei Chen, Junwen Chen
BILATERAL ATTENTION TRANSFORMER IN MOTION-APPEARANCE NEIGHBORING SPACE FOR VIDEO OBJECT SEGMENTATION

Publication number: 20240020854

Abstract: Example solutions for video object segmentation (VOS) use a bilateral attention transformer in motion-appearance neighboring space, and perform a process that includes: receiving a video stream comprising a plurality of video frames in a sequence; receiving a first object mask for an initial video frame of the plurality of video frames; selecting a video frame of the plurality of video frames as a current query frame, the current query frame following, in the sequence, a reference frame of a reference frame set, wherein each reference frame has a corresponding object mask; using the current query frame and a video frame in the reference frame set, determining a bilateral attention; and using the bilateral attention, generating an object mask for the current query frame.

Type: Application

Filed: September 14, 2022

Publication date: January 18, 2024

Inventors: Ye YU, Gaurav MITTAL, Mei CHEN, Jialin YUAN
Adaptive video consumption

Patent number: 11877019

Abstract: A video streaming client is configured to check whether a target version of a desired video content is available for streaming from a video streaming server, the target version being encoded to a target value of an encoding attribute. The video streaming client obtains a data communication speed to the video streaming server, and determines that the data communication speed is sufficient to stream and display the target version of the desired video content. The target value is less than a maximum value of the encoding attribute which is decodable by the video streaming client. The video streaming client is configured to select to stream the target version of the desired video content even though the data communication speed is sufficient to stream a version of the desired video content without playback interruption when encoded using a value of the encoding attribute which is higher than the target value.

Type: Grant

Filed: November 16, 2020

Date of Patent: January 16, 2024

Inventor: Gaurav Mittal
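Illustrative note: the client-side choice described in the abstract above can be sketched as below. The sketch assumes the encoding attribute is a bitrate in kbps and that a version needs a link at least as fast as its bitrate; the fallback branch for when the target cannot be streamed is not described in the abstract and is purely hypothetical.

```python
from typing import Sequence


def choose_version(available_kbps: Sequence[int], target_kbps: int,
                   link_kbps: int) -> int:
    """Pick the version to stream, preferring the target over the maximum."""
    if target_kbps in available_kbps and link_kbps >= target_kbps:
        # Stream the target even if the link could carry a higher version.
        return target_kbps
    # Hypothetical fallback: the best version the link can carry,
    # or the lowest version if nothing fits.
    fitting = [v for v in available_kbps if v <= link_kbps]
    return max(fitting) if fitting else min(available_kbps)


versions = [1000, 3000, 8000]
# The link could carry the 8000 kbps version, but the target is 3000 kbps.
chosen = choose_version(versions, target_kbps=3000, link_kbps=10000)
```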
Apparatuses, methods, computer programs and computer-readable media

Patent number: 11856210

Abstract: A set of reconstruction elements useable to reconstruct a representation of a signal at a relatively high level of quality using data based on a representation of the signal at a relatively low level of quality is obtained. The representation at the relatively high level of quality is arranged as an array comprising at least first and second rows of signal elements. A reconstruction element is associated with a respective signal element in the set. A set of data elements is derived based on the set of reconstruction elements. At least one of the data elements is derived from at least two reconstruction elements associated with signal elements from the first row and a different number of reconstruction elements associated with signal elements from the second row.

Type: Grant

Filed: June 9, 2021

Date of Patent: December 26, 2023

Inventors: Ivan Damnjanovic, Gaurav Mittal
VIDEO FRAME ACTION DETECTION USING GATED HISTORY

Publication number: 20230396817

Abstract: Example solutions for video frame action detection use a gated history and include: receiving a video stream comprising a plurality of video frames; grouping the plurality of video frames into a set of present video frames and a set of historical video frames, the set of present video frames comprising a current video frame; determining a set of attention weights for the set of historical video frames, the set of attention weights indicating how informative a video frame is for predicting action in the current video frame; weighting the set of historical video frames with the set of attention weights to produce a set of weighted historical video frames; and based on at least the set of weighted historical video frames and the set of present video frames, generating an action prediction for the current video frame.

Type: Application

Filed: June 28, 2022

Publication date: December 7, 2023

Inventors: Gaurav MITTAL, Ye YU, Mei CHEN, Junwen CHEN
Decoder devices, methods and computer programs

Publication number: 20230336755

Abstract: A medical telepresence system comprising: an interface to receive a plurality of data feeds from a live medical procedure, at least one data feed comprising a video signal capturing the live medical procedure; a hierarchical encoder to encode the plurality of data feeds using a first tier-based hierarchical data coding scheme, wherein encoded data from the hierarchical encoder is decodable by a first set of computing devices for viewing, the first set of computing devices being communicatively coupled to the hierarchical encoder using a first network connection; a transcoder to convert from the first tier-based hierarchical data coding scheme to a second tier-based hierarchical data coding scheme, wherein encoded data from the transcoder is receivable by a second set of computing devices for viewing, the second set of computing devices being communicatively coupled to the transcoder using a second network connection, the second network connection being of a lower quality than the first network connection; and

Type: Application

Filed: January 30, 2023

Publication date: October 19, 2023

Inventors: Guido MEARDI, Simone FERRARA, Gaurav MITTAL
ADAPTIVE VIDEO QUALITY

Publication number: 20230156204

Abstract: A method for encoding a first stream of video data comprising a plurality of frames of video, the method, for one or more of the plurality of frames of video, comprising the steps of: encoding in a hierarchical arrangement a frame of the video data, the hierarchical arrangement comprising a base layer of video data and a first enhancement layer of video data, said first enhancement layer of video data comprising a plurality of sub-layers of enhancement data, such that when encoded: the base layer of video data comprises data which when decoded renders the frame at a first, base, level of quality; and each sub-layer of enhancement data comprises data which, when decoded with the base layer, render the frame at a higher level of quality than the base level of quality; and wherein the steps of encoding the sub-layers of enhancement data comprises: quantizing the enhancement data at a determined initial level of quantization thereby creating a set of quantized enhancement data; associating to each of the pluralit

Type: Application

Filed: January 20, 2023

Publication date: May 18, 2023

Inventor: Gaurav MITTAL
