Patents by Inventor Aniket BERA

Aniket BERA has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250014144
    Abstract: A neural operator-based architecture for performing synthetic frame generation is introduced. The architecture leverages the principles of physics to learn the features in the frames, independent of input resolution, through token mixing and global convolution in the Fourier spectral domain using the Fast Fourier Transform (FFT). The architecture overcomes a common limitation of models that use convolutional layers, namely variance to scale, and makes the model resolution-independent. This approach is particularly relevant where hardware and resource limitations prevent the capture of high-frame-rate videos.
    Type: Application
    Filed: June 28, 2024
    Publication date: January 9, 2025
    Inventors: Aniket Bera, Rashmi Bhaskara, Md Ashiqur Rahman, Hrishikesh Viswanath
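    The spectral-domain mixing described in this abstract is easiest to see in code. Below is a minimal sketch of one such layer, in the style of a Fourier neural operator: channel mixing is applied to a fixed set of low-frequency modes after an FFT, which is what decouples the layer from the input resolution. The class name, channel count, and number of retained modes are illustrative assumptions, not details taken from the filing.

      import torch
      import torch.fft

      class SpectralMixing2d(torch.nn.Module):
          """Sketch: token mixing as a global convolution in the Fourier domain."""
          def __init__(self, channels: int, modes: int):
              super().__init__()
              self.modes = modes  # number of low-frequency Fourier modes retained
              scale = 1.0 / (channels * channels)
              # Learned complex weights: one channel-mixing matrix per retained mode.
              self.weight = torch.nn.Parameter(
                  scale * torch.randn(channels, channels, modes, modes, dtype=torch.cfloat)
              )

          def forward(self, x: torch.Tensor) -> torch.Tensor:
              # x: (batch, channels, height, width) frame features
              b, c, h, w = x.shape
              x_ft = torch.fft.rfft2(x)  # FFT over the two spatial dimensions
              out_ft = torch.zeros(b, c, h, w // 2 + 1, dtype=torch.cfloat, device=x.device)
              m = self.modes
              # Mix channels only on the retained modes; the mixing is defined per
              # frequency mode, so it does not depend on the input resolution.
              out_ft[:, :, :m, :m] = torch.einsum(
                  "bixy,ioxy->boxy", x_ft[:, :, :m, :m], self.weight)
              return torch.fft.irfft2(out_ft, s=(h, w))

      # The same layer, with the same weights, runs at two different resolutions:
      layer = SpectralMixing2d(channels=8, modes=12)
      print(layer(torch.randn(1, 8, 64, 64)).shape)    # torch.Size([1, 8, 64, 64])
      print(layer(torch.randn(1, 8, 128, 128)).shape)  # torch.Size([1, 8, 128, 128])
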
  • Publication number: 20250005925
    Abstract: A novel multi-modal audio-video framework is disclosed. The framework advantageously leverages both audio and video information to accurately detect whether the video has been manipulated, e.g., detect whether the video is a so-called ‘deepfake.’ In a video-only pipeline, the framework adopts a vision encoder having a feature extractor and a Transformer encoder that leverages self-attention mechanisms to detect artifacts in a facial region of the video. Additionally, in a separate audio-video pipeline, the framework adopts an audio+lip encoder having a Transformer encoder that leverages cross-attention mechanisms to identify discrepancies between lip movements of the person and words spoken by the person in the video. These two modalities are used jointly to make an inference as to whether the video has been manipulated.
    Type: Application
    Filed: April 24, 2024
    Publication date: January 2, 2025
    Applicant: Purdue Research Foundation
    Inventors: Aniket Bera, Aaditya Kharel, Manas Aniruddha Paranjape
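    As a rough illustration of the two pipelines in this abstract, the sketch below pairs a self-attention video encoder with a cross-attention audio/lip encoder and concatenates their summaries for a real/fake decision. All module names, dimensions, and the pooling and fusion choices are assumptions; the filing does not specify them at this level.

      import torch
      import torch.nn as nn

      class TwoStreamDeepfakeDetector(nn.Module):
          def __init__(self, dim: int = 128):
              super().__init__()
              # Video-only pipeline: self-attention over per-frame facial features.
              enc_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
              self.video_encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
              # Audio-video pipeline: lip features attend to audio features
              # (cross-attention) to surface lip/speech mismatches.
              self.cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
              self.classifier = nn.Linear(2 * dim, 2)  # [real, fake] logits

          def forward(self, face_feats, lip_feats, audio_feats):
              # All inputs: (batch, time, dim) features from upstream extractors.
              v = self.video_encoder(face_feats).mean(dim=1)             # video summary
              a, _ = self.cross_attn(lip_feats, audio_feats, audio_feats)
              a = a.mean(dim=1)                                          # audio-lip summary
              return self.classifier(torch.cat([v, a], dim=-1))          # joint decision

      model = TwoStreamDeepfakeDetector()
      logits = model(torch.randn(2, 16, 128), torch.randn(2, 16, 128),
                     torch.randn(2, 40, 128))
      print(logits.shape)  # torch.Size([2, 2])
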
  • Patent number: 11861940
    Abstract: Systems, methods, apparatuses, and computer program products for recognizing human emotion in images or video. A method for recognizing perceived human emotion may include receiving a raw input. The raw input may be processed to generate input data corresponding to at least one context. Features from the raw input data may be extracted to obtain a plurality of feature vectors and inputs. The plurality of feature vectors and the inputs may be transmitted to a respective neural network. At least some of the plurality of feature vectors may be fused to obtain a feature encoding. Additional feature encodings may be computed from the plurality of feature vectors via the respective neural network. A multi-label emotion classification of a primary agent in the raw input may be performed based on the feature encoding and the additional feature encodings.
    Type: Grant
    Filed: June 16, 2021
    Date of Patent: January 2, 2024
    Assignee: UNIVERSITY OF MARYLAND, COLLEGE PARK
    Inventors: Trisha Mittal, Aniket Bera, Uttaran Bhattacharya, Rohan Chandra, Pooja Guhan, Dinesh Manocha
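    A hedged sketch of the fusion step this abstract describes: per-context feature vectors pass through separate encoders, the encodings are fused, and a multi-label head scores each emotion independently (sigmoid rather than softmax, since the labels are not mutually exclusive). The three contexts, layer sizes, and label count are placeholders, not values from the patent.

      import torch
      import torch.nn as nn

      class ContextEmotionClassifier(nn.Module):
          def __init__(self, feat_dim: int = 64, hidden: int = 128, num_labels: int = 6):
              super().__init__()
              # One encoder per context stream (e.g., the agent, the background,
              # and surrounding agents -- illustrative choices only).
              self.encoders = nn.ModuleList(
                  nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU()) for _ in range(3)
              )
              self.head = nn.Linear(3 * hidden, num_labels)

          def forward(self, context_feats):
              # context_feats: list of three (batch, feat_dim) feature vectors
              encodings = [enc(f) for enc, f in zip(self.encoders, context_feats)]
              fused = torch.cat(encodings, dim=-1)        # fused feature encoding
              return torch.sigmoid(self.head(fused))      # independent per-label scores

      model = ContextEmotionClassifier()
      scores = model([torch.randn(4, 64) for _ in range(3)])
      print(scores.shape)  # torch.Size([4, 6]) -- one probability per emotion label
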
  • Patent number: 11830291
    Abstract: Systems, methods, apparatuses, and computer program products for providing multimodal emotion recognition. The method may include receiving raw input from an input source. The method may also include extracting one or more feature vectors from the raw input. The method may further include determining an effectiveness of the one or more feature vectors. Further, the method may include performing, based on the determination, multiplicative fusion processing on the one or more feature vectors. The method may also include predicting, based on results of the multiplicative fusion processing, one or more emotions of the input source.
    Type: Grant
    Filed: February 10, 2021
    Date of Patent: November 28, 2023
    Assignee: UNIVERSITY OF MARYLAND, COLLEGE PARK
    Inventors: Trisha Mittal, Aniket Bera, Uttaran Bhattacharya, Rohan Chandra, Dinesh Manocha
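    The core of this abstract is multiplicative fusion. A minimal sketch of the generic idea follows: per-modality class probabilities are combined by an element-wise product, so a modality that assigns low confidence to a class suppresses that class in the joint prediction. The exact weighting and the effectiveness check described in the patent may differ; this shows only the multiplicative combination.

      import torch

      def multiplicative_fusion(per_modality_probs: torch.Tensor) -> torch.Tensor:
          # per_modality_probs: (num_modalities, batch, num_classes) probabilities
          fused = torch.prod(per_modality_probs, dim=0)    # element-wise product
          return fused / fused.sum(dim=-1, keepdim=True)   # renormalize per sample

      # Stand-in per-modality emotion scores (e.g., face, speech, text features):
      face  = torch.softmax(torch.randn(2, 4), dim=-1)
      voice = torch.softmax(torch.randn(2, 4), dim=-1)
      text  = torch.softmax(torch.randn(2, 4), dim=-1)
      print(multiplicative_fusion(torch.stack([face, voice, text])))
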
  • Publication number: 20230135769
    Abstract: Systems and methods of the present invention for gesture generation include: receiving a sequence of one or more word embeddings, one or more attributes, and a first emotive gesture of a virtual agent; providing the sequence of one or more word embeddings and the one or more attributes to a gesture generation machine learning model; and receiving a second emotive gesture of the virtual agent from the gesture generation machine learning model. The gesture generation machine learning model is configured to: produce, via an encoder, an output based on the one or more word embeddings; generate one or more encoded features based on the output and the one or more attributes; and produce, via a decoder, the second emotive gesture based on the one or more encoded features and the first (preceding) emotive gesture. Other aspects, embodiments, and features are also claimed and described.
    Type: Application
    Filed: October 31, 2022
    Publication date: May 4, 2023
    Inventors: Uttaran BHATTACHARYA, Aniket BERA, Dinesh MANOCHA, Abhishek BANERJEE, Pooja GUHAN, Nicholas REWKOWSKI
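    The encode-condition-decode loop in this abstract can be sketched as follows: word embeddings are encoded, attributes are appended to the encoding, and a decoder predicts the next gesture pose conditioned on the preceding one. The GRU encoder, linear decoder, and all dimensions are assumptions chosen for brevity, not the patent's architecture.

      import torch
      import torch.nn as nn

      class GestureGenerator(nn.Module):
          def __init__(self, word_dim: int = 50, attr_dim: int = 8,
                       pose_dim: int = 30, hidden: int = 64):
              super().__init__()
              self.encoder = nn.GRU(word_dim, hidden, batch_first=True)
              self.decoder = nn.Linear(hidden + attr_dim + pose_dim, pose_dim)

          def forward(self, word_embs, attrs, prev_gesture):
              # word_embs: (batch, seq, word_dim) sentence embeddings
              # attrs: (batch, attr_dim) attributes (e.g., intended emotion)
              # prev_gesture: (batch, pose_dim) pose of the preceding gesture
              _, h = self.encoder(word_embs)               # encode the utterance
              feats = torch.cat([h[-1], attrs, prev_gesture], dim=-1)  # encoded features
              return self.decoder(feats)                   # next emotive gesture pose

      gen = GestureGenerator()
      next_pose = gen(torch.randn(2, 12, 50), torch.randn(2, 8), torch.randn(2, 30))
      print(next_pose.shape)  # torch.Size([2, 30])
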
  • Publication number: 20220138472
    Abstract: A video is classified as real or fake by extracting facial features, including facial modalities and facial emotions, and speech features, including speech modalities and speech emotions, from the video. The facial and speech modalities are passed through first and second neural networks, respectively, to generate facial and speech modality embeddings. The facial and speech emotions are passed through third and fourth neural networks, respectively, to generate facial and speech emotion embeddings. A first distance, d1, between the facial modality embedding and the speech modality embedding is generated, together with a second distance, d2, between the facial emotion embedding and the speech emotion embedding. The video is classified as fake if a sum of the first distance and the second distance exceeds a threshold distance. The networks may be trained using real and fake video pairs for multiple subjects.
    Type: Application
    Filed: November 1, 2021
    Publication date: May 5, 2022
    Inventors: Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha
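    The classification rule in this abstract reduces to a simple distance test once the four embeddings are available. A small sketch follows, assuming Euclidean distance and an arbitrary threshold; the patent leaves the metric and the threshold value to the trained system.

      import torch

      def classify_video(face_mod, speech_mod, face_emo, speech_emo,
                         threshold: float = 1.0) -> str:
          d1 = torch.norm(face_mod - speech_mod)  # facial vs. speech modality embeddings
          d2 = torch.norm(face_emo - speech_emo)  # facial vs. speech emotion embeddings
          return "fake" if (d1 + d2).item() > threshold else "real"

      emb = lambda: torch.randn(32)  # stand-ins for the four network outputs
      print(classify_video(emb(), emb(), emb(), emb()))
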
  • Publication number: 20210390288
    Abstract: Systems, methods, apparatuses, and computer program products for recognizing human emotion in images or video. A method for recognizing perceived human emotion may include receiving a raw input. The raw input may be processed to generate input data corresponding to at least one context. Features from the raw input data may be extracted to obtain a plurality of feature vectors and inputs. The plurality of feature vectors and the inputs may be transmitted to a respective neural network. At least some of the plurality of feature vectors may be fused to obtain a feature encoding. Additional feature encodings may be computed from the plurality of feature vectors via the respective neural network. A multi-label emotion classification of a primary agent in the raw input may be performed based on the feature encoding and the additional feature encodings.
    Type: Application
    Filed: June 16, 2021
    Publication date: December 16, 2021
    Inventors: Trisha MITTAL, Aniket BERA, Uttaran BHATTACHARYA, Rohan CHANDRA, Pooja GUHAN, Dinesh MANOCHA
  • Publication number: 20210342656
    Abstract: Systems, methods, apparatuses, and computer program products for providing multimodal emotion recognition. The method may include receiving raw input from an input source. The method may also include extracting one or more feature vectors from the raw input. The method may further include determining an effectiveness of the one or more feature vectors. Further, the method may include performing, based on the determination, multiplicative fusion processing on the one or more feature vectors. The method may also include predicting, based on results of the multiplicative fusion processing, one or more emotions of the input source.
    Type: Application
    Filed: February 10, 2021
    Publication date: November 4, 2021
    Inventors: Trisha MITTAL, Aniket BERA, Uttaran BHATTACHARYA, Rohan CHANDRA, Dinesh MANOCHA