Patents by Inventor Dinesh Manocha

Dinesh Manocha has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11861940
    Abstract: Systems, methods, apparatuses, and computer program products for recognizing human emotion in images or video. A method for recognizing perceived human emotion may include receiving a raw input. The raw input may be processed to generate input data corresponding to at least one context. Features may be extracted from the raw input data to obtain a plurality of feature vectors and inputs. The plurality of feature vectors and the inputs may be transmitted to a respective neural network. At least some of the plurality of feature vectors may be fused to obtain a feature encoding. Additional feature encodings may be computed from the plurality of feature vectors via the respective neural network. A multi-label emotion classification may then be performed for a primary agent in the raw input, based on the feature encoding and the additional feature encodings.
    Type: Grant
    Filed: June 16, 2021
    Date of Patent: January 2, 2024
    Assignee: UNIVERSITY OF MARYLAND, COLLEGE PARK
    Inventors: Trisha Mittal, Aniket Bera, Uttaran Bhattacharya, Rohan Chandra, Pooja Guhan, Dinesh Manocha
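    A minimal sketch of the pipeline this abstract describes, assuming three context streams, concatenation as the fusion step, and a 26-way label set; every module name and dimension below is an illustrative placeholder, not the patented implementation:

    ```python
    import torch
    import torch.nn as nn

    class ContextEmotionClassifier(nn.Module):
        """Illustrative multi-stream emotion classifier: per-context encoders
        ("respective neural networks"), fusion of the encodings, and a
        multi-label (per-emotion sigmoid) head. Dimensions are placeholders."""

        def __init__(self, dims=(512, 256, 128), n_labels=26):
            super().__init__()
            # One encoder per context stream's feature vector.
            self.encoders = nn.ModuleList(nn.Linear(d, 64) for d in dims)
            self.head = nn.Linear(64 * len(dims), n_labels)

        def forward(self, streams):
            # Encode each context's feature vector independently.
            enc = [torch.relu(e(x)) for e, x in zip(self.encoders, streams)]
            # Fuse the encodings (here: by concatenation) into one feature code.
            fused = torch.cat(enc, dim=-1)
            # Multi-label classification: independent sigmoid per emotion.
            return torch.sigmoid(self.head(fused))

    model = ContextEmotionClassifier()
    streams = [torch.randn(1, d) for d in (512, 256, 128)]  # stand-in features
    probs = model(streams)  # shape (1, 26): per-label probabilities
    ```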
  • Patent number: 11830291
    Abstract: Systems, methods, apparatuses, and computer program products for providing multimodal emotion recognition. The method may include receiving raw input from an input source. The method may also include extracting one or more feature vectors from the raw input. The method may further include determining an effectiveness of the one or more feature vectors. Further, the method may include performing, based on the determination, multiplicative fusion processing on the one or more feature vectors. The method may also include predicting, based on results of the multiplicative fusion processing, one or more emotions of the input source.
    Type: Grant
    Filed: February 10, 2021
    Date of Patent: November 28, 2023
    Assignee: UNIVERSITY OF MARYLAND, COLLEGE PARK
    Inventors: Trisha Mittal, Aniket Bera, Uttaran Bhattacharya, Rohan Chandra, Dinesh Manocha
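    One way to read the multiplicative fusion step is as a product-of-experts combination in which a modality judged ineffective is softened toward a uniform distribution so it cannot veto the others; the sketch below takes that reading, and its weighting scheme and numbers are illustrative assumptions, not the patented method:

    ```python
    import numpy as np

    def multiplicative_fusion(per_modality_probs, effectiveness):
        """Combine per-modality class probabilities multiplicatively.

        per_modality_probs: (M, C) array, each row a class distribution.
        effectiveness: (M,) weights in [0, 1]; an ineffective modality
        (weight near 0) is flattened toward uniform before the product.
        """
        M, C = per_modality_probs.shape
        uniform = np.full(C, 1.0 / C)
        fused = np.ones(C)
        for p, w in zip(per_modality_probs, effectiveness):
            # Interpolate toward uniform when the modality looks unreliable.
            fused *= w * p + (1.0 - w) * uniform
        return fused / fused.sum()  # renormalize the product of experts

    face  = np.array([0.7, 0.2, 0.1])
    voice = np.array([0.3, 0.4, 0.3])   # noisy, judged less effective
    print(multiplicative_fusion(np.stack([face, voice]), np.array([1.0, 0.4])))
    ```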
  • Publication number: 20230135769
    Abstract: Systems and methods of the present invention for gesture generation include: receiving a sequence of one or more word embeddings, one or more attributes, and a gesture generation machine learning model; providing the sequence of one or more word embeddings and the one or more attributes to the gesture generation machine learning model; and providing a second emotive gesture of the virtual agent from the gesture generation machine learning model. The gesture generation machine learning model is configured to: produce, via an encoder, an output based on the one or more word embeddings; generate one or more encoded features based on the output and the one or more attributes; and produce, via a decoder, an emotive gesture based on the one or more encoded features and the preceding emotive gesture. Other aspects, embodiments, and features are also claimed and described.
    Type: Application
    Filed: October 31, 2022
    Publication date: May 4, 2023
    Inventors: Uttaran BHATTACHARYA, Aniket BERA, Dinesh MANOCHA, Abhishek BANERJEE, Pooja GUHAN, Nicholas REWKOWSKI
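    A minimal encoder-decoder sketch matching the structure the abstract outlines (encode word embeddings, condition on attributes, decode each emotive gesture from the preceding one); the layer choices, sizes, and 69-dimensional pose vector are assumptions:

    ```python
    import torch
    import torch.nn as nn

    class GestureGenerator(nn.Module):
        """Illustrative encoder-decoder: encode a word-embedding sequence,
        condition on attributes (e.g., an emotion label), and decode the next
        pose from the preceding pose. Sizes are placeholders."""

        def __init__(self, emb=300, attr=8, hid=128, pose=69):
            super().__init__()
            self.encoder = nn.GRU(emb, hid, batch_first=True)
            self.decoder = nn.GRUCell(pose, hid)
            self.cond = nn.Linear(hid + attr, hid)   # mix text code + attributes
            self.out = nn.Linear(hid, pose)

        def forward(self, words, attrs, first_pose, steps=30):
            _, h = self.encoder(words)                 # encoder output state
            h = torch.tanh(self.cond(torch.cat([h[-1], attrs], dim=-1)))
            pose, poses = first_pose, []
            for _ in range(steps):                     # autoregressive decoding
                h = self.decoder(pose, h)
                pose = self.out(h)                     # next emotive gesture pose
                poses.append(pose)
            return torch.stack(poses, dim=1)           # (batch, steps, pose)

    g = GestureGenerator()
    seq = g(torch.randn(1, 12, 300), torch.randn(1, 8), torch.zeros(1, 69))
    ```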
  • Publication number: 20220343917
    Abstract: Methods and systems for far-field speech recognition are disclosed. The methods and systems include receiving multiple noisy speech samples at a target scene; generating multiple labeled vectors and multiple intermediate samples based on the multiple labeled vectors; and determining multiple pair-wise distances between each intermediate sample and each vector of a full set of acoustic impulse responses (AIRs). In some instances, such methods may further include selecting a subset of the full set of AIRs based on the multiple pair-wise distances; and training a deep learning model based on the subset of the full set of AIRs. In other instances, such methods may further include obtaining a deep-learning model trained with a dataset having similar acoustic characteristics to the noisy speech samples; and performing speech recognition of the noisy speech samples based on the trained deep-learning model. Other aspects, embodiments, and features are also claimed and described.
    Type: Application
    Filed: April 18, 2022
    Publication date: October 27, 2022
    Inventors: Zhenyu TANG, Dinesh MANOCHA
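    A small sketch of the subset-selection step, assuming each AIR and each intermediate sample is summarized by a short acoustic feature vector (a stand-in descriptor) and that a mean pair-wise distance serves as the selection score; both assumptions are illustrative:

    ```python
    import numpy as np

    def select_air_subset(intermediate, airs, k=100):
        """Pick the k acoustic impulse responses (AIRs) whose feature vectors
        lie closest to the target scene's intermediate samples.

        intermediate: (n, d) features from noisy target-scene samples.
        airs: (m, d) features of the full AIR library (d is a stand-in
        descriptor, e.g., reverberation statistics).
        """
        # Pair-wise Euclidean distances, shape (n, m).
        d = np.linalg.norm(intermediate[:, None, :] - airs[None, :, :], axis=-1)
        # Score each AIR by its mean distance to the target samples ...
        score = d.mean(axis=0)
        # ... and keep the k best-matching AIRs for model training.
        return np.argsort(score)[:k]

    subset = select_air_subset(np.random.rand(20, 4), np.random.rand(5000, 4))
    ```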
  • Publication number: 20220138472
    Abstract: A video is classified as real or fake by extracting facial features, including facial modalities and facial emotions, and speech features, including speech modalities and speech emotions, from the video. The facial and speech modalities are passed through first and second neural networks, respectively, to generate facial and speech modality embeddings. The facial and speech emotions are passed through third and fourth neural networks, respectively, to generate facial and speech emotion embeddings. A first distance, d1, between the facial modality embedding and the speech modality embedding is generated, together with a second distance, d2, between the facial emotion embedding and the speech emotion embedding. The video is classified as fake if a sum of the first distance and the second distance exceeds a threshold distance. The networks may be trained using real and fake video pairs for multiple subjects.
    Type: Application
    Filed: November 1, 2021
    Publication date: May 5, 2022
    Inventors: Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha
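    A compact sketch of the decision rule, assuming cosine distance for both embedding comparisons and an arbitrary threshold tau; the four networks that produce the embeddings are out of scope here:

    ```python
    import torch
    import torch.nn.functional as F

    def classify_video(face_mod, speech_mod, face_emo, speech_emo, tau=1.2):
        """Illustrative fake/real decision from the four embeddings the
        abstract names. In an unmanipulated video, facial and speech cues
        should agree, so both distances stay small; tau is a tuned
        threshold, chosen arbitrarily here."""
        d1 = 1.0 - F.cosine_similarity(face_mod, speech_mod, dim=-1)  # modality gap
        d2 = 1.0 - F.cosine_similarity(face_emo, speech_emo, dim=-1)  # emotion gap
        return torch.where(d1 + d2 > tau,  # large combined mismatch => fake
                           torch.tensor(1), torch.tensor(0))  # 1 = fake, 0 = real

    embs = [torch.randn(1, 128) for _ in range(4)]
    print(classify_video(*embs))
    ```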
  • Publication number: 20210390288
    Abstract: Systems, methods, apparatuses, and computer program products for recognizing human emotion in images or video. A method for recognizing perceived human emotion may include receiving a raw input. The raw input may be processed to generate input data corresponding to at least one context. Features may be extracted from the raw input data to obtain a plurality of feature vectors and inputs. The plurality of feature vectors and the inputs may be transmitted to a respective neural network. At least some of the plurality of feature vectors may be fused to obtain a feature encoding. Additional feature encodings may be computed from the plurality of feature vectors via the respective neural network. A multi-label emotion classification may then be performed for a primary agent in the raw input, based on the feature encoding and the additional feature encodings.
    Type: Application
    Filed: June 16, 2021
    Publication date: December 16, 2021
    Inventors: Trisha MITTAL, Aniket BERA, Uttaran BHATTACHARYA, Rohan CHANDRA, Pooja GUHAN, Dinesh MANOCHA
  • Publication number: 20210342656
    Abstract: Systems, methods, apparatuses, and computer program products for providing multimodal emotion recognition. The method may include receiving raw input from an input source. The method may also include extracting one or more feature vectors from the raw input. The method may further include determining an effectiveness of the one or more feature vectors. Further, the method may include performing, based on the determination, multiplicative fusion processing on the one or more feature vectors. The method may also include predicting, based on results of the multiplicative fusion processing, one or more emotions of the input source.
    Type: Application
    Filed: February 10, 2021
    Publication date: November 4, 2021
    Inventors: Trisha MITTAL, Aniket BERA, Uttaran BHATTACHARYA, Rohan CHANDRA, Dinesh MANOCHA
  • Patent number: 10679407
    Abstract: Methods, systems, and computer readable media for simulating sound propagation are disclosed. According to one method, the method includes decomposing a virtual environment scene including at least one object into a plurality of surface regions, wherein each of the surface regions includes a plurality of surface patches. The method further includes organizing sound rays generated by a sound source in the virtual environment scene into a plurality of path tracing groups, wherein each of the path tracing groups comprises a group of the rays that traverses a sequence of surface patches. The method also includes determining, for each of the path tracing groups, a sound intensity by combining a sound intensity computed for a current time with one or more previously computed sound intensities respectively associated with previous times and generating a simulated output sound at a listener position using the determined sound intensities.
    Type: Grant
    Filed: June 29, 2015
    Date of Patent: June 9, 2020
    Assignee: The University of North Carolina at Chapel Hill
    Inventors: Carl Henry Schissler, Ravish Mehra, Dinesh Manocha
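    The temporal-combination step can be pictured as an exponential moving average over per-frame intensity estimates for each path group, as in this sketch (the smoothing constant and the dictionary keyed by patch sequences are illustrative assumptions):

    ```python
    import numpy as np

    def update_group_intensity(prev, current, alpha=0.9):
        """Temporally smooth a path-tracing group's sound intensity by
        blending the current frame's Monte Carlo estimate with previously
        computed intensities, which reduces noise when few rays are traced
        per frame. alpha is an illustrative smoothing constant."""
        return alpha * prev + (1.0 - alpha) * current

    # Each group = rays that traversed the same sequence of surface patches.
    groups = {("S", "patch3", "patch7", "L"): 0.0}   # intensity per group
    for frame in range(100):
        noisy = np.random.rand()                      # stand-in per-frame estimate
        for path in groups:
            groups[path] = update_group_intensity(groups[path], noisy)
    # The output sound at the listener sums the groups' smoothed intensities.
    output = sum(groups.values())
    ```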
  • Patent number: 10248744
    Abstract: Methods, systems, and computer readable media for acoustic classification and optimization for multi-modal rendering of real-world scenes are disclosed. According to one method for determining acoustic material properties associated with a real-world scene, the method comprises obtaining an acoustic response in a real-world scene. The method also includes generating a three-dimensional (3D) virtual model of the real-world scene. The method further includes determining acoustic material properties of surfaces in the 3D virtual model using a visual material classification algorithm to identify materials in the real-world scene that make up the surfaces and known acoustic material properties of the materials. The method also includes using the acoustic response in the real-world scene to adjust the acoustic material properties.
    Type: Grant
    Filed: February 16, 2017
    Date of Patent: April 2, 2019
    Assignee: THE UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL
    Inventors: Carl Henry Schissler, Dinesh Manocha
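    A toy sketch of the assign-then-adjust idea: look up known absorption coefficients for the visually classified materials, then nudge them until a simulated response matches the measured one. The lookup table, update rule, and Sabine-like stand-in simulator are all assumptions:

    ```python
    import numpy as np

    # Known absorption coefficients per material (illustrative values only).
    ABSORPTION = {"brick": 0.03, "carpet": 0.30, "glass": 0.05, "wood": 0.10}

    def assign_and_adjust(surface_labels, measured_rt60, simulate_rt60):
        """Assign absorption from visually classified labels, then scale the
        coefficients so the simulated reverberation matches a measured
        acoustic response. simulate_rt60 is a stand-in for a full sound
        propagation simulation of the 3D virtual model."""
        alpha = np.array([ABSORPTION[m] for m in surface_labels])
        for _ in range(20):
            err = simulate_rt60(alpha) - measured_rt60
            alpha *= 1.0 + 0.1 * np.sign(err)      # crude multiplicative update
            alpha = np.clip(alpha, 0.01, 0.99)     # keep coefficients physical
        return alpha

    # Toy stand-in: RT60 inversely proportional to mean absorption (Sabine-like).
    rt60 = lambda a: 0.161 / a.mean()
    print(assign_and_adjust(["brick", "carpet", "wood"], 1.0, rt60))
    ```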
  • Publication number: 20180232471
    Abstract: Methods, systems, and computer readable media for acoustic classification and optimization for multi-modal rendering of real-world scenes are disclosed. According to one method for determining acoustic material properties associated with a real-world scene, the method comprises obtaining an acoustic response in a real-world scene. The method also includes generating a three-dimensional (3D) virtual model of the real-world scene. The method further includes determining acoustic material properties of surfaces in the 3D virtual model using a visual material classification algorithm to identify materials in the real-world scene that make up the surfaces and known acoustic material properties of the materials. The method also includes using the acoustic response in the real-world scene to adjust the acoustic material properties.
    Type: Application
    Filed: February 16, 2017
    Publication date: August 16, 2018
    Inventors: Carl Henry Schissler, Dinesh Manocha
  • Patent number: 9977644
    Abstract: Methods, systems, and computer readable media for conducting interactive sound propagation and rendering for a plurality of sound sources in a virtual environment scene are disclosed. According to one method, the method includes decomposing a virtual environment scene containing a plurality of sound sources into a plurality of partitions and forming a plurality of source group clusters, wherein each of the source group clusters includes two or more of the sound sources located within a common partition. The method further includes determining, for each of the source group clusters, a single set of sound propagation paths relative to a listener position and generating a simulated output sound at a listener position using sound intensities associated with the determined sets of sound propagation paths.
    Type: Grant
    Filed: July 29, 2015
    Date of Patent: May 22, 2018
    Assignee: The University of North Carolina at Chapel Hill
    Inventors: Carl Henry Schissler, Dinesh Manocha
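    A short sketch of the clustering step, using a uniform grid cell as a stand-in for the patent's scene partitions; sources sharing a cell form one cluster, and propagation paths are traced once per cluster instead of once per source:

    ```python
    import numpy as np

    def cluster_sources(source_positions, cell=10.0):
        """Group sound sources by the spatial partition (here: a uniform
        grid cell, a stand-in for the patent's partitions) they fall in."""
        clusters = {}
        for i, p in enumerate(source_positions):
            key = tuple((p // cell).astype(int))      # partition id
            clusters.setdefault(key, []).append(i)
        return clusters

    pos = np.random.rand(50, 3) * 40.0
    clusters = cluster_sources(pos)
    # Trace one set of paths from each cluster's centroid to the listener,
    # then reuse that set for every source in the cluster.
    for cell_id, members in clusters.items():
        centroid = pos[members].mean(axis=0)
    ```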
  • Patent number: 9940922
    Abstract: Methods, systems, and computer readable media for utilizing ray-parameterized reverberation filters to facilitate interactive sound rendering are disclosed. According to one method, the method includes generating a sound propagation impulse response characterized by a predefined number of frequency bands and estimating a plurality of reverberation parameters for each of the predefined number of frequency bands of the impulse response. The method further includes utilizing the reverberation parameters to parameterize a plurality of reverberation filters in an artificial reverberator, rendering an audio output in a spherical harmonic (SH) domain that results from a mixing of a source audio and a reverberation signal that is produced from the artificial reverberator, and performing spatialization processing on the audio output.
    Type: Grant
    Filed: August 24, 2017
    Date of Patent: April 10, 2018
    Assignee: THE UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL
    Inventors: Carl Henry Schissler, Dinesh Manocha
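    Estimating a reverberation parameter per band can be done with Schroeder backward integration and a linear fit to the energy decay curve, a standard method sketched below (the fit range and single-band handling are assumptions; the patented parameter set may differ):

    ```python
    import numpy as np

    def estimate_rt60(ir, fs=48000):
        """Estimate a reverberation time from one frequency band of an
        impulse response via Schroeder backward integration and a linear
        fit to the energy decay curve."""
        edc = np.cumsum(ir[::-1] ** 2)[::-1]             # backward energy integral
        edc_db = 10.0 * np.log10(edc / edc[0] + 1e-12)
        # Fit the -5 dB to -35 dB portion of the decay, then extrapolate.
        t = np.arange(len(ir)) / fs
        sel = (edc_db < -5.0) & (edc_db > -35.0)
        slope, _ = np.polyfit(t[sel], edc_db[sel], 1)    # dB per second
        return -60.0 / slope                             # seconds to decay 60 dB

    # Synthetic exponentially decaying noise with ~1 s RT60 as a sanity check.
    fs, rt = 48000, 1.0
    n = np.arange(fs)
    ir = np.random.randn(fs) * 10.0 ** (-3.0 * n / (rt * fs))
    print(estimate_rt60(ir, fs))   # should print roughly 1.0
    ```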
  • Patent number: 9906884
    Abstract: Methods, systems, and computer readable media for utilizing adaptive rectangular decomposition (ARD) to perform head-related transfer function (HRTF) simulations are disclosed herein. According to one method, the method includes obtaining a mesh model representative of head and ear geometry of a listener entity and segmenting a simulation domain of the mesh model into a plurality of partitions. The method further includes conducting an ARD simulation on the plurality of partitions to generate simulated sound pressure signals within each of the plurality of partitions and processing the simulated sound pressure signals to generate at least one HRTF that is customized for the listener entity.
    Type: Grant
    Filed: August 1, 2016
    Date of Patent: February 27, 2018
    Assignee: The University of North Carolina at Chapel Hill
    Inventors: Alok Namdeo Meshram, Dinesh Manocha, Ravish Mehra, Enrique Dunn, Jan-Michael Frahm, Hongsheng Yang
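    The HRTF-generation step can be pictured as a frequency-domain ratio between the simulated ear pressure and the free-field pressure, as in this sketch (the signals are random stand-ins for ARD outputs from the meshed head model):

    ```python
    import numpy as np

    def hrtf_from_pressure(p_ear, p_free, fs=48000):
        """Turn simulated pressure signals into one HRTF: the frequency-
        domain ratio of the pressure recorded at the listener's ear to the
        free-field pressure at the head-center position, for one source
        direction."""
        H = np.fft.rfft(p_ear) / (np.fft.rfft(p_free) + 1e-12)
        freqs = np.fft.rfftfreq(len(p_ear), 1.0 / fs)
        return freqs, H

    # Stand-ins for ARD-simulated pressure signals.
    p_ear, p_free = np.random.randn(1024), np.random.randn(1024)
    freqs, H = hrtf_from_pressure(p_ear, p_free)
    mag_db = 20.0 * np.log10(np.abs(H) + 1e-12)   # HRTF magnitude response
    ```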
  • Patent number: 9824166
    Abstract: Methods, systems, and computer readable media for utilizing parallel adaptive rectangular decomposition (ARD) to perform acoustic simulations are disclosed herein. According to one method, the method includes assigning, to each of a plurality of processors in a central processing unit (CPU) cluster, ARD processing responsibilities associated with one or more of a plurality of partitions of an acoustic space and determining, by each processor, pressure field data corresponding to the one or more assigned partitions. The method further includes transferring, by each processor, the pressure field data to at least one remote processor that is assigned to a partition that shares an interface with at least one partition assigned to the transferring processor and receiving, by each processor from the at least one remote processor, forcing term values that have been derived by the at least one remote processor using the pressure field data.
    Type: Grant
    Filed: June 18, 2015
    Date of Patent: November 21, 2017
    Assignee: THE UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL
    Inventors: Nicolas Manuel Morales, Ravish Mehra, Dinesh Manocha
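    For context, the per-processor kernel each cluster node runs on its assigned partitions is the ARD modal update, sketched here in 1-D following the formulation in the ARD literature; the interface forcing terms exchanged between processors enter through the `force` input:

    ```python
    import numpy as np
    from scipy.fft import dct, idct

    def ard_step(p_now, p_prev, force, dx, dt, c=343.0):
        """One ARD time step for a single 1-D rectangular partition. In the
        parallel scheme, `force` carries the forcing terms derived from the
        pressure fields exchanged with neighboring partitions."""
        n = len(p_now)
        w = c * np.pi * np.arange(n) / (n * dx)          # cosine-mode frequencies
        M_now = dct(p_now, type=2, norm="ortho")         # to modal space
        M_prev = dct(p_prev, type=2, norm="ortho")
        F = dct(force, type=2, norm="ortho")
        cos_wdt = np.cos(w * dt)
        src = np.empty(n)
        src[0] = F[0] * dt * dt                          # w -> 0 limit
        src[1:] = 2.0 * F[1:] / w[1:] ** 2 * (1.0 - cos_wdt[1:])
        M_next = 2.0 * M_now * cos_wdt - M_prev + src    # exact modal update
        return idct(M_next, type=2, norm="ortho")        # back to pressure

    p1 = ard_step(np.zeros(64), np.zeros(64), np.ones(64), dx=0.1, dt=1e-4)
    ```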
  • Patent number: 9711126
    Abstract: The subject matter described herein includes an approach for wave-based sound propagation suitable for large, open spaces spanning hundreds of meters, with a small memory footprint. The scene is decomposed into disjoint rigid objects. The free-field acoustic behavior of each object is captured by a compact per-object transfer function relating the amplitudes of a set of incoming equivalent sources to outgoing equivalent sources. Pairwise acoustic interactions between objects are computed analytically, yielding compact inter-object transfer functions. The global sound field accounting for all orders of interaction is computed using these transfer functions. The runtime system uses fast summation over the outgoing equivalent source amplitudes for all objects to auralize the sound field at a moving listener in real-time. We demonstrate realistic acoustic effects such as diffraction, low-passed sound behind obstructions, focusing, scattering, high-order reflections, and echoes, on a variety of scenes.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: July 18, 2017
    Assignee: The University of North Carolina at Chapel Hill
    Inventors: Ravish Mehra, Dinesh Manocha
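    The runtime fast-summation step can be sketched as an amplitude-weighted sum of free-space Green's functions over all objects' outgoing equivalent sources; the amplitudes below are random stand-ins for values produced by the precomputed per-object and inter-object transfer functions:

    ```python
    import numpy as np

    def field_at_listener(listener, src_positions, amplitudes, k):
        """Pressure at the listener as a fast summation over outgoing
        equivalent sources (single frequency, wavenumber k)."""
        r = np.linalg.norm(src_positions - listener, axis=1)
        green = np.exp(1j * k * r) / (4.0 * np.pi * r)   # free-space Green's fn
        return np.sum(amplitudes * green)

    rng = np.random.default_rng(0)
    srcs = rng.uniform(-50, 50, size=(2000, 3))          # all objects' sources
    amps = rng.standard_normal(2000) + 1j * rng.standard_normal(2000)
    p = field_at_listener(np.array([0.0, 1.7, 0.0]), srcs, amps, k=2 * np.pi)
    ```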
  • Patent number: 9560439
    Abstract: Methods, systems, and computer readable media for supporting source or listener directivity in a wave-based sound propagation model are disclosed. According to one method, the method includes computing, prior to run-time, one or more sound fields associated with a source or listener position and modeling, at run-time and using the one or more sound fields and a wave-based sound propagation model, source or listener directivity in an environment.
    Type: Grant
    Filed: June 30, 2014
    Date of Patent: January 31, 2017
    Assignee: The University of North Carolina at Chapel Hill
    Inventors: Ravish Mehra, Lakulish Shailesh Antani, Dinesh Manocha
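    A minimal sketch of how precomputed fields can support run-time directivity: one field is stored per spherical-harmonic elementary source, and any directivity pattern is realized by re-weighting those fields with its SH coefficients (the field values here are random stand-ins for the precomputed simulations):

    ```python
    import numpy as np

    def directive_pressure(precomputed_fields, sh_coeffs):
        """Pressure at one listener position for a directive source: a
        weighted sum of fields precomputed (prior to run-time) for each
        spherical-harmonic elementary source."""
        return np.dot(sh_coeffs, precomputed_fields)

    n_sh = 9                                    # SH order 2 -> 9 basis functions
    fields = np.random.randn(n_sh) + 1j * np.random.randn(n_sh)  # precomputed
    coeffs = np.zeros(n_sh); coeffs[0] = 1.0    # omnidirectional: Y_0^0 only
    p = directive_pressure(fields, coeffs)
    # Changing `coeffs` at run-time (e.g., toward a dipole pattern) re-weights
    # the same precomputed fields -- no new wave simulation is needed.
    ```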
  • Patent number: 9398393
    Abstract: The subject matter described herein includes a method for simulating directional sound reverberation. The method includes performing ray tracing from a listener position in a scene to surfaces visible from the listener position. The method further includes determining a directional local visibility representing the distance from the listener position to the nearest surface in the scene along each ray. The method further includes determining directional reverberation at the listener position based on the directional local visibility. The method further includes rendering a simulated sound indicative of the directional reverberation at the listener position.
    Type: Grant
    Filed: November 15, 2013
    Date of Patent: July 19, 2016
    Assignee: The University of North Carolina at Chapel Hill
    Inventors: Lakulish Shailesh Antani, Dinesh Manocha
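    A heavily simplified sketch of the two stages the abstract names: sample the directional local visibility with a hypothetical `raycast` callback, then map per-direction distances to reverberation weights via a toy decay model that is an assumption, not the patented mapping:

    ```python
    import numpy as np

    def directional_visibility(listener, directions, raycast):
        """Distance from the listener to the nearest surface along each ray
        direction. `raycast` is a hypothetical scene-intersection callback."""
        return np.array([raycast(listener, d) for d in directions])

    def directional_reverb_gain(distances, rt60=1.0, c=343.0):
        """Toy mapping from local visibility to a per-direction reverberation
        weight: directions with distant surfaces contribute later, weaker
        early reverberant energy."""
        delay = 2.0 * distances / c                    # out-and-back travel time
        return 10.0 ** (-3.0 * delay / rt60)           # 60 dB decay over rt60

    dirs = np.random.randn(256, 3)
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    fake_raycast = lambda o, d: np.random.uniform(1.0, 20.0)   # stand-in scene
    gains = directional_reverb_gain(
        directional_visibility(np.zeros(3), dirs, fake_raycast))
    ```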
  • Publication number: 20160171131
    Abstract: Methods, systems, and computer readable media for utilizing parallel adaptive rectangular decomposition (ARD) to perform acoustic simulations are disclosed herein. According to one method, the method includes assigning, to each of a plurality of processors in a central processing unit (CPU) cluster, ARD processing responsibilities associated with one or more of a plurality of partitions of an acoustic space and determining, by each processor, pressure field data corresponding to the one or more assigned partitions. The method further includes transferring, by each processor, the pressure field data to at least one remote processor that is assigned to a partition that shares an interface with at least one partition assigned to the transferring processor and receiving, by each processor from the at least one remote processor, forcing term values that have been derived by the at least one remote processor using the pressure field data.
    Type: Application
    Filed: June 18, 2015
    Publication date: June 16, 2016
    Inventors: Nicolas Manuel Morales, Ravish Mehra, Dinesh Manocha
  • Publication number: 20160034248
    Abstract: Methods, systems, and computer readable media for conducting interactive sound propagation and rendering for a plurality of sound sources in a virtual environment scene are disclosed. According to one method, the method includes decomposing a virtual environment scene containing a plurality of sound sources into a plurality of partitions and forming a plurality of source group clusters, wherein each of the source group clusters includes two or more of the sound sources located within a common partition. The method further includes determining, for each of the source group clusters, a single set of sound propagation paths relative to a listener position and generating a simulated output sound at a listener position using sound intensities associated with the determined sets of sound propagation paths.
    Type: Application
    Filed: July 29, 2015
    Publication date: February 4, 2016
    Inventors: Carl Henry Schissler, Dinesh Manocha
  • Publication number: 20150378019
    Abstract: Methods, systems, and computer readable media for simulating sound propagation are disclosed. According to one method, the method includes decomposing a virtual environment scene including at least one object into a plurality of surface regions, wherein each of the surface regions includes a plurality of surface patches. The method further includes organizing sound rays generated by a sound source in the virtual environment scene into a plurality of path tracing groups, wherein each of the path tracing groups comprises a group of the rays that traverses a sequence of surface patches. The method also includes determining, for each of the path tracing groups, a sound intensity by combining a sound intensity computed for a current time with one or more previously computed sound intensities respectively associated with previous times and generating a simulated output sound at a listener position using the determined sound intensities.
    Type: Application
    Filed: June 29, 2015
    Publication date: December 31, 2015
    Inventors: Carl Henry Schissler, Ravish Mehra, Dinesh Manocha