Patents by Inventor KARTHIK MOHAN KUMAR

KARTHIK MOHAN KUMAR has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250111578
    Abstract: Methods and systems are provided for generating a stylized representation of a non-player character (NPC) in a virtual environment. A multimodal plurality of inputs regarding characteristics of the NPC is received and processed to generate visual data representing the NPC's appearance and behavior data representing the NPC's actions. The generated visual data and behavior data are adapted to a selected character model to create an adapted configuration model, which is used to generate rendering information for the NPC.
    Type: Application
    Filed: September 27, 2024
    Publication date: April 3, 2025
    Inventors: Karthik Mohan Kumar, Archana Ramalingam, Michael Mantor, Pedro Antonio Pena
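The pipeline described in the abstract above (multimodal inputs → visual data + behavior data → adaptation to a selected character model → rendering information) can be sketched as a minimal Python skeleton. All names, fields, and values here are hypothetical illustrations, not the patented implementation; the stub functions stand in for the generative models the patent describes.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class NPCInputs:
    """Multimodal description of the character (all fields hypothetical)."""
    text_description: str
    reference_image: Optional[bytes] = None
    voice_sample: Optional[bytes] = None

def generate_visual_data(inputs: NPCInputs) -> dict:
    # Stand-in for a generative vision model producing appearance data.
    return {"appearance": inputs.text_description}

def generate_behavior_data(inputs: NPCInputs) -> dict:
    # Stand-in for a model inferring the NPC's action repertoire.
    return {"actions": ["idle", "greet"]}

def adapt_to_character_model(visual: dict, behavior: dict, base_model: str) -> dict:
    # Retarget the generated data onto the selected character model,
    # yielding the adapted configuration model.
    return {"base_model": base_model, **visual, **behavior}

def rendering_info(adapted: dict) -> dict:
    # The adapted configuration model drives rendering of the NPC.
    return {"render_target": adapted["base_model"], "config": adapted}

npc = NPCInputs(text_description="a gruff dwarven blacksmith")
adapted = adapt_to_character_model(
    generate_visual_data(npc), generate_behavior_data(npc), "humanoid_rig_v1")
info = rendering_info(adapted)
```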
  • Publication number: 20240424398
    Abstract: Systems and techniques for generating and animating non-player characters (NPCs) within virtual digital environments are provided. Multimodal input data is received that comprises a plurality of input modalities for interaction with an NPC having a set of body features and a set of facial features. The multimodal input data is processed through one or more neural networks to generate animation sequences for both the body features and facial features of the NPC. Generating such animation sequences includes disentangling the multimodal input data to generate substantially disentangled latent representations, combining these representations with the multimodal input data, and using a large-language model (LLM) to generate speech data for the NPC. Further processing using reverse diffusion generates face vertex displacement data and joint trajectory data based on the combined representation and generated speech data.
    Type: Application
    Filed: June 20, 2024
    Publication date: December 26, 2024
    Inventors: Karthik Mohan Kumar, Michael Mantor, Pedro Antonio Pena, Archana Ramalingam
  • Publication number: 20240428494
    Abstract: Systems and techniques for generating and animating non-player characters (NPCs) within virtual digital environments are provided. Multimodal input data is received that comprises a plurality of input modalities for interaction with an NPC having a set of body features and a set of facial features. The multimodal input data is processed through one or more neural networks to generate animation sequences for both the body features and facial features of the NPC. Generating such animation sequences includes disentangling the multimodal input data to generate substantially disentangled latent representations, combining these representations with the multimodal input data, and using a large-language model (LLM) to generate speech data for the NPC. Further processing using reverse diffusion generates face vertex displacement data and joint trajectory data based on the combined representation and generated speech data.
    Type: Application
    Filed: June 20, 2024
    Publication date: December 26, 2024
    Inventors: Karthik Mohan Kumar, Michael Mantor, Pedro Antonio Pena, Archana Ramalingam
  • Publication number: 20240424407
    Abstract: Systems and techniques for generating and animating non-player characters (NPCs) within virtual digital environments are provided. Multimodal input data is received that comprises a plurality of input modalities for interaction with an NPC having a set of body features and a set of facial features. The multimodal input data is processed through one or more neural networks to generate animation sequences for both the body features and facial features of the NPC. Generating such animation sequences includes disentangling the multimodal input data to generate substantially disentangled latent representations, combining these representations with the multimodal input data, and using a large-language model (LLM) to generate speech data for the NPC. Further processing using reverse diffusion generates face vertex displacement data and joint trajectory data based on the combined representation and generated speech data.
    Type: Application
    Filed: June 20, 2024
    Publication date: December 26, 2024
    Inventors: Karthik Mohan Kumar, Michael Mantor, Pedro Antonio Pena, Archana Ramalingam
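The three related applications above share one pipeline: disentangle multimodal input into latent representations, combine those latents with the raw input, generate speech data, then run reverse diffusion to produce face vertex displacements and joint trajectories. A toy Python sketch of the reverse-diffusion step follows; the denoising rule, step count, and noise scale are arbitrary stand-ins for the patent's learned networks, not its actual method.

```python
import random

def reverse_diffusion(cond, steps=10, dim=4, seed=0):
    """Toy reverse-diffusion loop: start from Gaussian noise and iteratively
    denoise toward a conditioning vector (standing in for the combined
    latent representation plus generated speech features)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in range(steps, 0, -1):
        alpha = t / steps
        # Step-dependent move toward the conditioning signal, with shrinking
        # noise -- a stand-in for a learned denoising network.
        x = [xi + (1 - alpha) * (ci - xi) + rng.gauss(0.0, 0.01 * alpha)
             for xi, ci in zip(x, cond)]
    return x

def combined_representation(multimodal, latents):
    # Combine raw multimodal input with the (substantially) disentangled latents.
    return multimodal + latents

speech = [0.2, 0.4, 0.1, 0.3]                    # stand-in LLM speech features
combined = combined_representation([0.5, 0.5], [0.1, 0.9])
face_vertices = reverse_diffusion(speech, dim=4)  # face vertex displacement data
joint_traj = reverse_diffusion(combined, dim=4)   # joint trajectory data
```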
  • Patent number: 11094324
    Abstract: A method includes detecting a keyword within an audio stream. The keyword is one of multiple keywords in a database, in which each of the multiple keywords relates to at least one of multiple domains in the database. The database stores a first confidence weight for each of the multiple keywords that are related to a first domain among the multiple domains. Each first confidence weight indicates a probability that a corresponding keyword relates to the first domain. The method includes determining whether a first confidence weight of the keyword is at least equal to an activation threshold value associated with the first domain. The method includes, in response to the first confidence weight of the keyword meeting the activation threshold value, activating a DS-ASR engine corresponding with the first domain to perform speech-to-text conversion on the audio stream.
    Type: Grant
    Filed: May 14, 2019
    Date of Patent: August 17, 2021
    Assignee: Motorola Mobility LLC
    Inventors: Zhengping Ji, Leo S. Woiceshyn, Karthik Mohan Kumar, Yi Wu
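The keyword-to-domain activation logic in the abstract above can be illustrated with a small Python sketch. The keywords, weights, and thresholds are invented for illustration; a real system would populate the database from training data and actually launch the matching domain-specific ASR (DS-ASR) engine.

```python
# Hypothetical keyword-to-domain confidence weights and per-domain thresholds.
KEYWORD_WEIGHTS = {
    "navigation": {"route": 0.9, "traffic": 0.7, "play": 0.2},
    "music":      {"play": 0.8, "shuffle": 0.9, "route": 0.1},
}
ACTIVATION_THRESHOLDS = {"navigation": 0.6, "music": 0.6}

def domains_to_activate(keyword: str) -> list:
    """Return each domain whose confidence weight for the detected keyword
    is at least that domain's activation threshold; the caller would then
    activate the corresponding DS-ASR engine(s) on the audio stream."""
    active = []
    for domain, weights in KEYWORD_WEIGHTS.items():
        weight = weights.get(keyword)
        if weight is not None and weight >= ACTIVATION_THRESHOLDS[domain]:
            active.append(domain)
    return active
```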
  • Patent number: 11030994
    Abstract: A method and data processing device for detecting a communication between a first and second entity. The method includes identifying whether a previous communication between the first and second entity has been detected. In response to identifying that the previous communication between the first and second entity has been detected, the method determines an elapsed time since detection of the previous communication. The method predicts a topic of the communication, in part based on the determined elapsed time. The topic corresponds to a specific domain from among a plurality of available domains for automatic speech recognition (ASR) processing. The method triggers selection and activation of a first domain specific (DS) ASR engine from among a plurality of available DS ASR engines to utilize a smaller resource footprint than a general ASR engine and facilitate recognition of specific vocabulary and context, in part, based on the elapsed time since the previous communication.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: June 8, 2021
    Assignee: Motorola Mobility LLC
    Inventors: Zhengping Ji, Leo S. Woiceshyn, Karthik Mohan Kumar, Yi Wu, Thomas Y. Merrell
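The elapsed-time heuristic in the abstract above can be sketched as follows: if the same pair of entities communicated recently, predict the same topic domain (and, in a real system, activate the matching DS-ASR engine); otherwise fall back to the general engine. The class name, 300-second window, and domain labels are illustrative assumptions, not values from the patent.

```python
import time

class DomainPredictor:
    """Toy elapsed-time topic predictor (hypothetical names and values)."""

    def __init__(self, continuation_window_s=300.0):
        self.window = continuation_window_s
        self.last_seen = {}  # (entity, entity) -> (timestamp, domain)

    def record(self, a, b, domain, now=None):
        # Remember when the two entities last communicated, and about what.
        now = time.time() if now is None else now
        self.last_seen[tuple(sorted((a, b)))] = (now, domain)

    def predict_domain(self, a, b, now=None):
        now = time.time() if now is None else now
        prev = self.last_seen.get(tuple(sorted((a, b))))
        if prev is not None and now - prev[0] <= self.window:
            return prev[1]   # recent contact: likely a continuation of the topic
        return "general"     # no recent contact: use the general ASR engine

p = DomainPredictor()
p.record("alice", "bob", "sports", now=1000.0)
```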
  • Publication number: 20210021706
    Abstract: A method, a communication device, and a computer program product for identifying a live phone call. The method includes receiving, at a first communication device, an activation of a verification mode for a phone call. The method includes receiving, from a second communication device on the phone call, first audio data associated with the phone call. The method further includes determining, via a processor of the first communication device, if the first audio data contains machine originated audio, and in response to determining that the first audio data does not contain machine originated audio, generating and outputting an alert that the phone call is live.
    Type: Application
    Filed: July 17, 2019
    Publication date: January 21, 2021
    Inventors: Jarrett K. Simerson, Leo S. Woiceshyn, Karthik Mohan Kumar, Yi Wu, Thomas Y. Merrell
  • Patent number: 10887459
    Abstract: A method, a communication device, and a computer program product for identifying a live phone call. The method includes receiving, at a first communication device, an activation of a verification mode for a phone call. The method includes receiving, from a second communication device on the phone call, first audio data associated with the phone call. The method further includes determining, via a processor of the first communication device, if the first audio data contains machine originated audio, and in response to determining that the first audio data does not contain machine originated audio, generating and outputting an alert that the phone call is live.
    Type: Grant
    Filed: July 17, 2019
    Date of Patent: January 5, 2021
    Assignee: Motorola Mobility LLC
    Inventors: Jarrett K. Simerson, Leo S. Woiceshyn, Karthik Mohan Kumar, Yi Wu, Thomas Y. Merrell
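The live-call verification flow in the two entries above can be illustrated with a toy Python sketch. The detection heuristic here (treating a flat, low-variance frame-energy contour as machine-originated audio) and its threshold are invented purely for illustration; the patent does not specify this method, and a real detector would use a trained classifier over audio features.

```python
def is_machine_originated(frame_energies, variance_threshold=0.05):
    """Toy stand-in for a machine-audio detector: a flat, low-variance
    energy contour is treated as machine-originated (hypothetical heuristic)."""
    if not frame_energies:
        return True
    mean = sum(frame_energies) / len(frame_energies)
    variance = sum((e - mean) ** 2 for e in frame_energies) / len(frame_energies)
    return variance < variance_threshold

def verify_call(frame_energies, alert=print):
    """In verification mode, analyze the far end's audio and alert the user
    when the call appears live; returns True when an alert was raised."""
    if is_machine_originated(frame_energies):
        return False
    alert("Call is live: a person appears to have answered.")
    return True
```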
  • Publication number: 20200365148
    Abstract: A method includes detecting a keyword within an audio stream. The keyword is one of multiple keywords in a database, in which each of the multiple keywords relates to at least one of multiple domains in the database. The database stores a first confidence weight for each of the multiple keywords that are related to a first domain among the multiple domains. Each first confidence weight indicates a probability that a corresponding keyword relates to the first domain. The method includes determining whether a first confidence weight of the keyword is at least equal to an activation threshold value associated with the first domain. The method includes, in response to the first confidence weight of the keyword meeting the activation threshold value, activating a DS-ASR engine corresponding with the first domain to perform speech-to-text conversion on the audio stream.
    Type: Application
    Filed: May 14, 2019
    Publication date: November 19, 2020
    Inventors: Zhengping Ji, Leo S. Woiceshyn, Karthik Mohan Kumar, Yi Wu
  • Publication number: 20200342853
    Abstract: A method and data processing device for detecting a communication between a first and second entity. The method includes identifying whether a previous communication between the first and second entity has been detected. In response to identifying that the previous communication between the first and second entity has been detected, the method determines an elapsed time since detection of the previous communication. The method predicts a topic of the communication, in part based on the determined elapsed time. The topic corresponds to a specific domain from among a plurality of available domains for automatic speech recognition (ASR) processing. The method triggers selection and activation of a first domain specific (DS) ASR engine from among a plurality of available DS ASR engines to utilize a smaller resource footprint than a general ASR engine and facilitate recognition of specific vocabulary and context, in part, based on the elapsed time since the previous communication.
    Type: Application
    Filed: April 24, 2019
    Publication date: October 29, 2020
    Inventors: Zhengping Ji, Leo S. Woiceshyn, Karthik Mohan Kumar, Yi Wu, Thomas Y. Merrell