Patents by Inventor Ramesh Radhakrishna Manuvinakurike
Ramesh Radhakrishna Manuvinakurike has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230343331
Abstract: Disclosed is a technical solution to summarize a multimodal conferencing environment. The solution is designed to improve the efficiency and accuracy of computing systems as a summarization tool by incorporating memory, machine-readable instructions, and processor circuitry. The solution executes the functions of adjusting a language model based on terminology utilized in first context data; generating a conversation summary from a transcription and a human-controlled variable; extracting a semantic entity from the conversation summary and second context data, where the second context data is indicative of an input associated with a conferencing environment; and summarizing the semantic entity and the second context data using the adjusted language model.
Type: Application
Filed: June 27, 2023
Publication date: October 26, 2023
Inventors: Ramesh Radhakrishna Manuvinakurike, Saurav Sahay, Sangeeta Ghangam Manepalli, Sumanta Bhattacharyya, Sahisnu Mazumder, Cagri Cagatay Tanriover
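The pipeline this abstract describes can be illustrated with a toy sketch. The function names and the keyword-overlap "entity extraction" below are illustrative stand-ins, not the patent's actual language-model machinery; a real system would adjust a trained language model rather than a term-weight table.

```python
# Toy sketch of the summarization pipeline the abstract outlines.

def adjust_language_model(base_vocab, first_context):
    """Bias a toy term-weight table toward terminology in the first context data."""
    vocab = dict(base_vocab)
    for term in first_context.lower().split():
        vocab[term] = vocab.get(term, 1.0) + 1.0
    return vocab

def extract_semantic_entities(summary, second_context):
    """Return terms shared by the conversation summary and the second context data."""
    return sorted(set(summary.lower().split()) & set(second_context.lower().split()))

def summarize_entities(entities, vocab):
    """Rank the shared entities by the adjusted model's term weights."""
    return " ".join(sorted(entities, key=lambda t: -vocab.get(t, 1.0)))

vocab = adjust_language_model({}, "quarterly roadmap review")
entities = extract_semantic_entities(
    "team agreed to finalize the roadmap",   # conversation summary
    "roadmap slides shared on screen",       # second context data
)
summary = summarize_entities(entities, vocab)
```

The design choice mirrored here is that the language model is adapted to domain terminology *before* it is used to summarize the extracted entities.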
-
Publication number: 20230124495
Abstract: Disclosed is a technical solution to process a video that captures actions to be performed for completing a task based on a chronological sequence of stages within the task. An example system may identify an action sequence from an instruction for the task. The system inputs the action sequence into a trained model, e.g., a recurrent neural network (RNN), which outputs the chronological sequence of stages. The RNN may be trained through self-supervised learning. The system may input the video and the chronological sequence of stages into another trained model, e.g., a temporal convolutional network. The other trained model may include hidden layers arranged before an attention layer. The hidden layers may extract features from the video and feed the features into the attention layer. The attention layer may determine attention weights of the features based on the chronological sequence of stages.
Type: Application
Filed: October 28, 2022
Publication date: April 20, 2023
Applicant: Intel Corporation
Inventors: Sovan Biswas, Anthony Daniel Rhodes, Ramesh Radhakrishna Manuvinakurike, Giuseppe Raffa, Richard Beckwith
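The final step, attention weights conditioned on a stage sequence, can be sketched in miniature. This is not the patent's architecture; the additive stage bias and the per-frame scores are illustrative assumptions standing in for a learned attention layer.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def stage_conditioned_attention(frame_scores, frame_stages, stage_order):
    """Bias per-frame feature scores by where each frame's stage falls in the
    chronological stage sequence, then normalize into attention weights."""
    n = len(stage_order)
    biased = [
        score + (n - stage_order.index(stage)) * 0.5
        for score, stage in zip(frame_scores, frame_stages)
    ]
    return softmax(biased)

weights = stage_conditioned_attention(
    [0.2, 0.1, 0.4],            # raw feature scores per frame
    ["chop", "mix", "bake"],    # stage predicted for each frame
    ["chop", "mix", "bake"],    # chronological sequence of stages
)
```

The weights sum to one, and frames belonging to earlier stages receive a larger boost, capturing the idea that the stage sequence steers where attention lands.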
-
Patent number: 10586528
Abstract: Domain-specific speech recognizer generation with crowdsourcing is described. The domain-specific speech recognizers are generated for voice user interfaces (VUIs) configured to replace or supplement application interfaces. In accordance with the described techniques, the speech recognizers are generated for a respective such application interface and are domain-specific because they are each generated based on language data that corresponds to the respective application interface. This domain-specific language data is used to build a domain-specific language model. The domain-specific language data is also used to collect acoustic data for building an acoustic model. In particular, the domain-specific language data is used to generate user interfaces that prompt crowdsourcing participants to say selected words represented by the language data for recording. The recordings of these selected words are then used to build the acoustic model.
Type: Grant
Filed: February 2, 2017
Date of Patent: March 10, 2020
Assignee: Adobe Inc.
Inventors: Ramesh Radhakrishna Manuvinakurike, Trung Huu Bui, Robert S. N. Dates
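The language-model half of this recipe can be sketched as a simple bigram model built from domain language data. This is a minimal stand-in, assuming a small set of domain sentences; a real recognizer would pair such a model with the acoustic model built from the crowd-sourced recordings the abstract describes.

```python
from collections import Counter

def build_bigram_model(domain_sentences):
    """Estimate bigram probabilities from language data tied to one
    application interface (the source of the domain specificity)."""
    bigrams, unigrams = Counter(), Counter()
    for sentence in domain_sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for a, b in zip(tokens, tokens[1:]):
            bigrams[(a, b)] += 1
            unigrams[a] += 1
    # P(b | a) = count(a, b) / count(a)
    return {pair: c / unigrams[pair[0]] for pair, c in bigrams.items()}

# Illustrative domain data for an image-editing interface.
model = build_bigram_model(["crop the image", "crop the layer"])
```

Because the counts come only from the interface's own language data, the model concentrates probability mass on in-domain phrasings, which is the point of generating a recognizer per application interface.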
-
Patent number: 10579737
Abstract: A framework for annotating image edit requests includes a structure for identifying a natural language request as either a comment or an image edit request, for identifying the text of a request that maps to an executable action in an image editing program, and for identifying other entities from the text related to the action. The annotation framework can be used to aid in the creation of artificial intelligence networks that carry out the requested action. An example method includes displaying a test image, displaying a natural language input with selectable text, and providing a plurality of selectable action tag controls and entity tag controls. The method may also include receiving selection of the text, receiving selection of an action tag control for the selected text, generating a labeled pair, and storing the labeled pair with the natural language input as an annotated natural language image edit request.
Type: Grant
Filed: March 6, 2018
Date of Patent: March 3, 2020
Assignee: Adobe Inc.
Inventors: Jacqueline Brixey, Walter W. Chang, Trung Bui, Doo Soon Kim, Ramesh Radhakrishna Manuvinakurike
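The labeled pairs the method produces can be pictured as a small record type. The class and field names below are hypothetical, not from the patent; they just show how a selected text span, its tag, and the originating utterance might be stored together.

```python
from dataclasses import dataclass, field

@dataclass
class LabeledSpan:
    text: str   # the selected span of the natural language input
    tag: str    # an action tag (e.g. "action:crop") or entity tag

@dataclass
class AnnotatedEditRequest:
    utterance: str            # the full natural language input
    is_edit_request: bool     # comment vs. image edit request
    labels: list = field(default_factory=list)

    def add_label(self, text, tag):
        """Store one labeled pair (selected text, selected tag control)."""
        self.labels.append(LabeledSpan(text, tag))

req = AnnotatedEditRequest("please crop the left side", True)
req.add_label("crop", "action:crop")
req.add_label("the left side", "entity:region")
```

Keeping the labeled pairs attached to the original utterance is what makes the record usable as training data for networks that map requests to executable actions.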
-
Patent number: 10453455
Abstract: A technique for multiple-turn conversational task assistance includes receiving data representing a conversation between a user and an agent. The conversation includes a digitally recorded video portion and a digitally recorded audio portion, where the audio portion corresponds to the video portion. Next, the audio portion is segmented into a plurality of audio chunks. For each of the audio chunks, a transcript of the respective audio chunk is received. Each of the audio chunks is grouped into one or more dialog acts, where each dialog act includes at least one of the respective audio chunks, the validated transcript corresponding to those audio chunks, and the part of the video portion that corresponds to them. Each of the dialog acts is stored in a data corpus.
Type: Grant
Filed: November 22, 2017
Date of Patent: October 22, 2019
Assignee: Adobe Inc.
Inventors: Ramesh Radhakrishna Manuvinakurike, Trung Huu Bui, Walter W. Chang
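The chunk-grouping step can be sketched with a toy heuristic. Grouping by silence gaps and the 0.5-second threshold are assumptions for illustration; the patent does not specify this criterion.

```python
def group_into_dialog_acts(chunks, max_gap=0.5):
    """chunks: list of (start_sec, end_sec, transcript) tuples in time order.
    Chunks separated by a pause longer than max_gap start a new dialog act."""
    acts, current = [], []
    for chunk in chunks:
        if current and chunk[0] - current[-1][1] > max_gap:
            acts.append(current)   # close the current dialog act
            current = []
        current.append(chunk)
    if current:
        acts.append(current)
    return acts

acts = group_into_dialog_acts([
    (0.0, 1.2, "select the sky"),
    (1.3, 2.0, "and make it bluer"),
    (3.5, 4.1, "okay what next"),
])
```

Because each chunk carries its timestamps, every dialog act stays aligned with its transcript and the corresponding span of the video, matching the corpus structure the abstract describes.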
-
Publication number: 20190278844
Abstract: A framework for annotating image edit requests includes a structure for identifying a natural language request as either a comment or an image edit request, for identifying the text of a request that maps to an executable action in an image editing program, and for identifying other entities from the text related to the action. The annotation framework can be used to aid in the creation of artificial intelligence networks that carry out the requested action. An example method includes displaying a test image, displaying a natural language input with selectable text, and providing a plurality of selectable action tag controls and entity tag controls. The method may also include receiving selection of the text, receiving selection of an action tag control for the selected text, generating a labeled pair, and storing the labeled pair with the natural language input as an annotated natural language image edit request.
Type: Application
Filed: March 6, 2018
Publication date: September 12, 2019
Inventors: Jacqueline Brixey, Walter W. Chang, Trung Bui, Doo Soon Kim, Ramesh Radhakrishna Manuvinakurike
-
Publication number: 20190156822
Abstract: A technique for multiple-turn conversational task assistance includes receiving data representing a conversation between a user and an agent. The conversation includes a digitally recorded video portion and a digitally recorded audio portion, where the audio portion corresponds to the video portion. Next, the audio portion is segmented into a plurality of audio chunks. For each of the audio chunks, a transcript of the respective audio chunk is received. Each of the audio chunks is grouped into one or more dialog acts, where each dialog act includes at least one of the respective audio chunks, the validated transcript corresponding to those audio chunks, and the part of the video portion that corresponds to them. Each of the dialog acts is stored in a data corpus.
Type: Application
Filed: November 22, 2017
Publication date: May 23, 2019
Applicant: Adobe Inc.
Inventors: Ramesh Radhakrishna Manuvinakurike, Trung Huu Bui, Walter W. Chang
-
Publication number: 20180218728
Abstract: Domain-specific speech recognizer generation with crowdsourcing is described. The domain-specific speech recognizers are generated for voice user interfaces (VUIs) configured to replace or supplement application interfaces. In accordance with the described techniques, the speech recognizers are generated for a respective such application interface and are domain-specific because they are each generated based on language data that corresponds to the respective application interface. This domain-specific language data is used to build a domain-specific language model. The domain-specific language data is also used to collect acoustic data for building an acoustic model. In particular, the domain-specific language data is used to generate user interfaces that prompt crowdsourcing participants to say selected words represented by the language data for recording. The recordings of these selected words are then used to build the acoustic model.
Type: Application
Filed: February 2, 2017
Publication date: August 2, 2018
Applicant: Adobe Systems Incorporated
Inventors: Ramesh Radhakrishna Manuvinakurike, Trung Huu Bui, Robert S. N. Dates