Patents Examined by Michael N. Opsasnick
  • Patent number: 11756561
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively decode, i.e., reconstruct, the input audio data.
    Type: Grant
    Filed: February 17, 2022
    Date of Patent: September 12, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters
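A minimal sketch of the discrete-latent idea the abstract describes (all names and the toy codebook are illustrative, not from the patent): the encoder output is snapped to the nearest entry of a codebook shared with the decoder, so only an integer index needs to cross the channel.

```python
def quantize(latent, codebook):
    """Index of the codebook vector nearest (squared L2) to `latent`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(latent, codebook[i]))

def reconstruct(index, codebook):
    """Decoder side: recover the latent vector from the transmitted index."""
    return codebook[index]

# Toy 2-D codebook shared by encoder and decoder.
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
idx = quantize((0.9, 0.1), CODEBOOK)   # only `idx` crosses the channel
```

In a trained system the codebook entries and the encoder would be learned jointly; here they are fixed for illustration.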
  • Patent number: 11741343
    Abstract: A source separation method, an apparatus, and a non-transitory computer-readable medium are provided. Atrous Spatial Pyramid Pooling (ASPP) is used to reduce the number of parameters of a model and speed up computation. Conventional upsampling is replaced with a conversion between time and depth, and a receptive field preserving decoder is provided. In addition, temporal attention with a dynamic convolution kernel is added to further reduce model size and improve separation quality.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: August 29, 2023
    Assignee: National Central University
    Inventors: Jia-Ching Wang, Yao-Ting Wang
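The ASPP idea mentioned in the abstract can be sketched in 1-D (pure-Python toy, not the patent's implementation): one shared kernel is applied at several dilation rates in parallel, so the branches together cover a wide receptive field without adding parameters per branch.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1-D convolution with `dilation - 1` holes between kernel taps."""
    span = (len(kernel) - 1) * dilation
    return [sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
            for i in range(len(signal) - span)]

def aspp(signal, kernel, rates=(1, 2, 4)):
    """Apply one shared kernel at several dilation rates; in a real ASPP
    module the branch outputs would be concatenated and fused."""
    return {r: dilated_conv1d(signal, kernel, r) for r in rates}

out = aspp(list(range(1, 10)), [1, 1])
```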
  • Patent number: 11734576
    Abstract: Methods, systems, and computer program products for cooperative neural networks with spatial containment constraints are provided herein. A computer-implemented method includes dividing a processing task into multiple sub-tasks; training multiple independent neural networks, such that at least some of the multiple sub-tasks correspond to different ones of the multiple independent neural networks; defining, via implementing constraint-based domain knowledge related to the processing task in connection with the multiple independent neural networks, a constraint loss for a given one of the multiple sub-tasks, the constraint loss being dependent on output from at least one of the other multiple sub-tasks; and effecting re-training of at least a portion of the multiple independent neural networks by incorporating the constraint loss into at least one of the multiple independent neural networks.
    Type: Grant
    Filed: April 14, 2020
    Date of Patent: August 22, 2023
    Assignee: International Business Machines Corporation
    Inventors: Xin Ru Wang, Xinyi Zheng, Douglas R. Burdick, Ioannis Katsis
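A hedged sketch of the spatial-containment constraint loss described above (box format and the penalty form are assumptions, not the patent's exact formulation): one sub-network's loss gains a hinge penalty that depends on another sub-network's predicted box.

```python
def containment_penalty(inner, outer):
    """Hinge penalty: total distance by which box `inner` (x0, y0, x1, y1)
    sticks outside box `outer`; zero when fully contained."""
    return (max(0.0, outer[0] - inner[0]) + max(0.0, outer[1] - inner[1])
            + max(0.0, inner[2] - outer[2]) + max(0.0, inner[3] - outer[3]))

def combined_loss(task_loss, inner_pred, outer_pred, lam=0.5):
    """Sub-task loss plus a constraint loss that depends on the output of
    another sub-task, as used in the re-training step."""
    return task_loss + lam * containment_penalty(inner_pred, outer_pred)
```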
  • Patent number: 11704900
    Abstract: In one embodiment, a method includes, by a client system, receiving, at the client system, a first user input, processing, by the client system, the first user input to provide an initial response by identifying one or more entities referenced by the first user input and providing, by the client system, the initial response, where the initial response includes a conversational filler referencing at least one of the one or more identified entities, processing the first user input to provide a complete response by identifying, by the client system, one or more intents and one or more slots associated with the first user input based on a semantic analysis by a natural-language understanding module, and providing, by the client system, the complete response subsequent to the initial response, where the complete response is based on the one or more intents and the one or more slots.
    Type: Grant
    Filed: February 7, 2022
    Date of Patent: July 18, 2023
    Assignee: Meta Platforms, Inc.
    Inventors: Emmanouil Koukoumidis, Michael Robert Hanson, Mohsen M Agsen
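The two-phase flow the abstract describes can be sketched as follows (the entity spotting, filler wording, and NLU interface are illustrative assumptions): a cheap entity lookup produces an immediate filler, then the expensive intent/slot analysis produces the complete response.

```python
def respond(user_input, known_entities, nlu):
    """Fast path: entity spotting yields an immediate conversational filler.
    Slow path: full NLU (intent + slots) yields the complete response."""
    mentioned = [e for e in known_entities if e.lower() in user_input.lower()]
    filler = (f"Let me check on {mentioned[0]}..." if mentioned
              else "One moment...")
    intent, slots = nlu(user_input)          # the expensive semantic analysis
    complete = f"[{intent}] " + ", ".join(f"{k}={v}" for k, v in slots.items())
    return filler, complete

def fake_nlu(text):
    """Stand-in for the natural-language understanding module."""
    return "get_weather", {"city": "Paris"}

filler, complete = respond("What's the weather in Paris?", ["Paris"], fake_nlu)
```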
  • Patent number: 11694038
    Abstract: Methods and systems are described herein for generating dynamic conversational responses. For example, dynamic conversational responses may facilitate an interactive exchange with users. Therefore, the methods and systems use specialized methods to enrich data that may be indicative of a user's intent prior to processing that data through the machine learning model, as well as a specialized architecture for the machine learning models that takes advantage of the user interface format. For example, a first machine learning model may be trained using a multi-class cross entropy loss function, and a second machine learning model may be trained using a binary cross entropy loss function. A third output may be determined based on a weighted average of first and second outputs from the first and second machine learning models, and a subset of dynamic conversational responses may be generated based on the determined third output.
    Type: Grant
    Filed: September 23, 2020
    Date of Patent: July 4, 2023
    Assignee: Capital One Services, LLC
    Inventor: Minh Le
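The weighted-average step in the abstract is straightforward to sketch (the weight value and top-k selection are illustrative assumptions): per-response scores from the two models are blended, and the subset of responses with the best blended scores is surfaced.

```python
def blended_scores(multiclass_probs, binary_probs, weight=0.6):
    """Third output: per-response weighted average of the two models' scores."""
    return [weight * p + (1 - weight) * q
            for p, q in zip(multiclass_probs, binary_probs)]

def select_responses(scores, responses, k=2):
    """Subset of dynamic conversational responses with the top blended scores."""
    ranked = sorted(zip(scores, responses), reverse=True)
    return [r for _, r in ranked[:k]]

scores = blended_scores([0.8, 0.1, 0.1], [0.2, 0.7, 0.1], weight=0.5)
```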
  • Patent number: 11688400
    Abstract: A method for electronically utilizing content in a communication between a customer and a customer representative is provided. An audible conversation between a customer and a service representative is captured. At least a portion of the audible conversation is converted into computer searchable data. The computer searchable data is analyzed during the audible conversation to identify relevant meta tags previously stored in a data repository or generated during the audible conversation. Each meta tag is associated with the customer. Each meta tag provides a contextual item determined from at least a portion of one of a current or previous conversation with the customer. A meta tag determined to be relevant to the current conversation between the service representative and the customer is displayed in real time to the service representative currently conversing with the customer.
    Type: Grant
    Filed: April 21, 2021
    Date of Patent: June 27, 2023
    Assignee: United Services Automobile Association (“USAA”)
    Inventors: Zakery L. Johnson, Jonathan E. Neuse
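A toy sketch of the meta-tag relevance step (the keyword-overlap matching is an assumption; the patent only says tags are "determined to be relevant" to the current conversation): stored tags carry context terms from prior conversations, and tags whose terms surface in the live transcript are displayed.

```python
def relevant_meta_tags(transcript_window, stored_tags):
    """stored_tags: {tag: context_terms} accumulated from prior conversations.
    Return the tags whose context terms appear in the live transcript."""
    words = {w.strip(".,?!").lower() for w in transcript_window.split()}
    return [tag for tag, terms in stored_tags.items()
            if words & {t.lower() for t in terms}]

tags = {"mortgage": ["rate", "refinance"], "auto": ["car", "vehicle"]}
```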
  • Patent number: 11687721
    Abstract: Systems and methods for recognizing domain specific named entities are disclosed. An example method may be performed by one or more processors of a text incorporation system and include extracting a number of terms from a text under consideration, identifying, among the number of terms, a set of unmatched terms that do not match any of a plurality of known terms, passing each respective unmatched term to a vectorization module, embedding a vectorized version of each respective unmatched term in a vector space, comparing each vectorized version to known term vectors, passing, to a machine learning model, candidate terms corresponding to known term vectors closest to the vectorized versions, identifying, using the machine learning model, a best candidate term for each respective unmatched term, mapping the best candidate terms to unmatched terms in the text under consideration, and incorporating the text under consideration into the system based on the mappings.
    Type: Grant
    Filed: July 20, 2021
    Date of Patent: June 27, 2023
    Assignee: Intuit Inc.
    Inventors: Conrad De Peuter, Karpaga Ganesh Patchirajan, Saikat Mukherjee
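The candidate-generation step in the abstract reduces to a nearest-neighbor search in the embedding space. A minimal sketch (cosine similarity and toy 2-D vectors are illustrative; the patent's vectorization module and downstream model are not shown):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def nearest_known_terms(unmatched_vec, known_term_vecs, k=2):
    """Candidate known terms whose embeddings lie closest to the vectorized
    unmatched term; a downstream model would then pick the best candidate."""
    return sorted(known_term_vecs,
                  key=lambda t: cosine(unmatched_vec, known_term_vecs[t]),
                  reverse=True)[:k]

known = {"invoice": (1.0, 0.0), "receipt": (0.8, 0.6), "payroll": (0.0, 1.0)}
```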
  • Patent number: 11687733
    Abstract: In an example embodiment, a self-supervised learning task is used for training commonsense-aware representations in a minimally supervised fashion and a pair level mutual-exclusive loss is used to enforce commonsense knowledge during representation learning. This helps to exploit the mutual-exclusive nature of the training samples of commonsense reasoning corpora. Given two pieces of input that differ only in trigger pieces of data, it may be postulated that the pairwise pronoun disambiguation is mutually exclusive. This idea is formulated using a contrastive loss and then this is used to update the language model.
    Type: Grant
    Filed: June 25, 2020
    Date of Patent: June 27, 2023
    Assignee: SAP SE
    Inventors: Tassilo Klein, Moin Nabi
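One way to formulate the pairwise mutual-exclusive idea as a loss (this exact form is an illustrative assumption, not the patent's equation): with p and q the probabilities that the same candidate resolves the pronoun in each sentence of a twin pair, the loss is small when exactly one of the two selects the candidate and large when both agree.

```python
import math

def mutual_exclusive_loss(p, q):
    """-log P(exactly one of the twin sentences selects the candidate),
    treating p and q as independent selection probabilities."""
    return -math.log(p * (1 - q) + (1 - p) * q)
```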
  • Patent number: 11688391
    Abstract: The present disclosure provides a modeling method for speech recognition and a device. The method includes: determining N types of tags; training a neural network according to speech data of Mandarin to generate a recognition model whose outputs are the N types of tags; inputting speech data of each dialect into the recognition model to obtain an output tag of each frame of the speech data of each dialect; determining, according to the output tags and tagged true tags, error rates of the N types of tags for the each dialect, generating M types of target tags according to tags with error rates greater than a preset threshold; and training an acoustic model according to third speech data of Mandarin and third speech data of the P dialects, outputs of the acoustic model being the N types of tags and the M types of target tags corresponding to each dialect.
    Type: Grant
    Filed: April 8, 2020
    Date of Patent: June 27, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO.
    Inventor: Shenglong Yuan
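The tag-generation step in the abstract can be sketched directly (frame-level alignment and the suffix naming are illustrative assumptions): score the Mandarin model's per-tag error rate on a dialect's data, then mint a dialect-specific target tag for every tag that exceeds the threshold.

```python
from collections import defaultdict

def dialect_target_tags(predicted, true, threshold=0.3, suffix="_dialect"):
    """Per-tag frame error rate of the Mandarin-trained model on one
    dialect's data; tags over the threshold get a dialect-specific tag."""
    errors, counts = defaultdict(int), defaultdict(int)
    for p, t in zip(predicted, true):
        counts[t] += 1
        if p != t:
            errors[t] += 1
    return [t + suffix for t in counts if errors[t] / counts[t] > threshold]
```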
  • Patent number: 11688386
    Abstract: A method for training vibrotactile speech perception in the absence of auditory speech includes selecting a first word, generating a first control signal configured to cause at least one vibrotactile transducer to vibrate against a person's body with a first vibration pattern based on the first word, sampling a second word spoken by the person, generating a second control signal configured to cause at least one vibrotactile transducer to vibrate against the person's body with a second vibration pattern based on the sampled second word, and presenting a comparison between the first word and the second word to the person. An array of vibrotactile transducers can be in contact with the person's body.
    Type: Grant
    Filed: August 31, 2018
    Date of Patent: June 27, 2023
    Assignee: Georgetown University
    Inventors: Patrick S. Malone, Maximilian Riesenhuber
  • Patent number: 11687549
    Abstract: The present disclosure involves systems, software, and computer implemented methods for creating line item information from tabular data. One example method includes receiving event data values at a system. Column headers of columns in the event data values are identified. At least one column header is not included in standard line item terms used by the system. Column values of the columns in the event data values are identified. The identified column headers and the identified column values are processed using one or more models to map each column to a standard line item term used by the system. The processing includes using context determination and content recognition to identify standard line item terms. An event is created in the system, including the creation of line items from the identified column values. Each line item includes standard line item terms mapped to the columns.
    Type: Grant
    Filed: October 20, 2021
    Date of Patent: June 27, 2023
    Assignee: SAP SE
    Inventors: Kumaraswamy Gowda, Nithya Rajagopalan, Nishant Kumar, Panish Ramakrishna, Rajendra Vuppala, Erica Vandenhoek
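A toy sketch of the header-mapping step (string similarity via `difflib` is a stand-in; the patent uses models with context determination and content recognition): each incoming column header is mapped to its closest standard line-item term, with `None` marking headers that need the heavier machinery.

```python
import difflib

def map_headers(headers, standard_terms):
    """Map each incoming column header to the closest standard line-item
    term by string similarity; None means no confident match."""
    mapping = {}
    for h in headers:
        close = difflib.get_close_matches(h.lower(), standard_terms,
                                          n=1, cutoff=0.5)
        mapping[h] = close[0] if close else None
    return mapping

mapping = map_headers(["Unit Price", "Qty", "Zzz"],
                      ["unit price", "quantity", "amount"])
```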
  • Patent number: 11682416
    Abstract: Providing contextual help in an interactive voice system includes receiving a plurality of user interaction events during a user interaction window, wherein each of the user interaction events comprises one of a low quality voice transcription event from a speech-to-text (STT) service or a no-intent matching event from a natural language processing (NLP) service and receiving a respective transcription confidence score from the STT service for each of the plurality of user interaction events. For a one of the plurality of user interaction events, a determination is made of how to respond to a user providing the user interaction events based on how many events comprise the plurality of events and the transcription confidence score for the one event; and then instructions are provided to cause the determined response to be presented to the user in accordance with the determination of how to respond.
    Type: Grant
    Filed: August 3, 2018
    Date of Patent: June 20, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Igor Ramos, Marc Dickenson
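The decision logic the abstract describes, counting failed interactions and weighing the latest transcription confidence, can be sketched as follows (the response names, retry limit, and threshold are illustrative assumptions):

```python
def choose_help_response(events, latest_confidence,
                         max_events=2, low_confidence=0.4):
    """events: markers from the interaction window, each either
    'low_quality_stt' or 'no_intent'. Decide how to respond to the user."""
    if len(events) > max_events:
        return "escalate_to_agent"
    if events and events[-1] == "low_quality_stt" \
            and latest_confidence < low_confidence:
        return "ask_user_to_repeat"      # transcription itself was the problem
    return "suggest_example_phrases"     # transcription fine, intent missing
```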
  • Patent number: 11646021
    Abstract: According to one embodiment, an apparatus for processing a voice signal includes a display configured to display an image of a user or a character corresponding to the user, a microphone, a speaker configured to output a voice signal of the user, a memory configured to store a trained voice age conversion model, and a processor configured to, based on changing an age of the user or the character displayed on the display, control the display such that the display displays the user or the character corresponding to the changed age. The processor is further configured to determine a first age that is a current age of the user or the character based on the voice signal of the user inputted through the microphone. Accordingly, convenience of a user may be enhanced.
    Type: Grant
    Filed: April 16, 2020
    Date of Patent: May 9, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Siyoung Yang, Yongchul Park, Sungmin Han, Sangki Kim, Juyeong Jang, Minook Kim
  • Patent number: 11631004
    Abstract: Techniques and mechanisms for determining the pruning of one or more channels from a convolutional neural network (CNN) based on a gradient descent analysis of a performance loss. In an embodiment, a mask layer selectively masks one or more channels which communicate data between layers of the CNN. The CNN provides an output, and calculations are performed to determine a relationship between the masking and a loss of the CNN. The various masking of different channels is based on respective random variables and on probability values each corresponding to a different respective channel. In another embodiment, the masking is further based on a continuous mask function which approximates a binary step function.
    Type: Grant
    Filed: March 28, 2018
    Date of Patent: April 18, 2023
    Assignee: Intel Corporation
    Inventor: Alexey Kruglov
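One common form of a continuous mask function approximating a binary step (a sigmoid relaxation; the exact function in the patent is not specified here, so treat this as an illustrative assumption): the mask is differentiable in the keep probability, which is what makes gradient-descent pruning possible.

```python
import math

def soft_channel_mask(keep_prob, uniform_draw, temperature=0.05):
    """Continuous relaxation of a Bernoulli(keep_prob) channel mask: a
    sigmoid over the gap between the keep probability and a uniform random
    draw. As temperature -> 0 this approaches the binary step
    'keep iff uniform_draw < keep_prob', yet stays differentiable in
    keep_prob so the mask can be trained by gradient descent."""
    return 1.0 / (1.0 + math.exp(-(keep_prob - uniform_draw) / temperature))
```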
  • Patent number: 11631395
    Abstract: System and method for detecting cognitive decline in a subject using a classification system for detecting cognitive decline in the subject based on a speech sample. The classification system is trained using speech data corresponding to audio recordings of speech from normal and cognitive decline patients to generate an ensemble classifier comprising a plurality of component classifiers and an ensemble module. Each of the plurality of component classifiers is a machine-learning classifier configured to generate a component output identifying a sample data as corresponding to a normal patient or a cognitive patient. The machine-learning classifier is generated based on a subset of available features. The ensemble module receives component outputs from all of the component classifiers and generates an ensemble output identifying the sample data as corresponding to a normal or cognitive decline patient based on the component outputs.
    Type: Grant
    Filed: April 14, 2020
    Date of Patent: April 18, 2023
    Assignee: Janssen Pharmaceutica NV
    Inventors: Srinivasan Vairavan, Vaibhav Narayan
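The ensemble structure in the abstract, component classifiers each trained on a subset of features, with an ensemble module combining their outputs, can be sketched with a majority vote (the vote rule is an assumption; the patent only says the ensemble output is "based on" the component outputs):

```python
def ensemble_predict(components, features):
    """components: (classifier, feature_subset) pairs; each classifier sees
    only its own subset of the available features. Output: 0 = normal,
    1 = cognitive decline, by majority vote over component outputs."""
    votes = [clf([features[i] for i in subset]) for clf, subset in components]
    return 1 if 2 * sum(votes) > len(votes) else 0

# Stub component classifiers standing in for trained machine-learning models.
components = [(lambda f: 1, [0]), (lambda f: 0, [1]), (lambda f: 1, [0, 1])]
```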
  • Patent number: 11621009
    Abstract: The present disclosure relates to an audio encoding and decoding (codec) system for voice encoding/decoding using a spectral shaper model. In an embodiment, a method of audio signal decoding comprises: receiving a bit stream associated with an audio signal, the bit stream including encoded transform coefficients, spectral envelope data and one or more parameters of a spectral shaper model, the spectral shaper model indicative of a fundamental frequency of a multi-sinusoidal signal model, where the fundamental frequency corresponds to a time domain delay; decoding the encoded transform coefficients; adjusting the decoded transform coefficients using the spectral envelope data and the spectral shaper model; reconstructing transform coefficients of the audio signal using the adjusted, decoded transform coefficients; and transforming the reconstructed transform coefficients into a time domain audio signal.
    Type: Grant
    Filed: December 18, 2019
    Date of Patent: April 4, 2023
    Assignee: Dolby International AB
    Inventors: Lars Villemoes, Janusz Klejsa, Per Hedelin
  • Patent number: 11621000
    Abstract: Embodiments described herein include systems and methods for using image searching with voice recognition commands. Embodiments of a method may include providing a user interface via a target application and receiving a user selection of an area on the user interface by a user, the area including a search image. Embodiments may also include receiving an associated voice command and associating, by the computing device, the associated voice command with the search image.
    Type: Grant
    Filed: April 27, 2021
    Date of Patent: April 4, 2023
    Assignee: Dolbey & Company, Inc.
    Inventor: Curtis A. Weeks
  • Patent number: 11610067
    Abstract: A machine reading comprehension method includes the following operations: performing a relation augment self attention (RASA) feature extraction process on at least one historical dialogue data and a current question data respectively to obtain at least one historical dialogue feature and a current question feature; and performing a machine reading comprehension (MRC) analysis according to the at least one historical dialogue feature and the current question feature to obtain a response output.
    Type: Grant
    Filed: November 18, 2020
    Date of Patent: March 21, 2023
    Assignee: INSTITUTE FOR INFORMATION INDUSTRY
    Inventors: Wei-Jen Yang, Yu-Shian Chiu, Guann-Long Chiou
  • Patent number: 11610058
    Abstract: Provided methods and systems allow dynamic rendering of a reflexive questionnaire based on a modifiable spreadsheet for users with little to no programming experience and knowledge. Some methods comprise receiving a modifiable spreadsheet with multiple rows, each row comprising rendering instructions for a reflexive questionnaire from a first computer, such as a data type cell, statement cell, logic cell, and a field identifier; rendering a graphical user interface, on a second computer, comprising a label and an input element corresponding to the rendering instructions of a first row of the spreadsheet; receiving an input from the second computer; evaluating the input against the logic cell of the spreadsheet; in response to the input complying with the logic cell of the spreadsheet, dynamically rendering a second label and a second input element to be displayed on the graphical user interface based on the logic of the first row.
    Type: Grant
    Filed: January 28, 2020
    Date of Patent: March 21, 2023
    Assignee: HITPS LLC
    Inventors: Mark Sayre, Harish Krishnaswamy, Sam Elsamman
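The row-by-row rendering loop described above can be sketched as follows (representing each spreadsheet row as a dict and the logic cell as a predicate are illustrative assumptions): a row's field is rendered only when its logic cell is satisfied by the answers collected so far, which is what makes the questionnaire "reflexive".

```python
def visible_fields(rows, answers):
    """rows: one dict per spreadsheet row with 'field', 'statement', and
    'show_if' (a predicate over the answers collected so far)."""
    return [(r["field"], r["statement"]) for r in rows
            if r["show_if"](answers)]

rows = [
    {"field": "has_car", "statement": "Do you own a car?",
     "show_if": lambda a: True},
    {"field": "car_year", "statement": "What year is it?",
     "show_if": lambda a: a.get("has_car") == "yes"},
]
```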
  • Patent number: 11599729
    Abstract: The present disclosure provides a method for intelligent automated chatting. A conversation with a user is performed by using a first identity of a first artificial intelligence entity. A message is received from the user in the conversation. Matching rates between the message and trigger contents of other artificial intelligence entities are scored. A second artificial intelligence entity is selected from the other artificial intelligence entities based on the matching rates. A conversation with the user is performed by using a second identity of the second artificial intelligence entity by switching from the first identity of the first artificial intelligence entity to the second identity of the second artificial intelligence entity.
    Type: Grant
    Filed: June 15, 2017
    Date of Patent: March 7, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventor: Xianchao Wu
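The identity-switching flow in the abstract can be sketched with a simple trigger-overlap score (the scoring function is an illustrative assumption; the patent only says matching rates between the message and trigger contents are scored): the conversation switches to the best-matching other AI entity, or stays with the current one when nothing matches.

```python
def switch_identity(message, bots, current):
    """bots: {name: trigger_terms}. Score each other AI entity's triggers
    against the message; switch to the best scorer, else keep `current`."""
    words = set(message.lower().split())
    def score(name):
        triggers = {t.lower() for t in bots[name]}
        return len(words & triggers) / max(len(triggers), 1)
    others = [b for b in bots if b != current]
    best = max(others, key=score, default=current)
    return best if others and score(best) > 0 else current

bots = {"chef": ["recipe", "cook"], "doctor": ["symptom", "medicine"],
        "host": []}
```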