Neural Network Patents (Class 704/232)
  • Patent number: 11967340
    Abstract: Disclosed is a method for detecting a voice from audio data, performed by a computing device according to an exemplary embodiment of the present disclosure. The method includes obtaining audio data; generating image data based on a spectrum of the obtained audio data; analyzing the generated image data by utilizing a pre-trained neural network model; and determining whether an automated response system (ARS) voice is included in the audio data, based on the analysis of the image data.
    Type: Grant
    Filed: June 23, 2023
    Date of Patent: April 23, 2024
    Assignee: ActionPower Corp.
    Inventors: Subong Choi, Dongchan Shin, Jihwa Lee
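The pipeline this abstract describes (audio → spectrogram image → pre-trained classifier → threshold decision) can be sketched as follows; the framing parameters and the classifier interface are illustrative assumptions, not details from the patent:

```python
import numpy as np

def spectrogram(audio, frame_len=256, hop=128):
    # Slice the waveform into overlapping windowed frames and take the
    # FFT magnitude, yielding a 2-D "image" of the audio.
    frames = [audio[i:i + frame_len]
              for i in range(0, len(audio) - frame_len + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames) * np.hanning(frame_len), axis=1))

def detect_ars_voice(audio, model, threshold=0.5):
    # "model" stands in for the pre-trained neural network; any callable
    # mapping a spectrogram image to an ARS probability works here.
    return model(spectrogram(audio)) >= threshold
```

Any real implementation would replace the `model` callable with the patent's trained network; the spectrogram step only illustrates the "image data from a spectrum" idea.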
  • Patent number: 11961515
    Abstract: A method includes receiving a plurality of unlabeled audio samples corresponding to spoken utterances not paired with corresponding transcriptions. At a target branch of a contrastive Siamese network, the method also includes generating a sequence of encoder outputs for the plurality of unlabeled audio samples and modifying time characteristics of the encoder outputs to generate a sequence of target branch outputs. At an augmentation branch of the contrastive Siamese network, the method also includes performing augmentation on the unlabeled audio samples, generating a sequence of augmented encoder outputs for the augmented unlabeled audio samples, and generating predictions of the sequence of target branch outputs generated at the target branch. The method also includes determining an unsupervised loss term based on the target branch outputs and the predictions of the sequence of target branch outputs. The method also includes updating parameters of the audio encoder based on the unsupervised loss term.
    Type: Grant
    Filed: December 14, 2021
    Date of Patent: April 16, 2024
    Assignee: Google LLC
    Inventors: Jaeyoung Kim, Soheil Khorram, Hasim Sak, Anshuman Tripathi, Han Lu, Qian Zhang
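As a rough illustration of the unsupervised loss term, here is a minimal negative-cosine-similarity objective between target-branch outputs and the augmentation branch's predictions of them; the patent's actual loss formulation may differ:

```python
import numpy as np

def unsupervised_loss(targets, predictions):
    # Negative mean cosine similarity between each target-branch output
    # and the augmentation branch's prediction of it; a common form of
    # contrastive objective (an assumption, not the patent's exact loss).
    t = targets / np.linalg.norm(targets, axis=-1, keepdims=True)
    p = predictions / np.linalg.norm(predictions, axis=-1, keepdims=True)
    return float(-np.mean(np.sum(t * p, axis=-1)))
```

Perfectly matching predictions drive the loss toward -1, and gradients of such a loss would be what updates the audio encoder's parameters.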
  • Patent number: 11941787
    Abstract: Examples are provided relating to recovering depth data from noisy phase data of low-signal pixels. One example provides a computing system, comprising a logic machine, and a storage machine holding instructions executable by the logic machine to process depth data by obtaining depth image data and active brightness image data for a plurality of pixels, the depth image data comprising phase data for a plurality of frequencies, and identifying low-signal pixels based at least on the active brightness image data. The instructions are further executable to apply a denoising filter to phase data of the low-signal pixels to obtain denoised phase data, and to not apply the denoising filter to phase data of other pixels. The instructions are further executable to, after applying the denoising filter, perform phase unwrapping on the phase data for the plurality of frequencies to obtain a depth image, and output the depth image.
    Type: Grant
    Filed: August 23, 2021
    Date of Patent: March 26, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Sergio Ortiz Egea, Augustine Cha
  • Patent number: 11942094
    Abstract: A speaker verification method includes receiving audio data corresponding to an utterance, processing a first portion of the audio data that characterizes a predetermined hotword to generate a text-dependent evaluation vector, and generating one or more text-dependent confidence scores. When one of the text-dependent confidence scores satisfies a threshold, the operations include identifying a speaker of the utterance as a respective enrolled user associated with the text-dependent confidence score that satisfies the threshold and initiating performance of an action without performing speaker verification. When none of the text-dependent confidence scores satisfy the threshold, the operations include processing a second portion of the audio data that characterizes a query to generate a text-independent evaluation vector, generating one or more text-independent confidence scores, and determining whether the identity of the speaker of the utterance includes any of the enrolled users.
    Type: Grant
    Filed: March 24, 2021
    Date of Patent: March 26, 2024
    Assignee: Google LLC
    Inventors: Roza Chojnacka, Jason Pelecanos, Quan Wang, Ignacio Lopez Moreno
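The two-stage fallback logic in this abstract can be sketched as plain control flow; the score dictionaries and the lazily invoked text-independent scorer are hypothetical stand-ins for the patent's evaluation vectors and scoring models:

```python
def verify_speaker(td_scores, threshold, compute_ti_scores):
    # td_scores: {enrolled user: text-dependent confidence}.  If any
    # score clears the threshold, identify that user immediately and
    # skip the more expensive text-independent pass.
    best = max(td_scores, key=td_scores.get)
    if td_scores[best] >= threshold:
        return best, "text-dependent"
    # Otherwise fall back to text-independent scores computed over the
    # query portion of the audio (supplied lazily as a callable).
    ti_scores = compute_ti_scores()
    if ti_scores:
        best = max(ti_scores, key=ti_scores.get)
        if ti_scores[best] >= threshold:
            return best, "text-independent"
    return None, "unverified"
```

The design point is that the hotword portion alone often suffices, so the second model only runs when the first stage is inconclusive.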
  • Patent number: 11941356
    Abstract: Embodiments described herein propose a densely connected Transformer architecture in which each Transformer layer takes advantage of all previous layers. Specifically, the input for each Transformer layer comes from the outputs of all its preceding layers, and the output information of each layer is incorporated into all its subsequent layers, so an L-layer Transformer network has L(L+1)/2 connections. The dense connections allow the linguistic information learned by the lower layers to be directly propagated to all upper layers and encourage feature reuse throughout the network. Each layer is thus directly optimized from the loss function in the fashion of implicit deep supervision.
    Type: Grant
    Filed: October 26, 2020
    Date of Patent: March 26, 2024
    Assignee: Salesforce, Inc.
    Inventors: Linqing Liu, Caiming Xiong
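The L(L+1)/2 connection count follows directly from each layer consuming every prior output. A toy sketch (summing prior outputs rather than concatenating them, purely to keep the example scalar):

```python
def dense_forward(x, layers, combine=sum):
    # Each layer receives a combination of the original input and all
    # preceding layer outputs (summed here for simplicity; the patent
    # concatenates), so an L-layer network accumulates
    # 1 + 2 + ... + L = L(L+1)/2 connections.
    outputs, connections = [x], 0
    for layer in layers:
        connections += len(outputs)   # one connection per prior output
        outputs.append(layer(combine(outputs)))
    return outputs[-1], connections
```

With four layers the counter comes out to 4·5/2 = 10, matching the formula in the abstract.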
  • Patent number: 11929063
    Abstract: A supervised discriminator for detecting bio-markers in an audio sample dataset is trained and a denoising autoencoder is trained to learn a latent space that is used to reconstruct an output audio sample with a same fidelity as an input audio sample of the audio sample dataset. A conditional auxiliary generative adversarial network (GAN) is trained to generate the output audio sample with the same fidelity as the input audio sample, wherein the output audio sample is void of the bio-markers. The conditional auxiliary generative adversarial network (GAN), the corresponding supervised discriminator, and the corresponding denoising autoencoder are deployed in an audio processing system.
    Type: Grant
    Filed: November 23, 2021
    Date of Patent: March 12, 2024
    Assignee: International Business Machines Corporation
    Inventors: Victor Abayomi Akinwande, Celia Cintas, Komminist Weldemariam, Aisha Walcott
  • Patent number: 11922178
    Abstract: Methods, apparatus, systems, and articles of manufacture to load data into an accelerator are disclosed. An example apparatus includes data provider circuitry to load a first section and an additional amount of compressed machine learning parameter data into a processor engine. Processor engine circuitry executes a machine learning operation using the first section of compressed machine learning parameter data. Compressed local data re-user circuitry determines whether a second section is present in the additional amount of compressed machine learning parameter data. The processor engine circuitry executes a machine learning operation using the second section when the second section is present in the additional amount of compressed machine learning parameter data.
    Type: Grant
    Filed: June 25, 2021
    Date of Patent: March 5, 2024
    Assignee: Intel Corporation
    Inventors: Arnab Raha, Deepak Mathaikutty, Debabrata Mohapatra, Sang Kyun Kim, Gautham Chinya, Cormac Brick
  • Patent number: 11908447
    Abstract: According to an aspect, method for synthesizing multi-speaker speech using an artificial neural network comprises generating and storing a speech learning model for a plurality of users by subjecting a synthetic artificial neural network of a speech synthesis model to learning, based on speech data of the plurality of users, generating speaker vectors for a new user who has not been learned and the plurality of users who have already been learned by using a speaker recognition model, determining a speaker vector having the most similar relationship with the speaker vector of the new user according to preset criteria out of the speaker vectors of the plurality of users who have already been learned, and generating and learning a speaker embedding of the new user by subjecting the synthetic artificial neural network of the speech synthesis model to learning, by using a value of a speaker embedding of a user for the determined speaker vector as an initial value and based on speaker data of the new user.
    Type: Grant
    Filed: August 4, 2021
    Date of Patent: February 20, 2024
    Assignee: IUCF-HYU (INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY)
    Inventors: Joon Hyuk Chang, Jae Uk Lee
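The nearest-speaker initialization step can be illustrated with cosine similarity over stored speaker vectors; the data layout here is an assumption made for the sketch, not the patent's representation:

```python
import numpy as np

def init_new_speaker_embedding(new_vec, enrolled):
    # enrolled: {speaker: (speaker_vector, speaker_embedding)}.  Choose
    # the enrolled speaker whose vector is most similar (cosine) to the
    # new user's vector, and use that speaker's embedding as the initial
    # value for the new user's embedding before fine-tuning.
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    best = max(enrolled, key=lambda s: cos(new_vec, enrolled[s][0]))
    return best, np.array(enrolled[best][1], copy=True)
```

Starting from the most similar learned speaker's embedding gives the synthesis network a warm start, which is the motivation the abstract describes.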
  • Patent number: 11907855
    Abstract: A computer implemented method of storing and retrieving feature map data of a neural network, the method comprising receiving a first portion of feature map data from local storage, selecting a first set of subportions of the first portion of feature map data, compressing the subportions to produce a first plurality of sections of compressed feature map data and instructing the storage of the sections into external storage. The method also comprises receiving a second plurality of sections of compressed feature map data from the external storage, decompressing the sections to produce a second set of subportions of the second portion of feature map data and storing the second portion of feature map data in local storage. The first and second sets of subportions each correspond to a predetermined format of subdivision and the method comprises selecting the predetermined format of subdivision from a plurality of predetermined formats of subdivision.
    Type: Grant
    Filed: March 30, 2020
    Date of Patent: February 20, 2024
    Assignee: Arm Limited
    Inventors: Erik Persson, Stefan Johannes Frid, Elliot Maurice Simon Rosemarine
  • Patent number: 11886768
    Abstract: Embodiments are disclosed for real time generative audio for brush and canvas interaction in digital drawing. The method may include receiving a user input and a selection of a tool for generating audio for a digital drawing interaction. The method may further include generating intermediary audio data based on the user input and the tool selection, wherein the intermediary audio data includes a pitch and a frequency. The method may further include processing, by a trained audio transformation model and through a series of one or more layers of the trained audio transformation model, the intermediary audio data. The method may further include adjusting the series of one or more layers of the trained audio transformation model to include one or more additional layers to produce an adjusted audio transformation model. The method may further include generating, by the adjusted audio transformation model, an audio sample based on the intermediary audio data.
    Type: Grant
    Filed: April 29, 2022
    Date of Patent: January 30, 2024
    Assignee: Adobe Inc.
    Inventors: Pranay Kumar, Nipun Jindal
  • Patent number: 11868736
    Abstract: Introduced here is a computer program that is representative of a software-implemented collaboration platform that is designed to facilitate conversations in virtual environments, document those conversations, and analyze those conversations, all in real time. The collaboration platform can include or integrate tools for turning ideas—expressed through voice—into templatized, metadata-rich data structures called “knowledge objects.” Discourse throughout a conversation can be converted into a transcription (or simply “transcript”), parsed to identify topical shifts, and then segmented based on the topical shifts. Separately documenting each topic in the form of its own “knowledge object” allows the collaboration platform to not only better catalogue what was discussed in a single ideation session, but also monitor discussion of the same topic over multiple ideation sessions.
    Type: Grant
    Filed: November 9, 2022
    Date of Patent: January 9, 2024
    Assignee: Moonbeam, Inc.
    Inventors: Nirav S. Desai, Trond Tamaio Nilsen, Philip Roger Lamb
  • Patent number: 11868884
    Abstract: The present disclosure provides methods and systems for providing machine learning model service. The method may comprise: (a) generating, by a first computing system, a first output data using a first machine learning model, wherein the first machine learning model is trained on a first training dataset; (b) transmitting the first output data to a second computing system, wherein the first training dataset and the first machine learning model are inaccessible to the second computing system; (c) creating an input data by joining the first output data with a selected set of input features accessible to the second computing system; and (d) generating a second output data using a second machine learning model to process the input data.
    Type: Grant
    Filed: June 17, 2020
    Date of Patent: January 9, 2024
    Assignee: MOLOCO, INC.
    Inventors: Jian Gong Deng, Ikkjin Ahn, Daeseob Lim, Bokyung Choi, Sechan Oh, William Kanaan
  • Patent number: 11862146
    Abstract: Audio signals of speech may be processed using an acoustic model. An acoustic model may be implemented with multiple streams of processing where different streams perform processing using different dilation rates. For example, a first stream may process features of the audio signal with one or more convolutional neural network layers having a first dilation rate, and a second stream may process features of the audio signal with one or more convolutional neural network layers having a second dilation rate. Each stream may compute a stream vector, and the stream vectors may be combined into a vector of speech unit scores, where the vector of speech unit scores provides information about the acoustic content of the audio signal. The vector of speech unit scores may be used for any appropriate speech application, such as automatic speech recognition.
    Type: Grant
    Filed: July 2, 2020
    Date of Patent: January 2, 2024
    Assignee: ASAPP, INC.
    Inventors: Kyu Jeong Han, Tao Ma, Daniel Povey
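A minimal sketch of the multi-stream idea, using causal dilated 1-D convolutions and summing the streams (the patent combines per-stream vectors into speech unit scores; the tiny kernels and summation here are simplifications):

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    # Causal 1-D convolution: output[t] depends only on x[t], x[t - d],
    # x[t - 2d], ... where d is the dilation rate.
    pad = dilation * (len(kernel) - 1)
    xp = np.concatenate([np.zeros(pad), np.asarray(x, dtype=float)])
    return np.array([sum(k * xp[t + i * dilation]
                         for i, k in enumerate(kernel))
                     for t in range(len(x))])

def multi_stream_scores(x, streams):
    # streams: list of (kernel, dilation) pairs; each stream sees the
    # same features at a different temporal resolution, and the stream
    # outputs are summed into one per-frame score vector.
    return sum(dilated_conv1d(x, k, d) for k, d in streams)
```

Larger dilation rates widen a stream's receptive field without adding parameters, which is why mixing rates captures both short- and long-range acoustic context.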
  • Patent number: 11848748
    Abstract: An apparatus and method for enhancing broadcast radio includes a DNN trained on data sets of audio created from a synthesized broadcasting process and original source audio. Broadcast radio signals are received at a radio module and processed through the DNN to produce enhanced audio.
    Type: Grant
    Filed: December 14, 2020
    Date of Patent: December 19, 2023
    Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Joseph Kampeas, Igal Kotzer
  • Patent number: 11842736
    Abstract: Provided is an in-ear device and associated computational support system that leverages machine learning to interpret sensor data descriptive of one or more in-ear phenomena during subvocalization by the user. An electronic device can receive sensor data generated by at least one sensor at least partially positioned within an ear of a user, wherein the sensor data was generated by the at least one sensor concurrently with the user subvocalizing a subvocalized utterance. The electronic device can then process the sensor data with a machine-learned subvocalization interpretation model to generate an interpretation of the subvocalized utterance as an output of the machine-learned subvocalization interpretation model.
    Type: Grant
    Filed: February 10, 2023
    Date of Patent: December 12, 2023
    Assignee: Google LLC
    Inventors: Yaroslav Volovich, Ant Oztaskent, Blaise Aguera-Arcas
  • Patent number: 11822657
    Abstract: Disclosed is a computer implemented method for malware detection that analyses a file on a per-packet basis. The method receives a packet of one or more packets associated with a file, converts the binary content associated with the packet into a digital representation, and tokenizes the plain text content associated with the packet. The method extracts one or more n-gram features, an entropy feature, and a domain feature from the converted content of the packet and applies a trained machine learning model to the one or more features extracted from the packet. The output of the machine learning model is a probability of maliciousness associated with the received packet. If the probability of maliciousness is above a threshold value, the method determines that the file associated with the received packet is malicious.
    Type: Grant
    Filed: April 20, 2022
    Date of Patent: November 21, 2023
    Assignee: Zscaler, Inc.
    Inventors: Huihsin Tseng, Hao Xu, Jian L Zhen
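Two of the named per-packet features, byte entropy and byte n-grams, are standard and easy to sketch; the helper names below are mine, not the patent's:

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    # Shannon entropy in bits per byte; high values often indicate
    # packed or encrypted payloads.
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

def ngram_features(data: bytes, n: int = 2) -> Counter:
    # Counts of overlapping byte n-grams in the packet payload, usable
    # as sparse features for a downstream classifier.
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))
```

A trained model would consume these alongside the domain feature the abstract mentions; that feature depends on packet metadata not modeled here.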
  • Patent number: 11817081
    Abstract: A learning device calculates an image feature using a model (image encoder) that receives an image and outputs the image feature obtained by mapping the image into a latent space. The learning device calculates an audio feature using a model (audio encoder) that receives a speech in a predetermined language and outputs the audio feature obtained by mapping the speech into the latent space, and that includes a neural network provided with a self-attention mechanism. The learning device updates parameters of the models used by an image feature calculation unit and an audio feature calculation unit such that the image feature of a first image is similar to the audio feature of a speech corresponding to the first image.
    Type: Grant
    Filed: March 31, 2021
    Date of Patent: November 14, 2023
    Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, MASSACHUSETTS INSTITUTE OF TECHNOLOGY
    Inventors: Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, James R. Glass, David Harwath
  • Patent number: 11810471
    Abstract: A computer system analyses audio data representing a user speaking words from a body of text and identifies occasions where the user mispronounces an expected phoneme. Mispronunciation of the expected phoneme is identified by comparison with a phonetic sequence corresponding to the text, based on a predetermined or user-selected language model. The system requires the user to read continuously for a period of time, so that the user cannot hide any tendency they have to pronounce the words of the text either incorrectly or differently to the expected phonemes from the language model. The system operates on the basis of comparing the similarity of the spoken sounds of the user with the expected phonemes for the body of text, and it is not necessary to convert the user's speech to text. As the computer system need only work with the similarity scores and the sequence of expected phonemes, it can be implemented in a computationally efficient manner.
    Type: Grant
    Filed: May 13, 2019
    Date of Patent: November 7, 2023
    Assignee: SPEECH ENGINEERING LIMITED
    Inventor: David Matthew Karas
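The similarity-score comparison can be sketched as scanning aligned per-segment phoneme scores against the expected sequence; the threshold value and data shapes are assumptions for illustration:

```python
def mispronounced_positions(similarity, expected, threshold=0.6):
    # similarity: one {phoneme: score} dict per spoken segment, aligned
    # with the expected phoneme sequence from the language model; flag
    # positions where the expected phoneme scores below the threshold.
    return [i for i, (scores, ph) in enumerate(zip(similarity, expected))
            if scores.get(ph, 0.0) < threshold]
```

As the abstract notes, nothing here requires converting speech to text: only similarity scores and the expected phoneme sequence are needed.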
  • Patent number: 11810552
    Abstract: The present disclosure provides an artificial intelligence (AI) system for sequence-to-sequence modeling with attention adapted for streaming applications. The AI system comprises at least one processor; and memory having instructions stored thereon that, when executed by the processor, cause the AI system to process each input frame in a sequence of input frames through layers of a deep neural network (DNN) to produce a sequence of outputs. At least some of the layers of the DNN include a dual self-attention module having a dual non-causal and causal architecture attending to non-causal frames and causal frames. Further, the AI system renders the sequence of outputs.
    Type: Grant
    Filed: July 2, 2021
    Date of Patent: November 7, 2023
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Niko Moritz, Takaaki Hori, Jonathan Le Roux
  • Patent number: 11803746
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural programming. One of the methods includes processing a current neural network input using a core recurrent neural network to generate a neural network output; determining, from the neural network output, whether or not to end a currently invoked program and to return to a calling program from the set of programs; determining, from the neural network output, a next program to be called; determining, from the neural network output, contents of arguments to the next program to be called; receiving a representation of a current state of the environment; and generating a next neural network input from an embedding for the next program to be called and the representation of the current state of the environment.
    Type: Grant
    Filed: April 27, 2020
    Date of Patent: October 31, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Scott Ellison Reed, Joao Ferdinando Gomes de Freitas
  • Patent number: 11804216
    Abstract: Systems and methods for generating training data for a supervised topic modeling system from outputs of a topic discovery model are described herein. In an embodiment, a system receives a plurality of digitally stored call transcripts and, using a topic model, generates an output which identifies a plurality of topics represented in the plurality of digitally stored call transcripts. Using the output of the topic model, the system generates an input dataset for a supervised learning model by identifying a first subset of the plurality of digitally stored call transcripts that include the particular topic, storing a positive value for the first subset, identifying a second subset that does not include the particular topic, and storing a negative value for the second subset. The input training dataset is then used to train a supervised learning model.
    Type: Grant
    Filed: August 3, 2022
    Date of Patent: October 31, 2023
    Assignee: Invoca, Inc.
    Inventors: Michael McCourt, Anoop Praturu
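The labeling step that turns topic-model output into supervised training data reduces to a membership test; the dictionary layout is an assumed simplification:

```python
def build_training_set(transcripts, topics_per_transcript, topic):
    # transcripts: {id: text}; topics_per_transcript: {id: set of topics
    # assigned by the unsupervised topic model}.  Transcripts containing
    # the topic get a positive label, the rest a negative one.
    return [(text, 1 if topic in topics_per_transcript[tid] else 0)
            for tid, text in transcripts.items()]
```

The resulting (text, label) pairs are exactly what a standard supervised text classifier expects as input.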
  • Patent number: 11790894
    Abstract: A system uses conversation engines to process natural language requests and conduct automatic conversations with users. The system generates responses to users in an online conversation. The system ranks generated user responses for the online conversation. The system generates a context vector based on a sequence of utterances of the conversation and generates response vectors for generated user responses. The system ranks the user responses based on a comparison of the context vectors and user response vectors. The system uses a machine learning based model that uses a pretrained neural network that supports multiple languages. The system determines a context of an utterance based on utterances in the conversation. The system generates responses and ranks them based on the context. The ranked responses are used to respond to the user.
    Type: Grant
    Filed: March 15, 2021
    Date of Patent: October 17, 2023
    Assignee: Salesforce, Inc.
    Inventors: Yixin Mao, Zachary Alexander, Victor Winslow Yee, Joseph R. Zeimen, Na Cheng, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
  • Patent number: 11790921
    Abstract: Systems, methods, and non-transitory computer-readable media can obtain a stream of audio waveform data that represents speech involving a plurality of speakers. As the stream of audio waveform data is obtained, a plurality of audio chunks can be determined. An audio chunk can be associated with one or more identity embeddings. The stream of audio waveform data can be segmented into a plurality of segments based on the plurality of audio chunks and respective identity embeddings associated with the plurality of audio chunks. A segment can be associated with a speaker included in the plurality of speakers. Information describing the plurality of segments associated with the stream of audio waveform data can be provided.
    Type: Grant
    Filed: February 8, 2021
    Date of Patent: October 17, 2023
    Assignee: OTO Systems Inc.
    Inventors: Valentin Alain Jean Perret, Nándor Kedves, Nicolas Lucien Perony
  • Patent number: 11783173
    Abstract: A processing unit can train a model as a joint multi-domain recurrent neural network (JRNN), such as a bi-directional recurrent neural network (bRNN) and/or a recurrent neural network with long-short term memory (RNN-LSTM) for spoken language understanding (SLU). The processing unit can use the trained model to, e.g., jointly model slot filling, intent determination, and domain classification. The joint multi-domain model described herein can estimate a complete semantic frame per query, and the joint multi-domain model enables multi-task deep learning leveraging the data from multiple domains. The joint multi-domain recurrent neural network (JRNN) can leverage semantic intents (such as, finding or identifying, e.g., a domain specific goal) and slots (such as, dates, times, locations, subjects, etc.) across multiple domains.
    Type: Grant
    Filed: August 4, 2016
    Date of Patent: October 10, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dilek Z Hakkani-Tur, Asli Celikyilmaz, Yun-Nung Chen, Li Deng, Jianfeng Gao, Gokhan Tur, Ye-Yi Wang
  • Patent number: 11783811
    Abstract: A computer-implemented method is provided for model training. The method includes training a second end-to-end neural speech recognition model that has a bidirectional encoder to output same symbols from an output probability lattice of the second end-to-end neural speech recognition model as from an output probability lattice of a trained first end-to-end neural speech recognition model having a unidirectional encoder. The method also includes building a third end-to-end neural speech recognition model that has a unidirectional encoder by training the third end-to-end neural speech recognition model as a student by using the trained second end-to-end neural speech recognition model as a teacher in a knowledge distillation method.
    Type: Grant
    Filed: September 24, 2020
    Date of Patent: October 10, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gakuto Kurata, George Andrei Saon
  • Patent number: 11769035
    Abstract: Techniques are described for automatically determining runtime configurations used to execute recurrent neural networks (RNNs) for training or inference. One such configuration involves determining whether to execute an RNN in a looped, or “rolled,” execution pattern or in a non-looped, or “unrolled,” execution pattern. Execution of an RNN using a rolled execution pattern generally consumes less memory resources than execution using an unrolled execution pattern, whereas execution of an RNN using an unrolled execution pattern typically executes faster. The configuration choice thus involves a time-memory tradeoff that can significantly affect the performance of the RNN execution. This determination is made automatically by a machine learning (ML) runtime by analyzing various factors such as, for example, a type of RNN being executed, the network structure of the RNN, characteristics of the input data to the RNN, an amount of computing resources available, and so forth.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: September 26, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Lai Wei, Hagay Lupesko, Anirudh Acharya, Ankit Khedia, Sandeep Krishnamurthy, Cheng-Che Lee, Kalyanee Shriram Chendke, Vandana Kannan, Roshani Nagmote
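A toy version of the rolled-vs-unrolled decision might weigh estimated activation memory against available memory; this heuristic and its constants are invented for illustration and far simpler than the factors the patent lists:

```python
def choose_execution_pattern(seq_len, hidden_size, free_mem_bytes,
                             bytes_per_elem=4, safety_margin=2.0):
    # Hypothetical heuristic: unrolling materializes roughly one hidden
    # state per time step, so prefer the faster "unrolled" pattern only
    # when the estimated activation memory fits in free memory.  A real
    # ML runtime would also weigh RNN type, structure, and input shape.
    est_bytes = seq_len * hidden_size * bytes_per_elem * safety_margin
    return "unrolled" if est_bytes <= free_mem_bytes else "rolled"
```

The time-memory tradeoff in the abstract is exactly this kind of threshold decision, just informed by richer signals.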
  • Patent number: 11769044
    Abstract: A neural network mapping method and a neural network mapping apparatus are provided. The method includes: mapping a calculation task for a preset feature map of each network layer in a plurality of network layers in a convolutional neural network to at least one processing element of a chip; acquiring the number of phases needed by a plurality of processing elements in the chip for completing the calculation tasks, and performing a first stage of balancing on the number of phases of the plurality of processing elements; and based on the number of the phases of the plurality of processing elements obtained after the first stage of balancing, mapping the calculation task for the preset feature map of each network layer in the plurality of network layers in the convolutional neural network to at least one processing element of the chip subjected to the first stage of balancing.
    Type: Grant
    Filed: October 27, 2020
    Date of Patent: September 26, 2023
    Assignee: LYNXI TECHNOLOGIES CO., LTD.
    Inventors: Weihao Zhang, Han Li, Chuan Hu, Yaolong Zhu
  • Patent number: 11765524
    Abstract: A hearing aid with the variable number of channels includes: a microphone that receives a sound signal; an AD converter that converts the sound signal input from the microphone into a digital signal and outputs the converted digital signal; a controller that determines a filter bank channel for processing the digital signal output from the AD converter; a buffer unit that delays the digital signal based on the determined filter bank channel; a signal processor that includes at least one filter bank channel, synthesizes the digital signal using the determined filter bank channel and outputs the synthesized digital signal; a DA converter that converts the digital signal into the sound signal and outputs the converted sound signal; and a speaker that outputs the sound signal output from the DA converter, in which the controller determines the filter bank channel for processing the digital signal based on a preset condition.
    Type: Grant
    Filed: May 11, 2023
    Date of Patent: September 19, 2023
    Assignee: Korea Photonics Technology Institute
    Inventors: Seon Man Kim, Kwang Hoon Lee
  • Patent number: 11756529
    Abstract: Proposed are a method and apparatus for speech recognition, and a storage medium. The specific solution includes: obtaining audio data to be recognized; decoding the audio data to obtain a first syllable of a to-be-converted word, in which the first syllable is a combination of at least one phoneme corresponding to the to-be-converted word; obtaining a sentence to which the to-be-converted word belongs and a converted word in the sentence, and obtaining a second syllable of the converted word; encoding the first syllable and the second syllable to generate first encoding information of the first syllable; and decoding the first encoding information to obtain a text corresponding to the to-be-converted word.
    Type: Grant
    Filed: December 16, 2020
    Date of Patent: September 12, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Liao Zhang, Xiaoyin Fu, Zhengxiang Jiang, Mingxin Liang, Junyao Shao, Qi Zhang, Zhijie Chen, Qiguang Zang
  • Patent number: 11749260
    Abstract: Disclosed is a method for speech recognition performed by one or more processors of a computing device. The method includes inputting voice information into an encoder to extract a first feature vector and calculating a first loss function. The method includes inputting the first feature vector extracted from the encoder to a first decoder to perform prediction on the voice information, calculating a second loss function, and extracting a second feature vector. The method includes inputting a second feature vector extracted from the first decoder to a second decoder to perform grapheme-based prediction, and calculating a third loss function. The method includes training at least one of the encoder, the first decoder, or the second decoder based on the first loss function, the second loss function, and the third loss function.
    Type: Grant
    Filed: September 23, 2022
    Date of Patent: September 5, 2023
    Assignee: ACTIONPOWER CORP.
    Inventors: Hwanbok Mun, Dongchan Shin, Gyujin Kim, Seongmin Park, Jihwa Lee
  • Patent number: 11748567
    Abstract: Described herein are embodiments of a framework named total correlation variational autoencoder (TC_VAE) to disentangle syntax and semantics by making use of total correlation penalties of KL divergences. One or more Kullback-Leibler (KL) divergence terms in the loss for a variational autoencoder are decomposed so that the generated hidden variables may be separated. Embodiments of the TC_VAE framework were examined on semantic similarity tasks and syntactic similarity tasks. Experimental results show that better disentanglement between syntactic and semantic representations has been achieved compared with state-of-the-art (SOTA) results on the same data sets in similar settings.
    Type: Grant
    Filed: July 10, 2020
    Date of Patent: September 5, 2023
    Assignee: Baidu USA LLC
    Inventors: Dingcheng Li, Shaogang Ren, Ping Li
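The KL term that a VAE loss decomposes is, for a diagonal Gaussian posterior against a unit Gaussian prior, a per-dimension closed-form quantity, which is what makes splitting the penalty across blocks of latent dimensions possible. A minimal sketch; the split of dimensions into a "syntax" block and a "semantics" block is an illustrative assumption:

```python
import math

# Hedged sketch: closed-form KL(N(mu, sigma^2) || N(0, 1)) per latent
# dimension, so the penalty can be summed separately over groups of
# dimensions. The grouping below is illustrative, not the patent's.

def kl_per_dim(mu, sigma):
    return [0.5 * (m * m + s * s - 1.0 - 2.0 * math.log(s))
            for m, s in zip(mu, sigma)]

mu = [0.0, 0.5, -0.5, 0.0]
sigma = [1.0, 1.0, 0.5, 2.0]
kls = kl_per_dim(mu, sigma)
syntax_kl = sum(kls[:2])    # first block of latent dimensions
semantic_kl = sum(kls[2:])  # second block of latent dimensions
```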
  • Patent number: 11743210
    Abstract: The disclosed exemplary embodiments include computer-implemented apparatuses and processes that automatically populate deep-linked interfaces based on programmatically established chatbot sessions. For example, an apparatus may determine a candidate parameter value for a first parameter of an exchange of data based on received messaging information and on information characterizing prior exchanges of data between a device and the apparatus. The apparatus may also generate interface data that associates the first candidate parameter value with a corresponding interface element of a first digital interface, and may store the interface data within a data repository. In some instances, the apparatus may transmit linking data associated with the stored interface data to the device, and an application program executed by the device may present a representation of the linking data within a second digital interface.
    Type: Grant
    Filed: April 3, 2020
    Date of Patent: August 29, 2023
    Assignee: The Toronto-Dominion Bank
    Inventors: Tae Gyun Moon, Robert Alexander McCarter, Kheiver Kayode Roberts
  • Patent number: 11727920
    Abstract: A RNN-T model includes a prediction network configured to, at each of a plurality of time steps subsequent to an initial time step, receive a sequence of non-blank symbols. For each non-blank symbol the prediction network is also configured to generate, using a shared embedding matrix, an embedding of the corresponding non-blank symbol, assign a respective position vector to the corresponding non-blank symbol, and weight the embedding proportional to a similarity between the embedding and the respective position vector. The prediction network is also configured to generate a single embedding vector at the corresponding time step. The RNN-T model also includes a joint network configured to, at each of the plurality of time steps subsequent to the initial time step, receive the single embedding vector generated as output from the prediction network at the corresponding time step and generate a probability distribution over possible speech recognition hypotheses.
    Type: Grant
    Filed: May 26, 2021
    Date of Patent: August 15, 2023
    Assignee: Google LLC
    Inventors: Rami Botros, Tara Sainath
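The prediction-network computation described in the abstract (shared embedding matrix, per-position vectors, similarity weighting, one output vector) can be sketched as below; the dimensions, the dot-product similarity, and the softmax-weighted average are assumptions for illustration, not the patent's exact formulation:

```python
import numpy as np

# Hedged sketch: embed each non-blank symbol via a shared matrix, weight
# each embedding by its similarity to a per-position vector, and combine
# into a single embedding vector for the time step.

rng = np.random.default_rng(0)
vocab, dim, history = 8, 4, 3
shared_embedding = rng.normal(size=(vocab, dim))
position_vectors = rng.normal(size=(history, dim))

def single_embedding(symbols):
    embs = shared_embedding[symbols]                # (history, dim)
    sims = np.sum(embs * position_vectors, axis=1)  # dot-product similarity
    weights = np.exp(sims - sims.max())             # softmax normalization
    weights /= weights.sum()
    return (weights[:, None] * embs).sum(axis=0)    # (dim,)

vec = single_embedding([2, 5, 1])
```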
  • Patent number: 11715461
    Abstract: Computer implemented method and system for automatic speech recognition. A first speech sequence is processed, using a time reduction operation of an encoder NN, into a second speech sequence comprising a second set of speech frame feature vectors that each concatenate information from a respective plurality of speech frame feature vectors included in the first set and includes fewer speech frame feature vectors than the first speech sequence. The second speech sequence is transformed, using a self-attention operation of the encoder NN, into a third speech sequence comprising a third set of speech frame feature vectors. The third speech sequence is processed using a probability operation of the encoder NN, to predict a sequence of first labels corresponding to the third set of speech frame feature vectors, and using a decoder NN to predict a sequence of second labels corresponding to the third set of speech frame feature vectors.
    Type: Grant
    Filed: October 21, 2020
    Date of Patent: August 1, 2023
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Md Akmal Haidar, Chao Xing
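The time-reduction operation, concatenating information from several consecutive frame feature vectors into one and shortening the sequence, can be sketched as a reshape; the reduction factor and the zero-padding policy here are assumptions:

```python
import numpy as np

# Hedged sketch: merge every r consecutive frame feature vectors into one
# concatenated vector, reducing sequence length by a factor of r.

def time_reduce(frames, r=2):
    t, d = frames.shape
    pad = (-t) % r                     # pad so t is divisible by r
    if pad:
        frames = np.vstack([frames, np.zeros((pad, d))])
    return frames.reshape(-1, r * d)

x = np.arange(12.0).reshape(6, 2)      # 6 frames, 2 features each
y = time_reduce(x, r=2)                # 3 frames, 4 features each
```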
  • Patent number: 11685395
    Abstract: An autonomous driving assistance device includes: a determination unit for determining whether or not a driver of a vehicle needs a rest on the basis of detection information of a state of the driver; and a control unit for causing an output device of the vehicle to output a pattern that prompts the driver to sleep through at least one of sight, hearing, or touch during a period in which the vehicle has started moving to a parking area and is parked at the parking area in a case where the determination unit has determined that the driver needs a rest.
    Type: Grant
    Filed: November 18, 2021
    Date of Patent: June 27, 2023
    Assignee: MITSUBISHI ELECTRIC CORPORATION
    Inventors: Misato Yuasa, Shinsaku Fukutaka, Munetaka Nishihira, Akiko Imaishi, Tsuyoshi Sempuku
  • Patent number: 11676026
    Abstract: Computer-implemented, machine-learning systems and methods relate to a neural network having at least two subnetworks, i.e., a first subnetwork and a second subnetwork. The systems and methods estimate the partial derivative(s) of an objective with respect to (i) an output activation of a node in the first subnetwork, (ii) the input to the node, and/or (iii) the connection weights to the node. The estimated partial derivative(s) are stored in a data store and provided as input to the second subnetwork. Because the estimated partial derivative(s) are persisted in a data store, the second subnetwork has access to them even after the second subnetwork has gone through subsequent training iterations. Using this information, the second subnetwork can compute classifications and regression functions that can help, for example, in the training of the first subnetwork.
    Type: Grant
    Filed: June 4, 2019
    Date of Patent: June 13, 2023
    Assignee: D5AI LLC
    Inventor: James K. Baker
  • Patent number: 11676625
    Abstract: A method for training an endpointer model includes obtaining training samples that include short-form speech utterances and long-form speech utterances. The method also includes providing a short-form speech utterance as input to a shared neural network, the shared neural network configured to learn shared hidden representations suitable for both voice activity detection (VAD) and end-of-query (EOQ) detection. The method also includes generating, using a VAD classifier, a sequence of predicted VAD labels and determining a VAD loss by comparing the sequence of predicted VAD labels to a corresponding sequence of reference VAD labels. The method also includes generating, using an EOQ classifier, a sequence of predicted EOQ labels and determining an EOQ loss by comparing the sequence of predicted EOQ labels to a corresponding sequence of reference EOQ labels. The method also includes training, using a cross-entropy criterion, the endpointer model based on the VAD loss and the EOQ loss.
    Type: Grant
    Filed: January 20, 2021
    Date of Patent: June 13, 2023
    Assignee: Google LLC
    Inventors: Shuo-Yiin Chang, Bo Li, Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon
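The training objective pairs a VAD cross-entropy and an EOQ cross-entropy over the predicted label sequences. A minimal sketch, assuming binary frame labels and equal weighting of the two terms (both assumptions, not taken from the patent):

```python
import math

# Hedged sketch: mean binary cross-entropy per label sequence, with the
# VAD and EOQ losses summed into one endpointer training loss.

def cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy over a predicted label sequence."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)   # clamp for numerical safety
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

vad_loss = cross_entropy([0.9, 0.8, 0.2], [1, 1, 0])
eoq_loss = cross_entropy([0.1, 0.3, 0.7], [0, 0, 1])
endpointer_loss = vad_loss + eoq_loss
```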
  • Patent number: 11676008
    Abstract: The present disclosure provides systems and methods that enable parameter-efficient transfer learning, multi-task learning, and/or other forms of model re-purposing such as model personalization or domain adaptation. In particular, as one example, a computing system can obtain a machine-learned model that has been previously trained on a first training dataset to perform a first task. The machine-learned model can include a first set of learnable parameters. The computing system can modify the machine-learned model to include a model patch, where the model patch includes a second set of learnable parameters. The computing system can train the machine-learned model on a second training dataset to perform a second task that is different from the first task, which may include learning new values for the second set of learnable parameters included in the model patch while keeping at least some (e.g., all) of the first set of parameters fixed.
    Type: Grant
    Filed: September 20, 2019
    Date of Patent: June 13, 2023
    Assignee: GOOGLE LLC
    Inventors: Mark Sandler, Andrey Zhmoginov, Andrew Gerald Howard, Pramod Kaushik Mudrakarta
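The core of the model-patch idea is that gradient updates touch only the patch parameters while the original parameter set stays frozen. A toy sketch on a one-parameter linear model where only a bias "patch" is trained on the second task; the model, loss, and learning rate are illustrative assumptions:

```python
# Hedged sketch: keep the pre-trained weight w_frozen fixed and learn only
# the patch parameter b_patch on new-task data, via gradient descent on
# mean squared error.

def train_patch(xs, ys, w_frozen, b_patch=0.0, lr=0.1, steps=200):
    n = len(xs)
    for _ in range(steps):
        # gradient of MSE with respect to the patch parameter only
        grad_b = sum(2 * (w_frozen * x + b_patch - y)
                     for x, y in zip(xs, ys)) / n
        b_patch -= lr * grad_b        # w_frozen is never updated
    return b_patch

# second task: same slope as the frozen weight, shifted target (y = x + 1)
xs, ys = [0.0, 1.0, 2.0], [1.0, 2.0, 3.0]
b = train_patch(xs, ys, w_frozen=1.0)
```

Here the patch converges to the offset the new task requires while the frozen weight is untouched, which is the parameter-efficiency being claimed.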
  • Patent number: 11672472
    Abstract: Provided herein is a method and system for the estimation of the apnea-hypopnea index (AHI), as an indicator of obstructive sleep apnea (OSA) severity, by combining speech descriptors from three separate and distinct speech signal domains. These domains include the acoustic short-term features (STF) of continuous speech, the long-term features (LTF) of continuous speech, and features of sustained vowels (SVF). Combining these speech descriptors may provide the ability to estimate the severity of OSA using statistical learning and speech analysis approaches.
    Type: Grant
    Filed: July 10, 2017
    Date of Patent: June 13, 2023
    Assignees: B.G. NEGEV TECHNOLOGIES AND APPLICATIONS LTD., AT BEN-GURION UNIVERSITY, MOR RESEARCH APPLICATIONS LTD.
    Inventors: Yaniv Zigel, Dvir Ben Or, Ariel Tarasiuk, Eliran Dafna
  • Patent number: 11670322
    Abstract: A method and system are provided for extracting features from digital audio signals which exhibit variations in pitch, timbre, decay, reverberation, and other psychoacoustic attributes and learning, from the extracted features, an artificial neural network model for generating contextual latent-space representations of digital audio signals. A method and system are also provided for learning an artificial neural network model for generating consistent latent-space representations of digital audio signals in which the generated latent-space representations are comparable for the purposes of determining psychoacoustic similarity between digital audio signals.
    Type: Grant
    Filed: July 29, 2020
    Date of Patent: June 6, 2023
    Assignee: Distributed Creation Inc.
    Inventors: Alejandro Koretzky, Naveen Sasalu Rajashekharappa
  • Patent number: 11625769
    Abstract: A compliance determination and enforcement platform is described. A plurality of factors are stored in association with each of a plurality of accounts. A factor entering module enters factors from each user account into a compliance score model. The compliance score model determines a compliance score for each one of the accounts based on the respective factors associated with the respective account. A comparator compares the compliance score for each account with a compliance reference score to determine a subset of the accounts that fail compliance and a subset of the accounts that meet compliance. A flagging unit flags the user accounts that fail compliance to indicate non-compliant accounts. A corrective action system allows for determining, for each one of the accounts that is flagged as non-compliant, whether the account is bad or good, entering the determination into a feedback system and closing the account.
    Type: Grant
    Filed: September 21, 2016
    Date of Patent: April 11, 2023
    Assignee: Coinbase, Inc.
    Inventors: Bradley J. Larson, Linda Xie, Paul Jabaay, Jeffrey B. Kern
  • Patent number: 11625572
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence from a source sequence. In one aspect, the system includes a recurrent neural network configured to, at each time step, receive an input for the time step and process the input to generate a progress score and a set of output scores; and a subsystem configured to, at each time step, generate the recurrent neural network input and provide the input to the recurrent neural network; determine, from the progress score, whether or not to emit a new output at the time step; and, in response to determining to emit a new output, select an output using the output scores and emit the selected output as the output at a next position in the output order.
    Type: Grant
    Filed: May 3, 2018
    Date of Patent: April 11, 2023
    Assignee: Google LLC
    Inventors: Chung-Cheng Chiu, Navdeep Jaitly, John Dieterich Lawson, George Jay Tucker
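The per-time-step decision above, emitting a new output only when the progress score warrants it, can be sketched as follows; the fixed threshold and argmax selection are assumptions for illustration, not the patent's mechanism:

```python
# Hedged sketch: at each time step the network yields a progress score and
# a set of output scores; an output is emitted only when the progress score
# crosses a threshold, otherwise no output is produced at this step.

def step(progress_score, output_scores, threshold=0.5):
    if progress_score >= threshold:
        # select the highest-scoring output and emit it
        return max(range(len(output_scores)), key=output_scores.__getitem__)
    return None  # no new output at this time step

emitted = [step(p, s) for p, s in [(0.2, [0.1, 0.9]), (0.8, [0.7, 0.3])]]
```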
  • Patent number: 11610579
    Abstract: Determining slot value(s) based on received natural language input and based on descriptor(s) for the slot(s). In some implementations, natural language input is received as part of human-to-automated assistant dialog. A natural language input embedding is generated based on token(s) of the natural language input. Further, descriptor embedding(s) are generated (or received), where each of the descriptor embeddings is generated based on descriptor(s) for a corresponding slot that is assigned to a domain indicated by the dialog. The natural language input embedding and the descriptor embedding(s) are applied to layer(s) of a neural network model to determine, for each of the slot(s), which token(s) of the natural language input correspond to the slot. A command is generated that includes slot value(s) for slot(s), where the slot value(s) for one or more of slot(s) are determined based on the token(s) determined to correspond to the slot(s).
    Type: Grant
    Filed: June 18, 2017
    Date of Patent: March 21, 2023
    Assignee: GOOGLE LLC
    Inventors: Ankur Bapna, Larry Paul Heck
  • Patent number: 11599768
    Abstract: A method for recommending an action to a user of a user device includes receiving first user action data corresponding to a first user action and receiving second user action data corresponding to a second user action. The method also includes generating, based on the first user action data and the second user action data and using a feedforward artificial neural network, a recommendation for a next user action. The method also includes causing the recommendation for the next user action to be communicated to the user device.
    Type: Grant
    Filed: July 18, 2019
    Date of Patent: March 7, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Kai Niu, Jiali Huang, Christopher H. Doan, Michael D. Elder
  • Patent number: 11593634
    Abstract: This disclosure relates to methods, non-transitory computer readable media, and systems that asynchronously train a machine learning model across client devices that implement local versions of the model while preserving client data privacy. To train the model across devices, in some embodiments, the disclosed systems send global parameters for a global machine learning model from a server device to client devices. A subset of the client devices uses local machine learning models corresponding to the global model and client training data to modify the global parameters. Based on those modifications, the subset of client devices sends modified parameter indicators to the server device for the server device to use in adjusting the global parameters. By utilizing the modified parameter indicators (and not client training data), in certain implementations, the disclosed systems accurately train a machine learning model without exposing training data from the client device.
    Type: Grant
    Filed: June 19, 2018
    Date of Patent: February 28, 2023
    Assignee: Adobe Inc.
    Inventors: Sunav Choudhary, Saurabh Kumar Mishra, Manoj Ghuhan A, Ankur Garg
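The server-side step can be sketched as aggregating client-supplied modified parameter indicators without ever receiving client training data. Treating the indicators as plain parameter deltas and averaging them is an illustrative assumption about how the global parameters are adjusted:

```python
# Hedged sketch: the server adjusts each global parameter by the average
# of the deltas reported by the participating subset of clients.

def apply_client_updates(global_params, client_deltas):
    n = len(client_deltas)
    return [g + sum(d[i] for d in client_deltas) / n
            for i, g in enumerate(global_params)]

global_params = [0.0, 1.0]
deltas = [[0.2, -0.4],   # from client A
          [0.4, 0.0]]    # from client B
new_params = apply_client_updates(global_params, deltas)
```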
  • Patent number: 11586964
    Abstract: Methods, apparatus, and processor-readable storage media for device component management using deep learning techniques are provided herein. An example computer-implemented method includes obtaining telemetry data from one or more enterprise devices; determining, for each of the one or more enterprise devices, values for multiple device attributes by processing the obtained telemetry data; generating, for each of the one or more enterprise devices, at least one prediction related to lifecycle information of at least one device component by processing the determined attribute values using one or more deep learning techniques; and performing one or more automated actions based at least in part on the at least one generated prediction.
    Type: Grant
    Filed: January 30, 2020
    Date of Patent: February 21, 2023
    Assignee: Dell Products L.P.
    Inventors: Parminder Singh Sethi, Akanksha Goel, Hung T. Dinh, Sabu K. Syed, James S. Watt, Kannappan Ramu
  • Patent number: 11580978
    Abstract: Provided is an in-ear device and associated computational support system that leverages machine learning to interpret sensor data descriptive of one or more in-ear phenomena during subvocalization by the user. An electronic device can receive sensor data generated by at least one sensor at least partially positioned within an ear of a user, wherein the sensor data was generated by the at least one sensor concurrently with the user subvocalizing a subvocalized utterance. The electronic device can then process the sensor data with a machine-learned subvocalization interpretation model to generate an interpretation of the subvocalized utterance as an output of the machine-learned subvocalization interpretation model.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: February 14, 2023
    Assignee: Google LLC
    Inventors: Yaroslav Volovich, Ant Oztaskent, Blaise Aguera-Arcas
  • Patent number: 11580957
    Abstract: Disclosed are a method for training a speech recognition model, and a method and a system for speech recognition. The disclosure relates to the field of speech recognition and includes: inputting an audio training sample into the acoustic encoder to represent acoustic features of the audio training sample in an encoded way and determine an acoustic encoded state vector; inputting a preset vocabulary into the language predictor to determine a text prediction vector; inputting the text prediction vector into the text mapping layer to obtain a text output probability distribution; calculating a first loss function according to a target text sequence corresponding to the audio training sample and the text output probability distribution; inputting the text prediction vector and the acoustic encoded state vector into the joint network to calculate a second loss function, and performing iterative optimization according to the first loss function and the second loss function.
    Type: Grant
    Filed: June 9, 2022
    Date of Patent: February 14, 2023
    Assignee: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES
    Inventors: Jianhua Tao, Zhengkun Tian, Jiangyan Yi
  • Patent number: 11580980
    Abstract: A method and apparatus for generating a user intention understanding satisfaction evaluation model, a method and apparatus for evaluating a user intention understanding satisfaction, an electronic device and a storage medium are provided, relating to intelligent voice recognition and knowledge graphs.
    Type: Grant
    Filed: January 22, 2021
    Date of Patent: February 14, 2023
    Inventors: Yanyan Li, Jianguo Duan, Hui Xiong
  • Patent number: 11574253
    Abstract: A computer implemented method trains distributed sets of machine learning models by training each of the distributed machine learning models on different subsets of a set of training data, performing a first layer model synchronization operation in a first layer for each set of machine learning models, wherein each model synchronization operation in the first layer generates first updates for each of the machine learning models in each respective set, updating the machine learning models based on the first updates, and performing a second layer model synchronization operation in a second layer for first supersets of the machine learning models, wherein each model synchronization in the second layer generates second updates for updating each of the machine learning models in the first supersets, such that each machine learning model in a respective first superset is the same.
    Type: Grant
    Filed: August 1, 2019
    Date of Patent: February 7, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ivo José Garcia dos Santos, Mehdi Aghagolzadeh, Rihui Peng
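The layered synchronization can be sketched with scalar stand-ins for models: average within each small set first, then across the superset, so every model in the superset ends up identical. Plain averaging as the synchronization rule is an illustrative assumption:

```python
# Hedged sketch: a synchronization operation that replaces every model in
# a group with the group average, applied first per set, then per superset.

def sync(models):
    avg = sum(models) / len(models)
    return [avg] * len(models)

# first layer: synchronize within each of two sets of models
set_a = sync([1.0, 3.0])          # both models in set A become 2.0
set_b = sync([5.0, 7.0])          # both models in set B become 6.0

# second layer: synchronize across the superset covering both sets
superset = sync(set_a + set_b)    # all four models become identical
```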