Neural Network Patents (Class 704/232)
  • Patent number: 11574253
    Abstract: A computer-implemented method trains distributed sets of machine learning models by training each of the distributed machine learning models on different subsets of a set of training data. A first-layer model synchronization operation is performed in a first layer for each set of machine learning models, wherein each model synchronization operation in the first layer generates first updates for each of the machine learning models in each respective set, and the machine learning models are updated based on the first updates. A second-layer model synchronization operation is then performed in a second layer for first supersets of the machine learning models, wherein each model synchronization in the second layer generates second updates for updating each of the machine learning models in the first supersets, such that each machine learning model in a respective first superset is the same.
    Type: Grant
    Filed: August 1, 2019
    Date of Patent: February 7, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ivo José Garcia dos Santos, Mehdi Aghagolzadeh, Rihui Peng
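    As a rough illustration of the layered synchronization described in this abstract, the sketch below uses plain parameter averaging as the synchronization operation over toy one-element parameter arrays; the `sync_layer` helper, the group sizes, and averaging itself are illustrative assumptions, not details from the patent.

    ```python
    import numpy as np

    def sync_layer(models):
        """Average parameters across a group of models (one synchronization op)."""
        avg = np.mean(models, axis=0)
        return [avg.copy() for _ in models]

    # Hypothetical setup: 4 models, grouped into 2 first-layer sets,
    # then one second-layer superset spanning all of them.
    models = [np.array([float(i)]) for i in range(4)]  # stand-in "parameters"

    # First layer: synchronize within each set.
    models[0:2] = sync_layer(models[0:2])   # set A -> both become 0.5
    models[2:4] = sync_layer(models[2:4])   # set B -> both become 2.5

    # Second layer: synchronize across the superset so every model is identical.
    models = sync_layer(models)             # all become 1.5
    ```

    After the second-layer operation every model in the superset carries the same parameters, which is the stated end condition of the claim.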
  • Patent number: 11568235
    Abstract: Embodiments are provided for implementing mixed precision learning for neural networks by a processor. A neural network may be replicated into a plurality of replicated instances, and each of the plurality of replicated instances differs in the precision used for representing and determining parameters of the neural network. Data instances may be routed to one or more of the plurality of replicated instances for processing according to a data pre-processing operation.
    Type: Grant
    Filed: November 19, 2018
    Date of Patent: January 31, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Zehra Sura, Parijat Dube, Bishwaranjan Bhattacharjee, Tong Chen
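    A minimal sketch of the replicate-and-route idea, assuming NumPy dtypes stand in for the precision levels and an invented magnitude-based routing rule (the patent does not specify the routing criterion):

    ```python
    import numpy as np

    def make_replicas(weights):
        """Replicate a model's parameters at different numeric precisions."""
        return {"fp16": weights.astype(np.float16),
                "fp32": weights.astype(np.float32)}

    def route(x, replicas):
        """Toy pre-processing rule: send small-magnitude inputs to the
        low-precision replica, large-magnitude inputs to the high-precision one."""
        key = "fp16" if np.abs(x).max() < 10 else "fp32"
        w = replicas[key]
        return key, float(np.dot(w.astype(np.float64), np.asarray(x, dtype=np.float64)))

    replicas = make_replicas(np.array([0.5, -0.25]))
    print(route(np.array([1.0, 2.0]), replicas))    # routed to fp16
    print(route(np.array([100.0, 2.0]), replicas))  # routed to fp32
    ```

    The point of the sketch is the structure, one parameter set materialized at several precisions with a dispatch step in front, not the specific rule.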
  • Patent number: 11562734
    Abstract: The present disclosure relates to an automatic speech recognition system and a method thereof. The system includes a conformer encoder and a pair of ping-pong buffers. The encoder includes a plurality of encoder layers sequentially executed by one or more graphics processing units. At least one encoder layer includes a first feed forward module, a multi-head self-attention module, a convolution module, and a second feed forward module. The convolution module and the multi-head self-attention module are sandwiched between the first feed forward module and the second feed forward module. The four modules respectively include a plurality of encoder sublayers fused into one or more encoder kernels. The one or more encoder kernels respectively read from one of the pair of ping-pong buffers and write into the other of the pair of ping-pong buffers.
    Type: Grant
    Filed: January 4, 2021
    Date of Patent: January 24, 2023
    Assignee: KWAI INC.
    Inventors: Yongxiong Ren, Yang Liu, Heng Liu, Lingzhi Liu, Jie Li, Kaituo Xu, Xiaorui Wang
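    The ping-pong buffer pattern itself is simple to sketch: each kernel reads from one buffer and writes into the other, and the roles swap after every kernel, so no kernel ever reads and writes the same buffer. The lambda "kernels" below are stand-ins for the fused encoder kernels.

    ```python
    def run_kernels(kernels, x):
        """Execute kernels alternating between two buffers: each kernel
        reads from one buffer and writes into the other."""
        buffers = [x, None]        # buffer 0 holds the input, buffer 1 is scratch
        src = 0
        for kernel in kernels:
            dst = 1 - src
            buffers[dst] = kernel(buffers[src])  # read src, write dst
            src = dst                            # swap roles for the next kernel
        return buffers[src]

    # Hypothetical stand-ins for fused sublayer kernels.
    kernels = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
    assert run_kernels(kernels, 5) == 9   # ((5 + 1) * 2) - 3
    ```

    On a GPU this double-buffering lets each fused kernel stream its output without allocating a fresh buffer per sublayer.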
  • Patent number: 11557292
    Abstract: A system and method performs speech command verification to determine if audio data includes a representation of a speech command. A first neural network may process portions of the audio data before and after a representation of a wake trigger in the audio data. A second neural network may process the audio data using a recurrent neural network to determine if the audio data includes a representation of a wake trigger.
    Type: Grant
    Filed: December 9, 2020
    Date of Patent: January 17, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Joseph Wang, Michael J Rodehorst, Rajath Kumar Mysore Pradeep Kumar
  • Patent number: 11551707
    Abstract: Disclosed is a method for speech processing, an information device, and a computer program product. The method for speech processing, as implemented by a computer, includes: obtaining a mixed speech signal via a microphone, wherein the mixed speech signal includes a plurality of speech signals uttered by a plurality of unspecified speakers at the same time; generating a set of simulated speech signals from the mixed speech signal by using a Generative Adversarial Network (GAN), in order to simulate the plurality of speech signals; and determining the number of simulated speech signals in order to estimate the number of speakers in the surroundings and providing that number as an input to an information application.
    Type: Grant
    Filed: August 27, 2019
    Date of Patent: January 10, 2023
    Assignee: RELAJET TECH (TAIWAN) CO., LTD.
    Inventors: Yun-Shu Hsu, Po-Ju Chen
  • Patent number: 11514091
    Abstract: Methods and systems for processing records include extracting feature vectors from words in an unstructured portion of a record. The feature vectors are weighted based on their similarity to a topic vector from a structured portion of the record associated with the unstructured portion. The weighted feature vectors are classified using a machine learning model to determine respective probability vectors that assign a probability to each of a set of possible relations for each feature vector. Relations between entities are determined within the record based on the probability vectors. An action is performed responsive to the determined relations.
    Type: Grant
    Filed: January 7, 2019
    Date of Patent: November 29, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ke Wang, Pei Ni Liu, Wen Sun, Jing Min Xu, Songfang Huang, Yong Qin
  • Patent number: 11514318
    Abstract: Examples described herein provide a computer-implemented method that includes training, by one or more processing devices, a first neural network for classification based on training data in accordance with a first learning objective, the first neural network producing an intermediate feature function and a final feature function as outputs. The computer-implemented method further includes training, by the one or more processing devices, a second neural network for classification based on the intermediate feature function and the final feature function and further based at least in part on target task samples in accordance with a second learning objective. Training the second neural network includes computing maximal correlation functions of each of the intermediate feature function, the final feature function, and the target task samples.
    Type: Grant
    Filed: April 8, 2020
    Date of Patent: November 29, 2022
    Assignees: INTERNATIONAL BUSINESS MACHINES CORPORATION, MASSACHUSETTS INSTITUTE OF TECHNOLOGY
    Inventors: Joshua Ka-Wing Lee, Prasanna Sattigeri, Gregory Wornell
  • Patent number: 11508396
    Abstract: Systems and methods related to a voice-based system used to determine the severity of emotional distress within an audio recording of an individual are provided. In one non-limiting example, a system comprises a computing device that is configured to receive an audio sample that includes an utterance of a user. Feature extraction is performed on the audio sample to extract a plurality of acoustic emotion features using a base model. Emotion level predictions are generated for an emotion type based at least in part on the acoustic emotion features provided to an emotion-specific model. An emotion classification for the audio sample is determined based on the emotion level predictions. The emotion classification comprises the emotion type and a level associated with the emotion type.
    Type: Grant
    Filed: December 14, 2021
    Date of Patent: November 22, 2022
    Assignee: TQINTELLIGENCE, INC.
    Inventors: Yared Alemu, Desmond Caulley, Ashutosh A. Joshi
  • Patent number: 11507822
    Abstract: Systems and methods to generate artificial intelligence models with synthetic data are disclosed. An example system includes a deep neural network (DNN) generator to generate a first DNN model using first real data. The example system includes a synthetic data generator to generate first synthetic data from the first real data, the first synthetic data to be used by the DNN generator to generate a second DNN model. The example system includes an evaluator to evaluate performance of the first and second DNN models to determine whether to generate second synthetic data. The example system includes a synthetic data aggregator to aggregate third synthetic data and fourth synthetic data from a plurality of sites to form a synthetic data set. The example system includes an artificial intelligence model deployment processor to deploy an artificial intelligence model trained and tested using the synthetic data set.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: November 22, 2022
    Assignee: General Electric Company
    Inventors: Ravi Soni, Min Zhang, Gopal Avinash
  • Patent number: 11495235
    Abstract: According to one embodiment, a system for creating a speaker model includes one or more processors. The processors change a part of the network parameters from an input layer to a predetermined intermediate layer based on a plurality of patterns and input a piece of speech into each of the neural networks so as to obtain a plurality of outputs from the intermediate layer. The part of the network parameters of each of the neural networks is changed based on one of the plurality of patterns. The processors create a speaker model with respect to one or more words detected from the speech based on the outputs.
    Type: Grant
    Filed: March 8, 2019
    Date of Patent: November 8, 2022
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Hiroshi Fujimura
  • Patent number: 11462208
    Abstract: Some techniques described herein determine a correction model for a dialog system, such that the correction model corrects output from an automatic speech recognition (ASR) subsystem in the dialog system. A method described herein includes accessing training data. A first tuple of the training data includes an utterance, where the utterance is a textual representation of speech. The method further includes using an ASR subsystem of a dialog system to convert the utterance to an output utterance. The method further includes storing the output utterance in corrective training data that is based on the training data. The method further includes training a correction model based on the corrective training data, such that the correction model is configured to correct output from the ASR subsystem during operation of the dialog system.
    Type: Grant
    Filed: August 13, 2020
    Date of Patent: October 4, 2022
    Assignee: ORACLE INTERNATIONAL CORPORATION
    Inventors: Thanh Long Duong, Mark Edward Johnson
  • Patent number: 11461642
    Abstract: An apparatus for processing a signal for input to a neural network, the apparatus configured to: receive a signal comprising a plurality of samples of an analog signal over time; determine at least one frame comprising a group of consecutive samples of the signal, wherein the or each frame includes a first number of samples; for each frame, determine a set of correlation values comprising a second number of correlation values, the second number less than the first number, each correlation value of the set of correlation values based on an autocorrelation of the frame at a plurality of different time lags; provide an output based on the set of correlation values corresponding to the or each of the frames for a neural network for one or more of classification of the analog signal by the neural network and training the neural network based on a predetermined classification.
    Type: Grant
    Filed: September 11, 2019
    Date of Patent: October 4, 2022
    Assignee: NXP B.V.
    Inventors: Jose De Jesus Pineda de Gyvez, Hamed Fatemi, Emad Ayman Taleb Ibrahim
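    A sketch of the dimensionality reduction this abstract describes: a frame of many samples is reduced to a much smaller set of normalized autocorrelation values at chosen time lags. The normalization by frame energy and the particular lags are illustrative assumptions.

    ```python
    import numpy as np

    def autocorr_features(frame, lags):
        """Reduce a frame of N samples to len(lags) correlation values,
        each a normalized autocorrelation of the frame at a given time lag."""
        frame = frame - frame.mean()
        energy = np.dot(frame, frame) or 1.0
        return np.array([np.dot(frame[:-lag], frame[lag:]) / energy for lag in lags])

    # 64-sample frame of a periodic signal, compressed to 4 correlation values.
    t = np.arange(64)
    frame = np.sin(2 * np.pi * t / 16)
    feats = autocorr_features(frame, lags=[4, 8, 16, 32])
    print(feats.round(2))  # largest positive correlation at the 16-sample period
    ```

    The second number in the claim (correlation values per frame) being smaller than the first (samples per frame) is what makes this a compact input representation for the downstream neural network.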
  • Patent number: 11450310
    Abstract: Systems and methods for spoken language understanding are described. Embodiments of the systems and methods receive audio data for a spoken language expression, encode the audio data using a multi-stage encoder comprising a basic encoder and a sequential encoder, wherein the basic encoder is trained to generate character features during a first training phase and the sequential encoder is trained to generate token features during a second training phase, and decode the token features to generate semantic information representing the spoken language expression.
    Type: Grant
    Filed: August 10, 2020
    Date of Patent: September 20, 2022
    Assignee: ADOBE INC.
    Inventors: Nikita Kapoor, Jaya Dodeja, Nikaash Puri
  • Patent number: 11437050
    Abstract: Techniques are described for coding audio signals. For example, using a neural network, a residual signal is generated for a sample of an audio signal based on inputs to the neural network. The residual signal is configured to excite a long-term prediction filter and/or a short-term prediction filter. Using the long-term prediction filter and/or the short-term prediction filter, a sample of a reconstructed audio signal is determined. The sample of the reconstructed audio signal is determined based on the residual signal generated using the neural network for the sample of the audio signal.
    Type: Grant
    Filed: December 10, 2019
    Date of Patent: September 6, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: Zisis Iason Skordilis, Vivek Rajendran, Guillaume Konrad Sautière, Daniel Jared Sinder
  • Patent number: 11410656
    Abstract: The system identifies one or more entities or content items among a plurality of stored information. The system generates an audio file based on a first text string that represents the entity or content item. Based on the first text string and at least one speech criterion, the system generates, using a speech-to-text module, a second text string from the audio file. The system then compares the text strings and stores the second text string if it is not identical to the first text string. The system generates metadata that includes results from text-speech-text conversions to forecast possible misidentifications when responding to voice queries during search operations. The metadata includes alternative representations of the entity.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: August 9, 2022
    Assignee: ROVI GUIDES, INC.
    Inventors: Ankur Aher, Indranil Coomar Doss, Aashish Goyal, Aman Puniyani, Kandala Reddy, Mithun Umesh
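    The round-trip comparison can be sketched as follows, with hypothetical `tts` and `stt` callables standing in for real text-to-speech and speech-to-text modules:

    ```python
    def build_alias_metadata(entities, tts, stt):
        """For each entity name, run a text -> speech -> text round trip and
        record the transcription as an alternative representation when it
        differs from the original (tts and stt are hypothetical callables)."""
        metadata = {}
        for name in entities:
            heard = stt(tts(name))
            if heard != name:
                metadata.setdefault(name, []).append(heard)
        return metadata

    # Stand-in converters simulating one misrecognized title.
    tts = lambda text: f"<audio:{text}>"
    stt = lambda audio: {"<audio:Amelie>": "Emily"}.get(audio, audio[7:-1])
    print(build_alias_metadata(["Amelie", "Frozen"], tts, stt))
    # {'Amelie': ['Emily']}
    ```

    At query time, matching a voice query against these stored aliases is what lets the system anticipate the misidentification instead of failing the search.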
  • Patent number: 11397784
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving user-specific content, the user-specific content being associated with a user of one or more computer-implemented services, processing the user-specific content using one or more parsers to identify one or more entities and one or more relationships between entities, a parser being specific to a schema, and the one or more entities and the one or more relationships between entities being identified based on the schema, providing one or more user-specific knowledge graphs, a user-specific knowledge graph being specific to the user and including nodes and edges between nodes to define relationships between entities based on the schema, and storing the one or more user-specific knowledge graphs.
    Type: Grant
    Filed: August 14, 2019
    Date of Patent: July 26, 2022
    Assignee: GOOGLE LLC
    Inventors: Pranav Khaitan, Shobha Diwakar
  • Patent number: 11398220
    Abstract: A speech processing method executes at least one of first speech processing and second speech processing. The first speech processing identifies a language based on speech, performs signal processing according to the identified language, and transmits the speech on which the signal processing has been performed, to a far-end-side. The second speech processing identifies a language based on speech, receives the speech from the far-end-side, and performs signal processing on the received speech, according to the identified language.
    Type: Grant
    Filed: August 28, 2019
    Date of Patent: July 26, 2022
    Assignee: Yamaha Corporation
    Inventor: Mikio Muramatsu
  • Patent number: 11393457
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex linear projection are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The method further includes generating frequency domain data using the audio data. The method further includes processing the frequency domain data using complex linear projection. The method further includes providing the processed frequency domain data to a neural network trained as an acoustic model. The method further includes generating a transcription for the utterance that is determined based at least on output that the neural network provides in response to receiving the processed frequency domain data.
    Type: Grant
    Filed: May 20, 2020
    Date of Patent: July 19, 2022
    Assignee: Google LLC
    Inventors: Samuel Bengio, Mirko Visontai, Christopher Walter George Thornton, Tara N. Sainath, Ehsan Variani, Izhak Shafran, Michiel A. U. Bacchiani
  • Patent number: 11380315
    Abstract: One embodiment of the present invention sets forth a technique for analyzing a transcription of a recording. The technique includes generating features representing transcriptions produced by multiple automatic speech recognition (ASR) engines from voice activity in the recording and a best transcription of the recording produced by an ensemble model from the transcriptions. The technique also includes applying a machine learning model to the features to produce a score representing an accuracy of the best transcription. The technique further includes storing the score in association with the best transcription.
    Type: Grant
    Filed: March 9, 2019
    Date of Patent: July 5, 2022
    Assignee: CISCO TECHNOLOGY, INC.
    Inventors: Ahmad Abdulkader, Mohamed Gamal Mohamed Mahmoud
  • Patent number: 11380312
    Abstract: A system configured to improve wakeword detection. The system may selectively rectify (e.g., attenuate) a portion of an audio signal based on energy statistics corresponding to a keyword (e.g., wakeword). For example, a device may perform echo cancellation to generate isolated audio data, may use the energy statistics to calculate signal quality metric values for a plurality of frequency bands of the isolated audio data, and may select a fixed number of frequency bands (e.g., 5-10%) associated with the lowest signal quality metric values. To detect a specific keyword, the system determines a frequency-dependent threshold corresponding to an expected energy value at each frequency band. During runtime, the device determines signal quality metric values by subtracting residual music from the expected energy values. Thus, the device attenuates only a portion of the total number of frequency bands, namely those that include more energy than expected based on the energy statistics of the wakeword.
    Type: Grant
    Filed: June 20, 2019
    Date of Patent: July 5, 2022
    Assignee: Amazon Technologies, Inc.
    Inventor: Mohamed Mansour
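    A toy version of the selective rectification step: compute a signal quality metric per band from expected energy statistics, pick a fixed number of worst bands, and attenuate only those. The metric definition and the gain value here are simplifying assumptions, not the patent's formulas.

    ```python
    import numpy as np

    def rectify_bands(band_energy, expected_energy, n_attenuate=2, gain=0.1):
        """Attenuate only the fixed number of frequency bands whose signal
        quality metric (expected minus observed energy) is lowest, i.e. the
        bands with the most energy above what the keyword statistics predict."""
        metric = expected_energy - band_energy          # lower = more residual
        worst = np.argsort(metric)[:n_attenuate]        # fixed number of bands
        out = band_energy.copy()
        out[worst] *= gain
        return out, sorted(worst.tolist())

    expected = np.array([1.0, 1.0, 1.0, 1.0, 1.0])
    observed = np.array([0.9, 3.0, 1.1, 5.0, 0.5])      # bands 1 and 3 too hot
    rectified, attenuated = rectify_bands(observed, expected)
    print(attenuated)  # [1, 3]
    ```

    Because only a small, fixed fraction of bands is touched, most of the wakeword's spectrum reaches the detector unchanged.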
  • Patent number: 11380348
    Abstract: A method for correcting infant crying identification includes the following steps: a detecting step provides an audio unit to detect a sound around an infant to generate a plurality of audio samples. A converting step provides a processing unit to convert the audio samples to generate a plurality of audio spectrograms. An extracting step provides a common model to extract the audio spectrograms to generate a plurality of infant crying features. An incremental training step provides an incremental model to train the infant crying features to generate an identification result. A judging step provides the processing unit to judge whether the identification result is correct according to a real result of the infant. When the identification result is different from the real result, an incorrect result is generated. A correcting step provides the processing unit to correct the incremental model according to the incorrect result.
    Type: Grant
    Filed: August 27, 2020
    Date of Patent: July 5, 2022
    Assignee: NATIONAL YUNLIN UNIVERSITY OF SCIENCE AND TECHNOLOGY
    Inventors: Chuan-Yu Chang, Jun-Ying Li
  • Patent number: 11373653
    Abstract: A method and system for detecting speech using close sensor applications are described, according to some embodiments. In some embodiments, a close microphone is applied to detect sounds with higher muscle or bone transmission components. In some embodiments, a close camera is applied that collects visual information and motion that is correlated with the potential phonemes for such positions and motion. In some embodiments, myography is performed to detect muscle movement. In an earbud form factor embodiment, processing of different channels of close information is performed to improve the accuracy of the recognition.
    Type: Grant
    Filed: January 17, 2020
    Date of Patent: June 28, 2022
    Inventor: Joseph Alan Epstein
  • Patent number: 11361768
    Abstract: A method includes receiving a spoken utterance that includes a plurality of words, and generating, using a neural network-based utterance classifier comprising a stack of multiple Long-Short Term Memory (LSTM) layers, a respective textual representation for each word of the plurality of words of the spoken utterance. The neural network-based utterance classifier is trained on negative training examples of spoken utterances not directed toward an automated assistant server. The method further includes determining, using the respective textual representation generated for each word of the plurality of words of the spoken utterance, whether the spoken utterance is directed toward the automated assistant server or not, and, when the spoken utterance is directed toward the automated assistant server, generating instructions that cause the automated assistant server to generate a response to the spoken utterance.
    Type: Grant
    Filed: July 21, 2020
    Date of Patent: June 14, 2022
    Assignee: Google LLC
    Inventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
  • Patent number: 11354459
    Abstract: A synthetic world interface may be used to model digital environments, sensors, and motions for the evaluation, development, and improvement of computer vision and speech algorithms. A synthetic data cloud service with a library of sensor primitives, motion generators, and environments with procedural and game-like capabilities, facilitates engineering design for a manufactural solution that has computer vision and speech capabilities. In some embodiments, a sensor platform simulator operates with a motion orchestrator, an environment orchestrator, an experiment generator, and an experiment runner to test various candidate hardware configurations and computer vision and speech algorithms in a virtual environment, advantageously speeding development and reducing cost. Thus, examples disclosed herein may relate to virtual reality (VR) or mixed reality (MR) implementations.
    Type: Grant
    Filed: September 21, 2018
    Date of Patent: June 7, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Ebstyne, Pedro Urbina Escos, Yuri Pekelny, Jonathan Chi Hang Chan, Emanuel Shalev, Alex Kipman, Mark Flick
  • Patent number: 11341242
    Abstract: Disclosed is a computer-implemented method for malware detection that analyzes a file on a per-packet basis. The method receives a packet of one or more packets associated with a file, converts the binary content associated with the packet into a digital representation, and tokenizes the plain-text content associated with the packet. The method extracts one or more n-gram features, an entropy feature, and a domain feature from the converted content of the packet and applies a trained machine learning model to the one or more features extracted from the packet. The output of the machine learning model is a probability of maliciousness associated with the received packet. If the probability of maliciousness is above a threshold value, the method determines that the file associated with the received packet is malicious.
    Type: Grant
    Filed: October 12, 2020
    Date of Patent: May 24, 2022
    Assignee: Zscaler, Inc.
    Inventors: Huihsin Tseng, Hao Xu, Jian L. Zhen
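    Two of the per-packet features named above, byte n-grams and Shannon entropy, can be sketched directly (the domain feature and the trained model are omitted; high entropy often signals packed or encrypted payloads):

    ```python
    import math
    from collections import Counter

    def packet_features(payload: bytes, n=2):
        """Extract toy per-packet features: byte-level n-gram counts and the
        Shannon entropy (in bits per byte) of the payload."""
        counts = Counter(payload)
        total = len(payload)
        entropy = -sum(c / total * math.log2(c / total) for c in counts.values())
        ngrams = Counter(payload[i:i + n] for i in range(total - n + 1))
        return {"entropy": entropy, "ngrams": ngrams}

    feats = packet_features(b"abab")
    print(round(feats["entropy"], 2))   # 1.0 (two symbols, equally likely)
    print(feats["ngrams"][b"ab"])       # 2
    ```

    A real pipeline would vectorize these counts and feed them, with the other features, to the trained classifier that emits the maliciousness probability.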
  • Patent number: 11343632
    Abstract: The invention relates to a method for broadcasting a spatialized audio stream to terminals of spectators attending a sports event. The method comprises acquiring a plurality of audio streams constituting a soundscape. The soundscape is analyzed by a server in order to spatialize the audio streams and play them back on the terminals, depending both on the localization of the audio streams and on the positions of the spectators.
    Type: Grant
    Filed: September 29, 2020
    Date of Patent: May 24, 2022
    Assignee: INSTITUT MINES TELECOM
    Inventors: Raphael Blouet, Slim Essid
  • Patent number: 11315550
    Abstract: A speaker recognition device according to the present disclosure includes: an acoustic feature calculator that calculates, from utterance data indicating a voice of an obtained utterance, acoustic feature of the voice of the utterance; a statistic calculator that calculates an utterance data statistic from the calculated acoustic feature; a speaker feature extractor that extracts speaker feature of a speaker of the utterance data from the calculated utterance data statistic using a deep neural network (DNN); a similarity calculator that calculates a similarity between the extracted speaker feature and pre-stored speaker feature of at least one registered speaker; and a speaker recognizer that recognizes the speaker of the utterance data based on the calculated similarity.
    Type: Grant
    Filed: November 13, 2019
    Date of Patent: April 26, 2022
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventors: Kousuke Itakura, Ko Mizuno, Misaki Doi
  • Patent number: 11302303
    Abstract: A method and device for training an acoustic model are provided. The method comprises determining a plurality of tasks for training an acoustic model, obtaining resource occupancies of nodes participating in the training of the acoustic model, and distributing the tasks to the nodes according to the resource occupancies of the nodes and complexities of the tasks. By using computational resources distributed at multiple nodes, tasks for training an acoustic model are performed in parallel in a distributed manner, so as to improve training efficiency.
    Type: Grant
    Filed: September 13, 2019
    Date of Patent: April 12, 2022
    Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
    Inventors: Yunfeng Li, Qingchang Hao, Yutao Gai, Chenxi Sun, Zhiping Zhou
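    One plausible reading of distributing tasks "according to the resource occupancies of the nodes and complexities of the tasks" is a greedy longest-task-first assignment; the patent does not specify this policy, so treat the sketch below as an assumption:

    ```python
    def distribute(tasks, occupancy):
        """Greedily assign the most complex tasks first, each to the node whose
        projected load (current occupancy + assigned complexity) is lowest."""
        load = dict(occupancy)
        assignment = {}
        for name, complexity in sorted(tasks.items(), key=lambda t: -t[1]):
            node = min(load, key=load.get)
            assignment[name] = node
            load[node] += complexity
        return assignment

    # Hypothetical training tasks (name -> complexity) and node occupancies.
    tasks = {"lstm": 8, "fbank": 2, "ctc": 5}
    occupancy = {"node-a": 0.5, "node-b": 0.0}
    print(distribute(tasks, occupancy))
    # {'lstm': 'node-b', 'ctc': 'node-a', 'fbank': 'node-a'}
    ```

    Balancing projected load this way keeps all nodes busy, which is the efficiency gain the abstract claims for the distributed training.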
  • Patent number: 11295203
    Abstract: Neuron placement in a neuromorphic system to minimize cumulative delivery delay is provided. In some embodiments, a neural network description describing a plurality of neurons is read. A relative delivery delay associated with each of the plurality of neurons is determined. An ordering of the plurality of neurons is determined to optimize cumulative delivery delay over the plurality of neurons. An optimized neural network description based on the ordering of the plurality of neurons is written.
    Type: Grant
    Filed: July 27, 2016
    Date of Patent: April 5, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Rodrigo Alvarez-Icaza, Pallab Datta, Jeffrey A. Kusnitz
  • Patent number: 11289092
    Abstract: A method, system and computer program product for editing a text using speech recognition includes receiving, by a computer, a first voice input from a user comprising a first target word. The computer identifies instances of the first target word within the text and assigns a first numerical indicator to each instance of the first target word within the text. A selection is received from the user including the first numerical indicator corresponding to a starting point of a selection area. The computer receives a second voice input from the user including a second target word, identifies instances of the second target word within the text, assigns a second numerical indicator to each instance of the second target word, and receives a selection from the user including the second numerical indicator corresponding to an ending point of the selection area.
    Type: Grant
    Filed: September 25, 2019
    Date of Patent: March 29, 2022
    Assignee: International Business Machines Corporation
    Inventors: JunXing Yang, XueJun Zhong, Wei Sun, ZhiXia Wang
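    The numbering step is easy to sketch: every instance of the target word gets a numerical indicator the user can then speak to pick a selection endpoint. Tokenizing by whitespace is a simplifying assumption.

    ```python
    def number_instances(text, target):
        """Assign a numerical indicator to each instance of the target word,
        returning (indicator, word-index) pairs in reading order."""
        words = text.lower().split()
        return [(i + 1, pos) for i, pos in
                enumerate(p for p, w in enumerate(words) if w == target.lower())]

    text = "the cat saw the dog chase the ball"
    print(number_instances(text, "the"))  # [(1, 0), (2, 3), (3, 6)]
    ```

    A selection area then spans from the word picked via the first spoken indicator to the word picked via the second (e.g. indicators 1 and 2 select words 0 through 3).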
  • Patent number: 11282535
    Abstract: Disclosed is an electronic apparatus. The electronic apparatus includes a storage for storing a plurality of filters trained in a plurality of convolutional neural networks (CNNs) respectively and a processor configured to acquire a first spectrogram corresponding to a damaged audio signal, input the first spectrogram to a CNN corresponding to each frequency band to apply the plurality of filters trained in the plurality of CNNs respectively, acquire a second spectrogram by merging output values of the CNNs to which the plurality of filters are applied, and acquire an audio signal reconstructed based on the second spectrogram.
    Type: Grant
    Filed: July 19, 2018
    Date of Patent: March 22, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ki-Hyun Choo, Anton Porov, Jong-Hoon Jeong, Ho-Sang Sung, Eun-Mi Oh, Jong-Youb Ryu
  • Patent number: 11263516
    Abstract: Methods and systems for training a neural network include identifying weights in a neural network between a final hidden neuron layer and an output neuron layer that correspond to state matches between a neuron of the final hidden neuron layer and a respective neuron of the output neuron layer. The identified weights are initialized to a predetermined non-zero value, and the other weights between the final hidden neuron layer and the output neuron layer are initialized to zero. The neural network is trained based on a training corpus after initialization.
    Type: Grant
    Filed: August 2, 2016
    Date of Patent: March 1, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Gakuto Kurata
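    The initialization scheme can be sketched directly; which (hidden, output) pairs count as state matches is task-specific, so the `matches` list below is hypothetical:

    ```python
    import numpy as np

    def init_output_weights(hidden_size, output_size, matches, value=1.0):
        """Initialize final-hidden-to-output weights: positions listed in
        `matches` (hidden neuron, output neuron state matches) get a fixed
        non-zero value; every other weight starts at zero."""
        w = np.zeros((hidden_size, output_size))
        for h, o in matches:
            w[h, o] = value
        return w

    # Hypothetical state matches between 4 hidden neurons and 3 output neurons.
    w = init_output_weights(4, 3, matches=[(0, 0), (2, 1), (3, 2)])
    print(int(np.count_nonzero(w)))  # 3
    ```

    Training then proceeds normally from this sparse starting point rather than from a random one.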
  • Patent number: 11256973
    Abstract: A neural network embodiment comprises an input layer, an output layer and a filter layer. Each unit of the filter layer receives a filter layer input from a single preceding unit via a respective filter layer input connection. Each filter layer input connection is coupled to a different single preceding unit. The filter layer incentivizes the neural network to learn to produce a target output from the output layer for a given input to the input layer while simultaneously learning weights for each filter layer input connection. The weights learned cause the filter layer to reduce a number of filter layer units that pass respective filter layer inputs as non-zero values. When applied as an initial internal layer between an input layer and an output layer, the filter layer incentivizes the neural network to learn which neural network input features to discard to produce the target output.
    Type: Grant
    Filed: February 5, 2018
    Date of Patent: February 22, 2022
    Assignee: Nuance Communications, Inc.
    Inventors: Nasr Madi, Neil D. Barrett
  • Patent number: 11257483
    Abstract: Spoken language understanding techniques include training a dynamic neural network mask relative to a static neural network using only post-deployment training data such that the mask zeroes out some of the weights of the static neural network and allows some other weights to pass through and applying a dynamic neural network corresponding to the masked static neural network to input queries to identify outputs for the queries.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: February 22, 2022
    Assignee: Intel Corporation
    Inventors: Krzysztof Czarnowski, Munir Georges
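The masking operation itself reduces to an elementwise product; a minimal sketch, assuming a binary mask over a dense weight matrix (how the mask is trained on post-deployment data is not shown here).

```python
import numpy as np

def apply_mask(static_weights, mask):
    """Elementwise binary mask: zeroes some static weights and lets the
    others pass through unchanged, yielding the dynamic network."""
    return static_weights * mask

W = np.array([[0.2, -0.7],
              [1.1,  0.4]])
M = np.array([[1, 0],
              [0, 1]])            # mask learned from post-deployment data
W_dyn = apply_mask(W, M)
```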
  • Patent number: 11252152
    Abstract: An online system authenticates a user through a voiceprint biometric verification process. When a user needs to be authenticated, the online system generates and provides a random phrase to the user. The online system receives an audio recording of the randomly generated phrase and retrieves a previously trained voiceprint model for the user. The online system analyzes the audio recording by applying the voiceprint model to determine whether the audio recording satisfies a first criterion of whether the voice in the audio recording belongs to the user and a second criterion of whether the audio recording includes a vocalization of the randomly generated phrase. If the audio recording satisfies both criteria, the online system authenticates the user. Therefore, the user can be provided access to a new communication session in response to being authenticated.
    Type: Grant
    Filed: June 3, 2020
    Date of Patent: February 15, 2022
    Assignee: salesforce.com, inc.
    Inventor: Eugene Lew
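The two-criteria decision can be sketched in a few lines; a minimal sketch, assuming the voiceprint model and phrase verifier each return a score in [0, 1] (the thresholds and names are illustrative assumptions).

```python
def authenticate(voice_score, phrase_score,
                 voice_thresh=0.8, phrase_thresh=0.8):
    """Grant access only when both criteria hold: the voice matches the
    enrolled voiceprint AND the recording vocalizes the random phrase."""
    return voice_score >= voice_thresh and phrase_score >= phrase_thresh
```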
  • Patent number: 11244668
    Abstract: A method for generating speech animation from an audio signal includes: receiving the audio signal; transforming the received audio signal into frequency-domain audio features; performing neural-network processing on the frequency-domain audio features to recognize phonemes, wherein the neural-network processing is performed using a neural network trained with a phoneme dataset comprising audio signals with corresponding ground-truth phoneme labels; and generating the speech animation from the recognized phonemes.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: February 8, 2022
    Assignee: TCL RESEARCH AMERICA INC.
    Inventors: Zixiao Yu, Haohong Wang
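The first step, turning the time-domain signal into frequency-domain features, can be sketched with a framed FFT; a minimal sketch, assuming per-frame magnitude spectra as the features (frame length and hop are illustrative choices, not from the patent).

```python
import numpy as np

def frequency_features(signal, frame_len=256, hop=128):
    """Transform a time-domain audio signal into per-frame magnitude
    spectra: the frequency-domain features fed to the phoneme network."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames)

sig = np.sin(2 * np.pi * 440 * np.arange(1024) / 16000)  # 440 Hz tone
feats = frequency_features(sig)   # one feature vector per frame
```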
  • Patent number: 11244671
    Abstract: A model training method and apparatus is disclosed, where the model training method acquires first output data of a student model for first input data and second output data of a teacher model for second input data and trains the student model such that the first output data and the second output data are not distinguished from each other. The student model and the teacher model have different structures.
    Type: Grant
    Filed: August 23, 2019
    Date of Patent: February 8, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hogyeong Kim, Hyohyeong Kang, Hwidong Na, Hoshik Lee
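Training the student so its outputs cannot be told apart from the teacher's requires a matching criterion; a minimal sketch, assuming KL divergence between the two output distributions as that criterion (the patent does not specify the loss, so this is an illustrative stand-in).

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def matching_loss(student_logits, teacher_logits):
    """KL divergence from the teacher's output distribution to the
    student's; minimizing it drives the two outputs to be
    indistinguishable, even though the models differ in structure."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return float(np.sum(p * np.log(p / q)))
```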
  • Patent number: 11232141
    Abstract: A method for processing an electronic document comprising text is disclosed. The method comprises: splitting the text into at least one sentence, and for each said sentence: associating each word of the sentence with a word-vector; representing the sentence by a sentence-vector, wherein obtaining the sentence-vector comprises computing a weighted average of all word-vectors associated with the sentence; if it is determined that the sentence-vector is associated with a tag in a data set of sentence-vectors associated with tags, obtaining the tag from the data set; otherwise, obtaining a tag for the sentence-vector using a classification algorithm; processing the sentence if the tag obtained for the sentence is associated with a predetermined label.
    Type: Grant
    Filed: September 13, 2018
    Date of Patent: January 25, 2022
    Inventors: Youness Mansar, Sira Ferradans, Jacopo Staiano
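The sentence-vector and tag-lookup steps can be sketched directly; a minimal sketch, assuming exact-match lookup against the stored vectors and an arbitrary classifier as the fallback (all names are illustrative).

```python
import numpy as np

def sentence_vector(word_vectors, weights):
    """Represent a sentence as the weighted average of its word-vectors."""
    w = np.asarray(weights, dtype=float)
    return (np.asarray(word_vectors) * w[:, None]).sum(axis=0) / w.sum()

def tag_for(vec, tagged_vectors, classify, tol=1e-6):
    """Look the sentence-vector up in the tagged data set; fall back to
    the classification algorithm when no stored vector matches."""
    for stored, tag in tagged_vectors:
        if np.linalg.norm(vec - stored) < tol:
            return tag
    return classify(vec)

vecs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
sv = sentence_vector(vecs, [3.0, 1.0])   # weighted average of word-vectors
```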
  • Patent number: 11227611
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user, evaluating the speech data or a transcription of the candidate hotword using one or more predetermined criteria, generating a hotword suitability score for the candidate hotword based on that evaluation, and providing a representation of the hotword suitability score for display to the user.
    Type: Grant
    Filed: June 3, 2020
    Date of Patent: January 18, 2022
    Assignee: Google LLC
    Inventors: Andrew E. Rubin, Johan Schalkwyk, Maria Carolin Parada San Martin
  • Patent number: 11227626
    Abstract: An audio response system can generate multimodal messages that can be dynamically updated on a viewer's client device based on the type of audio response detected. The audio responses can include keywords or continuum-based signals (e.g., levels of wind noise). A machine learning scheme can be trained to output classification data from the audio response data for content selection and dynamic display updates.
    Type: Grant
    Filed: May 21, 2019
    Date of Patent: January 18, 2022
    Assignee: Snap Inc.
    Inventors: Gurunandan Krishnan Gorumkonda, Shree K. Nayar
  • Patent number: 11222641
    Abstract: A speaker recognition device includes: a feature calculator that calculates two or more acoustic features of a voice of an utterance obtained; a similarity calculator that calculates two or more similarities, each being a similarity between one of one or more speaker-specific features of a target speaker for recognition and one of the two or more acoustic features; a combination unit that combines the two or more similarities to obtain a combined value; and a determiner that determines whether a speaker of the utterance is the target speaker based on the combined value. Here, (i) at least two of the two or more acoustic features have different properties, (ii) at least two of the two or more similarities have different properties, or (iii) at least two of the two or more acoustic features have different properties and at least two of the two or more similarities have different properties.
    Type: Grant
    Filed: September 19, 2019
    Date of Patent: January 11, 2022
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventor: Kousuke Itakura
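The combine-and-threshold decision can be sketched as follows; a minimal sketch, assuming cosine similarity per feature and the mean as the combination rule (the patent leaves both choices open, so these are illustrative assumptions).

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_target_speaker(acoustic_feats, speaker_feats, threshold=0.7):
    """Compute one similarity per acoustic feature against the enrolled
    speaker-specific features, combine them (here: mean), threshold."""
    sims = [cosine(a, s) for a, s in zip(acoustic_feats, speaker_feats)]
    combined = sum(sims) / len(sims)
    return combined >= threshold, combined

enrolled = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]   # two feature types
observed = [np.array([0.9, 0.1]), np.array([0.1, 0.8])]
```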
  • Patent number: 11216069
    Abstract: Systems and methods for using neuromuscular information to improve speech recognition. The system includes a plurality of neuromuscular sensors arranged on one or more wearable devices and configured to continuously record a plurality of neuromuscular signals from a user, at least one storage device configured to store one or more trained statistical models for determining text based on audio input and the plurality of neuromuscular signals, at least one input interface configured to receive the audio input, and at least one computer processor programmed to obtain the audio input and the plurality of neuromuscular signals, provide as input to the one or more trained statistical models, the audio input and the plurality of neuromuscular signals or signals derived from the plurality of neuromuscular signals, and determine based, at least in part, on an output of the one or more trained statistical models, the text.
    Type: Grant
    Filed: May 8, 2018
    Date of Patent: January 4, 2022
    Assignee: Facebook Technologies, LLC
    Inventors: Adam Berenzweig, Patrick Kaifosh, Alan Huan Du, Jeffrey Scott Seely
  • Patent number: 11217229
    Abstract: A speech recognition method and apparatus, a computer device, and an electronic device for recognizing speech are provided. The method includes receiving an audio signal obtained by a microphone array; performing a beamforming processing on the audio signal in a plurality of target directions to obtain a plurality of beam signals; performing a speech recognition on each of the plurality of beam signals to obtain a plurality of speech recognition results corresponding to the plurality of beam signals; and determining a speech recognition result of the audio signal based on the plurality of speech recognition results of the plurality of beam signals.
    Type: Grant
    Filed: July 6, 2020
    Date of Patent: January 4, 2022
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LTD
    Inventors: Yi Gao, Ji Meng Zheng, Meng Yu, Min Luo
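The beam-then-recognize loop can be sketched with delay-and-sum beamforming; a minimal sketch, assuming integer sample delays per direction and a `recognize` callable returning `(text, confidence)` — all assumptions, since the patent specifies neither the beamformer nor the selection rule.

```python
import numpy as np

def recognize_with_beams(mic_signals, steering_delays, recognize):
    """Delay-and-sum beamform toward each target direction, run the
    recognizer on every beam signal, and keep the highest-confidence
    recognition result as the result for the audio signal."""
    best_text, best_conf = None, -np.inf
    for delays in steering_delays:          # one delay set per direction
        beam = sum(np.roll(sig, d) for sig, d in zip(mic_signals, delays))
        text, conf = recognize(beam)
        if conf > best_conf:
            best_text, best_conf = text, conf
    return best_text, best_conf
```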
  • Patent number: 11205420
    Abstract: A system and method performs wakeword detection using a neural network model that includes a recurrent neural network (RNN) for processing variable-length wakewords. To prevent the model from being influenced by non-wakeword speech, multiple instances of the model are created to process audio data, and each instance is configured to use weights determined by training data. The model may instead or in addition be used to process the audio data only when a likelihood that the audio data corresponds to the wakeword is greater than a threshold. The model may process the audio data as represented by groups of acoustic feature vectors; computations for feature vectors common to different groups may be re-used.
    Type: Grant
    Filed: June 10, 2019
    Date of Patent: December 21, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Gengshen Fu, Thibaud Senechal, Shiv Naga Prasad Vitaladevuni, Michael J. Rodehorst, Varun K. Nagaraja
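The re-use of computations for feature vectors shared between overlapping groups amounts to memoization; a minimal sketch of just that idea (the RNN itself is not modeled, and all names are illustrative).

```python
def cached_scores(feature_groups, score_vector, cache=None):
    """Score groups of acoustic feature vectors, re-using per-vector
    computations that are common to different (overlapping) groups."""
    cache = {} if cache is None else cache
    results = []
    for group in feature_groups:
        total = 0.0
        for fv in group:
            if fv not in cache:
                cache[fv] = score_vector(fv)   # computed once per vector
            total += cache[fv]
        results.append(total)
    return results, cache
```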
  • Patent number: 11182566
    Abstract: A computer-implemented method for training a neural network that is configured to generate a score distribution over a set of multiple output positions. The neural network is configured to process a network input to generate a respective score distribution for each of a plurality of output positions including a respective score for each token in a predetermined set of tokens that includes n-grams of multiple different sizes. Example methods described herein provide trained neural networks that produce results with improved accuracy compared to the state of the art, e.g. more accurate translations or more accurate speech recognition.
    Type: Grant
    Filed: October 3, 2017
    Date of Patent: November 23, 2021
    Assignee: Google LLC
    Inventors: Navdeep Jaitly, Yu Zhang, Quoc V. Le, William Chan
  • Patent number: 11176926
    Abstract: Provided is a speech recognition apparatus. The apparatus includes a preprocessor configured to extract select frames from all frames of a first speech of a user, and a score calculator configured to calculate an acoustic score of a second speech, made up of the extracted select frames, by using a Deep Neural Network (DNN)-based acoustic model, and to calculate an acoustic score of frames, of the first speech, other than the select frames based on the calculated acoustic score of the second speech.
    Type: Grant
    Filed: February 20, 2020
    Date of Patent: November 16, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: In Chul Song
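Scoring only selected frames and deriving the rest can be sketched as follows; a minimal sketch, assuming each skipped frame copies the score of its nearest scored neighbour — the patent's actual derivation rule may differ, and the scoring function stands in for the DNN acoustic model.

```python
def score_all_frames(frames, select_idx, acoustic_score):
    """Run the acoustic model only on the selected frames, then fill in
    each remaining frame's score from its nearest scored neighbour
    instead of evaluating the model on every frame."""
    scores = {i: acoustic_score(frames[i]) for i in select_idx}
    out = []
    for i in range(len(frames)):
        nearest = min(select_idx, key=lambda j: abs(j - i))
        out.append(scores[nearest])
    return out
```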
  • Patent number: 11169531
    Abstract: Techniques are discussed for determining predicted trajectories based on a top-down representation of an environment. Sensors of a first vehicle can capture sensor data of an environment, which may include agent(s) separate from the first vehicle, such as a second vehicle or a pedestrian. A multi-channel image representing a top-down view of the agent(s) and the environment and comprising semantic information can be generated based on the sensor data. Semantic information may include a bounding box and velocity information associated with the agent, map data, and other semantic information. Multiple images can be generated representing the environment over time. The image(s) can be input into a prediction system configured to output a heat map comprising prediction probabilities associated with possible locations of the agent in the future. A predicted trajectory can be generated based on the prediction probabilities and output to control an operation of the first vehicle.
    Type: Grant
    Filed: October 4, 2018
    Date of Patent: November 9, 2021
    Assignee: Zoox, Inc.
    Inventors: Xi Joey Hong, Benjamin John Sapp
  • Patent number: 11170783
    Abstract: Multi-agent input coordination can be used for acoustic collaboration of multiple listening agents deployed in smart devices on a premises, improving the accuracy of identifying requests and specifying where that request should be honored, improving quality of detection, and providing better understanding of user commands and user intent throughout the premises. A processor or processors such as those in a smart speaker can identify audio requests received through at least two agents in a network and determine at which of the agents to actively process a selected audio request. The identification can make use of techniques such as location context and secondary trait analysis. The audio request can include simultaneous audio requests received through at least two agents, differing audio requests received from different requesters, or both.
    Type: Grant
    Filed: April 16, 2019
    Date of Patent: November 9, 2021
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: James Pratt, Timothy Innes, Eric Zavesky, Nigel Bradley
  • Patent number: 11164574
    Abstract: One embodiment provides a method, including: obtaining a plurality of conversational logs; generating a human agent emulator and a user emulator; providing a workspace for a conversational agent, so that an agent designer generates a conversational specification for the conversational agent, wherein the generating a conversational specification comprises: receiving a selection, by the agent designer, of at least one intent for the conversational agent, wherein the receiving a selection is responsive to the conversational agent workspace providing suggestions for intents; providing at least one suggestion for a dialog node that corresponds to the selected at least one intent; and generating a dialog flow for the conversational agent, wherein the generating comprises iteratively receiving, from the agent designer, selection of at least one aspect and receiving at least one selection of the at least one suggestion for dialog nodes; and providing the conversational agent.
    Type: Grant
    Filed: January 3, 2019
    Date of Patent: November 2, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Pankaj Dhoolia, Ajay Kumar Gupta, Danish Contractor, Dinesh Raghu, Sachindra Joshi, Vineet Kumar, Dhiraj Madan
  • Patent number: 11157795
    Abstract: Graph partitioning and placement for multi-chip neurosynaptic networks. According to various embodiments, a neural network description is read. The neural network description describes a plurality of neurons. The plurality of neurons has a mapping from an input domain of the neural network. The plurality of neurons is labeled based on the mapping from the input domain. The plurality of neurons is grouped into a plurality of groups according to the labeling. Each of the plurality of groups is continuous within the input domain. Each of the plurality of groups is assigned to at least one neurosynaptic core.
    Type: Grant
    Filed: March 13, 2017
    Date of Patent: October 26, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Arnon Amir, Pallab Datta, Myron D. Flickner, Dharmendra S. Modha, Tapan K. Nayak
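The label-group-assign pipeline can be sketched as follows; a minimal sketch, assuming each neuron already carries an input-domain label and that whole groups are assigned to cores round-robin (the patent's placement strategy is not specified here, so that rule is an illustrative assumption).

```python
def group_and_assign(neuron_labels, cores):
    """Group neurons by their input-domain label, so each group stays
    contiguous within the input domain, then assign each group to a
    neurosynaptic core (round-robin over the available cores)."""
    groups = {}
    for neuron, label in neuron_labels.items():
        groups.setdefault(label, []).append(neuron)
    assignment = {label: cores[i % len(cores)]
                  for i, label in enumerate(sorted(groups))}
    return groups, assignment
```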