Neural Network Patents (Class 704/232)
  • Patent number: 11145171
    Abstract: An electronic doorbell system can be configured to enable remote audio communication between a visitor at the doorbell and a user of a mobile computing device by exchanging speech-to-text and/or text-to-speech messages in real time. Audio captured from the visitor can be transcribed into text messages and sent to the user of the mobile device using a speech-to-text service. The user of the mobile device can send text messages to the doorbell for playback to the visitor using a text-to-speech service. The system can also use artificial intelligence to detect the language spoken by the visitor, for translating between a predetermined language of the user and the language of the visitor. The system can also include a camera for capturing video of the visitor for display on the mobile device simultaneously with the exchange of text messages between the visitor and the user, such as during a live Session Initiation Protocol (SIP) communication.
    Type: Grant
    Filed: February 28, 2019
    Date of Patent: October 12, 2021
    Assignee: Arlo Technologies, Inc.
    Inventors: Rajinder Singh, Justin Maggard, Dnyanesh Patil, Dennis Aldover, Nisheeth Gupta, Subramanian Ramamoorthy
  • Patent number: 11138385
    Abstract: A method and an apparatus for determining a semantic matching degree, where the method includes acquiring a first sentence and a second sentence, dividing the first sentence and the second sentence into x and y sentence fragments, respectively, performing a convolution operation on the word vectors in each sentence fragment of the first sentence and the word vectors in each sentence fragment of the second sentence to obtain a three-dimensional tensor, performing integration or screening on adjacent vectors among the one-dimensional vectors arranged in x rows and y columns of the tensor until the three-dimensional tensor is combined into a one-dimensional target vector, and determining a semantic matching degree between the first sentence and the second sentence according to the target vector.
    Type: Grant
    Filed: November 1, 2019
    Date of Patent: October 5, 2021
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Zhengdong Lu, Hang Li
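The fragment-matching idea in the abstract above can be illustrated with a toy sketch. This is not the patented architecture (which builds a three-dimensional tensor via convolutions and collapses it stepwise); it only shows the underlying intuition of scoring word-vector pairs across two sentences and pooling the resulting grid into a single matching score. All function names here are hypothetical.

```python
# Simplified semantic-matching sketch: build a word-pair cosine-similarity
# grid between two sentences, then pool it down to a single scalar score.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    d = (dot(u, u) * dot(v, v)) ** 0.5
    return dot(u, v) / d if d else 0.0

def match_score(sent_a, sent_b):
    """sent_a, sent_b: lists of word vectors (lists of floats).
    Returns a scalar in [-1, 1]: the mean, over words in A, of the best
    cosine match in B (a crude stand-in for convolution + pooling)."""
    grid = [[cosine(u, v) for v in sent_b] for u in sent_a]
    return sum(max(row) for row in grid) / len(grid)
```

Identical sentences score 1.0; sentences whose word vectors are all orthogonal score 0.0.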
  • Patent number: 11132995
    Abstract: An artificial intelligence server for setting a language of a robot includes a communication unit and a processor. The communication unit is configured to communicate with the robot. The processor is configured to receive voice data for a control area from the robot, generate first language distribution information using the received voice data, receive event information for the control area, generate second language distribution information using the received event information, determine at least one major output language for the robot based on the generated first language distribution information and the generated second language distribution information, and transmit a control signal for setting the determined major output language to the robot.
    Type: Grant
    Filed: September 4, 2019
    Date of Patent: September 28, 2021
    Assignee: LG ELECTRONICS INC.
    Inventors: Jonghoon Chae, Taehyun Kim
  • Patent number: 11126894
    Abstract: The present embodiments relate to analysing an image. An artificial deep neural net is pre-trained to classify images into a hierarchical system of multiple hierarchical classes. The pre-trained neural net is then adapted for one specific class, wherein the specific class is lower in the hierarchical system than an actual class of the image. The image is then processed by a forward pass through the adapted neural net to generate a processing result. An image processing algorithm is then used to analyse the processing result focused on features corresponding to the specific class.
    Type: Grant
    Filed: June 4, 2018
    Date of Patent: September 21, 2021
    Assignee: Siemens Aktiengesellschaft
    Inventors: Sanjukta Ghosh, Peter Amon, Andreas Hutter
  • Patent number: 11126890
    Abstract: Systems and methods are described for object detection within a digital image using a hierarchical softmax function. The method may include applying a first softmax function of a softmax hierarchy to a digital image based on a first set of object classes that are children of a root node of a class hierarchy, then applying second (and subsequent) softmax functions to the digital image based on second (and subsequent) sets of object classes, where the second (and subsequent) object classes are children nodes of an object class from the first (parent) set of object classes. The methods may then include generating an object recognition output using a convolutional neural network (CNN) based at least in part on applying the first and second (and subsequent) softmax functions. In some cases, the hierarchical softmax function is the loss function for the CNN.
    Type: Grant
    Filed: April 18, 2019
    Date of Patent: September 21, 2021
    Assignee: ADOBE INC.
    Inventors: Zhe Lin, Mingyang Ling, Jianming Zhang, Jason Kuen, Federico Perazzi, Brett Butterfield, Baldo Faieta
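The hierarchical softmax described in the abstract above can be sketched in a few lines. The two-level tree is hypothetical and the patent's integration with a CNN is more involved; the key property shown is that leaf probabilities factor as P(parent) × P(leaf | parent) and still sum to one.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def hierarchical_probs(root_logits, child_logits):
    """root_logits: logits over top-level classes.
    child_logits: one logit list per top-level class, over its children.
    Returns leaf probabilities P(leaf) = P(parent) * P(leaf | parent)."""
    p_root = softmax(root_logits)
    leaves = []
    for p_parent, logits in zip(p_root, child_logits):
        for p_child in softmax(logits):
            leaves.append(p_parent * p_child)
    return leaves
```

With a root split {animal, vehicle} and children {cat, dog} / {car, bus}, the four leaf probabilities always sum to 1 regardless of the logits.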
  • Patent number: 11106974
    Abstract: A technique for training a neural network including an input layer, one or more hidden layers and an output layer, in which the trained neural network can be used to perform a task such as speech recognition. In the technique, a base of the neural network having at least a pre-trained hidden layer is prepared. A parameter set associated with one pre-trained hidden layer in the neural network is decomposed into a plurality of new parameter sets. The number of hidden layers in the neural network is increased by using the plurality of the new parameter sets. Pre-training for the neural network is performed.
    Type: Grant
    Filed: July 5, 2017
    Date of Patent: August 31, 2021
    Assignee: International Business Machines Corporation
    Inventors: Takashi Fukuda, Osamu Ichikawa
  • Patent number: 11107456
    Abstract: Disclosed is an artificial intelligence (AI)-based voice sampling apparatus for providing a speech style, including a rhyme encoder configured to receive a user's voice, extract a voice sample, and analyze a vocal feature included in the voice sample, a text encoder configured to receive text for reflecting the vocal feature, a processor configured to classify the vocal feature of the voice sample input to the rhyme encoder according to a label, extract an embedding vector representing the vocal feature from the label, and generate a speech style from the embedding vector and apply the generated speech style to the text, and a rhyme decoder configured to output synthesized voice data in which the speech style is applied to the text by the processor.
    Type: Grant
    Filed: September 5, 2019
    Date of Patent: August 31, 2021
    Assignee: LG ELECTRONICS INC.
    Inventors: Jonghoon Chae, Minook Kim, Sangki Kim, Yongchul Park, Siyoung Yang, Juyeong Jang, Sungmin Han
  • Patent number: 11107482
    Abstract: The present disclosure relates to systems and methods for speech signal processing on a signal to transcribe speech. In one implementation, the system may include a memory storing instructions and a processor configured to execute the instructions. The instructions may include instructions to receive the signal, determine if at least a portion of data in the signal is missing, and when at least a portion of data is missing: process the signal using a hidden Markov model to generate an output; using the output, calculate a set of possible contents to fill a gap due to the missing data portion, with each possible content having an associated probability; based on the associated probabilities, select one of the set of possible contents; and using the selected possible content, update the signal.
    Type: Grant
    Filed: December 5, 2019
    Date of Patent: August 31, 2021
    Assignee: RingCentral, Inc.
    Inventors: Xiaoming Li, Ehtesham Khan, Santosh Panattu Sethumadhavan
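The gap-filling procedure in the abstract above (score candidate contents, then pick the most probable) can be sketched with a first-order Markov chain over observed symbols, a deliberate simplification of the hidden Markov model the patent uses. The transition table and symbol names are illustrative only.

```python
from itertools import product

def fill_gap(transitions, prev_sym, next_sym, gap_len, symbols):
    """Score every candidate fill for a gap of gap_len symbols between
    prev_sym and next_sym under a first-order Markov model, and return
    the most probable candidate with its (unnormalized) probability.
    transitions: dict mapping (a, b) -> P(b | a)."""
    best, best_p = None, -1.0
    for cand in product(symbols, repeat=gap_len):
        path = (prev_sym,) + cand + (next_sym,)
        p = 1.0
        for a, b in zip(path, path[1:]):
            p *= transitions.get((a, b), 0.0)
        if p > best_p:
            best, best_p = cand, p
    return best, best_p
```

For a one-symbol gap this reduces to maximizing P(candidate | prev) × P(next | candidate), mirroring the abstract's "set of possible contents, each with an associated probability".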
  • Patent number: 11093819
    Abstract: Disclosed herein are neural networks for generating target classifications for an object from a set of input sequences. Each input sequence includes a respective input at each of multiple time steps, and each input sequence corresponds to a different sensing subsystem of multiple sensing subsystems. For each time step in the multiple time steps and for each input sequence in the set of input sequences, a respective feature representation is generated for the input sequence by processing the respective input from the input sequence at the time step using a respective encoder recurrent neural network (RNN) subsystem for the sensing subsystem that corresponds to the input sequence. For each time step in at least a subset of the multiple time steps, the respective feature representations are processed using a classification neural network subsystem to select a respective target classification for the object at the time step.
    Type: Grant
    Filed: December 16, 2016
    Date of Patent: August 17, 2021
    Assignee: Waymo LLC
    Inventors: Congcong Li, Ury Zhilinsky, Yun Jiang, Zhaoyin Jia
  • Patent number: 11094329
    Abstract: Provided are a neural network device and a method of operating the same. The neural network device for speaker recognition may include: a memory configured to store one or more instructions; and a processor configured to, by executing the one or more instructions, generate a trained first neural network by training a first neural network for separating a mixed voice signal into individual voice signals, generate a second neural network by adding at least one layer to the trained first neural network, and generate a trained second neural network by training the second neural network for separating the mixed voice signal into the individual voice signals and for recognizing a speaker of each of the individual voice signals.
    Type: Grant
    Filed: September 18, 2018
    Date of Patent: August 17, 2021
    Assignees: SAMSUNG ELECTRONICS CO., LTD., SEOUL NATIONAL UNIVERSITY R&DB FOUNDATION
    Inventors: Sangha Park, Namsoo Kim, Hyungyong Kim, Sungchan Kang, Cheheung Kim, Yongseop Yoon, Choongho Rhee, Hyeokki Hong
  • Patent number: 11062725
    Abstract: This specification describes computer-implemented methods and systems. One method includes receiving, by a neural network of a speech recognition system, first data representing a first raw audio signal and second data representing a second raw audio signal. The first raw audio signal and the second raw audio signal describe audio occurring at a same period of time. The method further includes generating, by a spatial filtering layer of the neural network, a spatial filtered output using the first data and the second data, and generating, by a spectral filtering layer of the neural network, a spectral filtered output using the spatial filtered output. Generating the spectral filtered output comprises processing frequency-domain data representing the spatial filtered output. The method still further includes processing, by one or more additional layers of the neural network, the spectral filtered output to predict sub-word units encoded in both the first raw audio signal and the second raw audio signal.
    Type: Grant
    Filed: February 19, 2019
    Date of Patent: July 13, 2021
    Assignee: Google LLC
    Inventors: Ehsan Variani, Kevin William Wilson, Ron J. Weiss, Tara N. Sainath, Arun Narayanan
  • Patent number: 11049005
    Abstract: Aspects of the subject disclosure may include, for example, embodiments provisioning a neural network comprising a plurality of layers. Further embodiments include provisioning a plurality of Markov logic state machines among the plurality of layers of the neural network resulting in a machine learning application. Additional embodiments include training the machine learning application using historical network video traffic resulting in a trained machine learning application. Also, embodiments include receiving current network video traffic. Embodiments include provisioning network resources to route the current network video traffic according to the trained machine learning application. Other embodiments are disclosed.
    Type: Grant
    Filed: March 22, 2017
    Date of Patent: June 29, 2021
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Raghuraman Gopalan
  • Patent number: 11048976
    Abstract: A method includes: capturing one or more images of an unorganized collection of items inside a first machine; determining one or more item types of the unorganized collection of items from the one or more images, comprising: dividing a respective image in the one or more images into a respective plurality of sub-regions; performing feature detection on the respective plurality of sub-regions to obtain a respective plurality of regional feature vectors, wherein a regional feature vector for a sub-region indicates characteristics for a plurality of predefined local item features for the sub-region; generating an integrated feature vector by combining the respective plurality of regional feature vectors; and applying a plurality of binary classifiers to the integrated feature vector; and selecting a machine setting for the first machine based on the one or more determined clothes types in the unorganized collection of items.
    Type: Grant
    Filed: November 11, 2019
    Date of Patent: June 29, 2021
    Assignee: MIDEA GROUP CO., LTD.
    Inventors: Yunke Tian, Thanh Huy Ha, Zhicai Ou
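The regional-feature pipeline above (split the image into sub-regions, compute per-region features, concatenate, classify) can be sketched as follows. The (mean, max) features and the linear classifier are stand-ins chosen for brevity, not the patent's actual feature detectors or classifiers.

```python
def regional_features(image, rows, cols):
    """image: 2-D list of intensity values. Split it into rows x cols
    sub-regions, compute a (mean, max) feature pair per region, and
    concatenate the pairs into one integrated feature vector."""
    h, w = len(image), len(image[0])
    rh, rw = h // rows, w // cols
    feats = []
    for r in range(rows):
        for c in range(cols):
            block = [image[y][x]
                     for y in range(r * rh, (r + 1) * rh)
                     for x in range(c * rw, (c + 1) * rw)]
            feats.extend([sum(block) / len(block), max(block)])
    return feats

def binary_classifier(weights, bias):
    """One linear one-vs-rest classifier over the integrated vector;
    the patent applies a plurality of these, one per item type."""
    def predict(feats):
        return sum(w * f for w, f in zip(weights, feats)) + bias > 0
    return predict
```

A 2x2 image split into four single-pixel regions yields an 8-element integrated vector (mean and max per region).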
  • Patent number: 11031018
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for personalized speaker verification are provided. One of the methods includes: obtaining first speech data of a speaker as a positive sample and second speech data of an entity different from the speaker as a negative sample; feeding the positive sample and the negative sample to a first model for determining voice characteristics to correspondingly output a positive voice characteristic and a negative voice characteristic of the speaker; obtaining a gradient based at least on the positive voice characteristic and the negative voice characteristic; and feeding the gradient to the first model to update one or more parameters of the first model to obtain a second model for personalized speaker verification.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: June 8, 2021
    Assignee: ALIPAY (HANGZHOU) INFORMATION TECHNOLOGY CO., LTD.
    Inventors: Zhiming Wang, Kaisheng Yao, Xiaolong Li
  • Patent number: 11030750
    Abstract: Approaches for the automatic segmentation of magnetic resonance (MR) images. Machine learning models segment images to identify image features in consecutive frames at different levels of resolution. A first neural network block is applied to groups of MR images to produce primary feature maps at two or more levels of resolution. The images in a given group of MR images may correspond to a cycle and have a temporal order. A second neural network block is applied to the primary feature maps to produce two or more output tensors at corresponding levels of resolution. A segmentation block is applied to the two or more output tensors to produce a probability map for the MR images. The first neural network block may be a convolutional neural network (CNN) block. The second neural network block may be a convolutional long short-term memory (LSTM) block.
    Type: Grant
    Filed: May 30, 2019
    Date of Patent: June 8, 2021
    Assignees: Merck Sharp & Dohme Corp., MSD International GmbH
    Inventors: Antong Chen, Dongqing Zhang, Ilknur Icke, Belma Dogdas, Sarayu Parimal
  • Patent number: 11024300
    Abstract: Provided are an electronic device and a control method. The electronic device comprises: a storage unit for storing a user-based dictionary; an input unit for receiving an input sentence including a user-specific word and at least one word learned by a neural network-based language model; and a processor for determining a concept category of the user-specific word on the basis of semantic information of the input sentence, adding the user-specific word to the user-based dictionary to perform update, and when text corresponding to semantic information of the at least one learned word is input, providing the user-specific word as an autocomplete recommendation word which can be input subsequent to the text.
    Type: Grant
    Filed: March 9, 2017
    Date of Patent: June 1, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Dae-Hyun Ban, Woo-Jin Park, Kyu-Haeng Lee, Sang-Soon Lim, Seong-Won Han
  • Patent number: 11004453
    Abstract: Techniques for avoiding wake word self-triggering are provided. In one embodiment, an electronic device can receive an audio-out signal to be output as audio via a speaker of the device and can attempt to recognize a wake word in the audio-out signal using a first recognizer. If the wake word is recognized in the audio-out signal, the electronic device can further determine whether a wake word match is made using a second recognizer with respect to a mic-in audio signal captured via a microphone of the device at approximately the same time that the audio-out signal is output via the speaker. If so, the electronic device can ignore the wake word match made using the second recognizer.
    Type: Grant
    Filed: April 4, 2018
    Date of Patent: May 11, 2021
    Assignee: Sensory, Incorporated
    Inventor: Erich Adams
  • Patent number: 11004461
    Abstract: Embodiments of the present systems and methods may provide techniques for extracting vocal features from voice signals to determine an emotional or mental state of one or more persons, such as to determine a risk of suicide and other mental health issues. For example, because a person's mental state may indirectly alter his or her speech, suicidal risk in, for example, hotline calls may be determined through speech analysis. In embodiments, such techniques may include preprocessing of the original recording, vocal feature extraction, and prediction processing. For example, in an embodiment, a computer-implemented method of determining an emotional or mental state of a person may comprise acquiring an audio signal relating to a conversation including the person, extracting signal components relating to an emotional or mental state of at least the person, and outputting information characterizing the extracted emotional or mental state of the person.
    Type: Grant
    Filed: August 20, 2018
    Date of Patent: May 11, 2021
    Inventor: Newton Howard
  • Patent number: 10997525
    Abstract: A method and system of creating a model for large scale data analytics is provided. Training data is received in the form of a data matrix X and partitioned into a plurality of partitions. A random matrix T is generated. A feature matrix is determined based on multiplying the partitioned training data by the random matrix T. Predicted data ỹ is determined for each partition via a stochastic average gradient (SAG) of each partition. A number of SAG values is reduced based on the number of rows n in the data matrix X. For each iteration, a sum of the reduced SAG values is determined, as well as a full gradient based on the sum of the reduced SAG values from all rows n, by distributed parallel processing. The model parameters w are updated based on the full gradient for each partition.
    Type: Grant
    Filed: November 20, 2017
    Date of Patent: May 4, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shen Li, Xiang Ni, Michael John Witbrock, Lingfei Wu
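The stochastic average gradient (SAG) step at the heart of the abstract above keeps the last-seen gradient for every sample and moves against the average of the stored gradients. A minimal single-machine, one-dimensional sketch (not the patent's distributed, random-feature version; the learning rate and problem are illustrative):

```python
import random

def sag_least_squares(xs, ys, lr=0.02, iters=2000, seed=0):
    """SAG for 1-D least squares: minimize (1/n) * sum_i (w*x_i - y_i)^2.
    Maintains a per-sample gradient memory and steps with the running
    average of the stored gradients (the SAG trick)."""
    rng = random.Random(seed)
    n = len(xs)
    w = 0.0
    grads = [0.0] * n          # last-seen gradient per sample
    g_sum = 0.0                # running sum of stored gradients
    for _ in range(iters):
        i = rng.randrange(n)
        g_new = 2.0 * xs[i] * (w * xs[i] - ys[i])
        g_sum += g_new - grads[i]   # swap old gradient for new one
        grads[i] = g_new
        w -= lr * g_sum / n         # step against the average gradient
    return w
```

On data generated by y = 2x, the estimate converges close to 2. The patent additionally reduces the number of stored SAG values and sums them in distributed parallel fashion, which this sketch omits.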
  • Patent number: 10997966
    Abstract: Disclosed are a speech recognition method, a device and a computer readable storage medium. The speech recognition method comprises: performing acoustic characteristic extraction on an input voice, so as to obtain an acoustic characteristic (S11); acquiring an acoustic model, a parameter of the acoustic model being a binarization parameter (S12); and according to the acoustic characteristic and the acoustic model, performing speech recognition (S13).
    Type: Grant
    Filed: January 25, 2017
    Date of Patent: May 4, 2021
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Ke Ding, Bing Jiang, Xiangang Li
  • Patent number: 10984785
    Abstract: Provided is a voice conversation method using a bi-directional LSTM (Bi-LSTM) memory network.
    Type: Grant
    Filed: June 20, 2019
    Date of Patent: April 20, 2021
    Assignee: SOGANG UNIVERSITY RESEARCH FOUNDATION
    Inventors: Myoung-Wan Koo, Byoungjae Kim, Jung Yun Seo
  • Patent number: 10963783
    Abstract: Technologies for optimization of machine learning training include a computing device to train a machine learning network with a training algorithm that is configured with configuration parameters. The computing device may perform many training instances in parallel. The computing device captures a time series of partial accuracy values from the training. Each partial accuracy value is indicative of machine learning network accuracy at an associated training iteration. The computing device inputs the configuration parameters to a feed-forward neural network to generate a representation and inputs the representation to a recurrent neural network. The computing device trains the feed-forward neural network and the recurrent neural network against the partial accuracy values. The computing device optimizes the feed-forward neural network and the recurrent neural network to determine optimized configuration parameters.
    Type: Grant
    Filed: February 19, 2017
    Date of Patent: March 30, 2021
    Assignee: INTEL CORPORATION
    Inventors: Lev Faivishevsky, Amitai Armon
  • Patent number: 10957342
    Abstract: An audio processing apparatus, comprising: a first receiver configured to receive one or more audio signals derived from one or more microphones, the one or more audio signals comprising a speech component received from a user and a first noise component transmitted by a first device; a second receiver configured to receive over a network and from the first device, first audio data corresponding to the first noise component; one or more processors configured to: remove the first noise component from the one or more audio signals using the first audio data to generate a first processed audio signal; and perform speech recognition on the first processed audio signal to generate a first speech result.
    Type: Grant
    Filed: January 16, 2019
    Date of Patent: March 23, 2021
    Assignee: Cirrus Logic, Inc.
    Inventors: Kieran Reed, Krishna Kongara, Aengus Westhead, Hock Lim
  • Patent number: 10937438
    Abstract: Systems, methods, and devices for speech transformation and generating synthetic speech using deep generative models are disclosed. A method of the disclosure includes receiving input audio data comprising a plurality of iterations of a speech utterance from a plurality of speakers. The method includes generating an input spectrogram based on the input audio data and transmitting the input spectrogram to a neural network configured to generate an output spectrogram. The method includes receiving the output spectrogram from the neural network and, based on the output spectrogram, generating synthetic audio data comprising the speech utterance.
    Type: Grant
    Filed: March 29, 2018
    Date of Patent: March 2, 2021
    Assignee: FORD GLOBAL TECHNOLOGIES, LLC
    Inventors: Praveen Narayanan, Lisa Scaria, Francois Charette, Ashley Elizabeth Micks, Ryan Burke
  • Patent number: 10936227
    Abstract: Techniques recognize reducible contents in data to be written. The techniques involve receiving information related to data to be written, the information indicating that the data to be written comprises reducible contents, the reducible contents comprising data with a first reduction pattern. The techniques further involve recognizing the reducible contents in the data to be written based on the information. The techniques further involve reducing the reducible contents based on the first reduction pattern. With such techniques, active I/O pattern recognition with communication between applications and storage devices may be accomplished. In addition, such techniques make it simple to add new recognizable patterns, and the I/O pattern limitations of standard approaches no longer apply.
    Type: Grant
    Filed: February 11, 2019
    Date of Patent: March 2, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Lifeng Yang, Xinlei Xu, Xiongcheng Li, Changyu Feng, Ruiyong Jia
  • Patent number: 10929754
    Abstract: A method for training an endpointer model uses short-form speech utterances and long-form speech utterances. The method also includes providing a short-form speech utterance as input to a shared neural network, the shared neural network configured to learn shared hidden representations suitable for both voice activity detection (VAD) and end-of-query (EOQ) detection. The method also includes generating, using a VAD classifier, a sequence of predicted VAD labels and determining a VAD loss by comparing the sequence of predicted VAD labels to a corresponding sequence of reference VAD labels. The method also includes generating, using an EOQ classifier, a sequence of predicted EOQ labels and determining an EOQ loss by comparing the sequence of predicted EOQ labels to a corresponding sequence of reference EOQ labels. The method also includes training, using a cross-entropy criterion, the endpointer model based on the VAD loss and the EOQ loss.
    Type: Grant
    Filed: December 11, 2019
    Date of Patent: February 23, 2021
    Assignee: Google LLC
    Inventors: Shuo-yiin Chang, Bo Li, Gabor Simko, Maria Carolina Parada San Martin, Sean Matthew Shannon
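The joint training objective described above can be sketched as two frame-level cross-entropies, one per classifier head, summed. The relative weighting of the EOQ term is an assumption; the abstract only states that the model is trained on both losses with a cross-entropy criterion.

```python
import math

def cross_entropy(pred_probs, labels):
    """Mean frame-level binary cross-entropy for one label sequence.
    pred_probs: P(label=1) per frame; labels: 0/1 per frame."""
    eps = 1e-12
    total = 0.0
    for p, y in zip(pred_probs, labels):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

def endpointer_loss(vad_probs, vad_labels, eoq_probs, eoq_labels,
                    eoq_weight=1.0):
    """Joint objective: VAD loss plus weighted EOQ loss (the weighting
    is a hypothetical knob, not specified in the abstract)."""
    return cross_entropy(vad_probs, vad_labels) + \
           eoq_weight * cross_entropy(eoq_probs, eoq_labels)
```

Gradients from both terms flow back into the shared network, which is what lets one model learn representations useful for both VAD and EOQ detection.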
  • Patent number: 10922322
    Abstract: According to some aspects, a method of searching for content in response to a user voice query is provided. The method may comprise receiving the user voice query, performing speech recognition to generate N best speech recognition results comprising a first speech recognition result, performing a supervised search of at least one content repository to identify one or more supervised search results using one or more classifiers that classify the first speech recognition result into at least one class that identifies previously classified content in the at least one content repository, performing an unsupervised search of the at least one content repository to identify one or more unsupervised search results, wherein performing the unsupervised search comprises performing a word search of the at least one content repository, and generating combined results from among the one or more supervised search results and the one or more unsupervised search results.
    Type: Grant
    Filed: July 22, 2014
    Date of Patent: February 16, 2021
    Assignee: Nuance Communications, Inc.
    Inventors: Jan Kleindienst, Ladislav Kunc, Martin Labsky, Tomas Macek
  • Patent number: 10916238
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: February 9, 2021
    Assignee: Google LLC
    Inventors: Georg Heigold, Erik Mcdermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani
  • Patent number: 10909976
    Abstract: A speech recognition device includes: an acoustic model based on an End-to-End neural network responsive to an observed sequence formed of prescribed acoustic features obtained from a speech signal by a feature extracting unit, for calculating the probability of the observed sequence being a certain symbol sequence; and a decoder responsive to a symbol sequence candidate, for decoding a speech signal by a WFST based on a posterior probability of each of the word sequences corresponding to the symbol sequence candidate, probabilities calculated by the acoustic model for symbol sequences selected based on an observed sequence, and a posterior probability of each of the plurality of symbol sequences.
    Type: Grant
    Filed: June 2, 2017
    Date of Patent: February 2, 2021
    Assignee: National Institute of Information and Communications Technology
    Inventor: Naoyuki Kanda
  • Patent number: 10902350
    Abstract: The present teaching relates to a method, system, and medium for generating training data for a relationship identification model. Sentences are received as input. Each of the sentences is aligned with a previously stored fact to create an alignment. Confidence scores for the alignments are computed and then used, together with the alignments, to train a relationship identification model.
    Type: Grant
    Filed: July 20, 2018
    Date of Patent: January 26, 2021
    Assignee: Verizon Media Inc.
    Inventors: Siddhartha Banerjee, Kostas Tsioutsiouliklis
  • Patent number: 10902845
    Abstract: Techniques for adapting a trained neural network acoustic model, comprising using at least one computer hardware processor to perform: generating initial speaker information values for a speaker; generating first speech content values from first speech data corresponding to a first utterance spoken by the speaker; processing the first speech content values and the initial speaker information values using the trained neural network acoustic model; recognizing, using automatic speech recognition, the first utterance based, at least in part on results of the processing; generating updated speaker information values using the first speech data and at least one of the initial speaker information values and/or information used to generate the initial speaker information values; and recognizing, based at least in part on the updated speaker information values, a second utterance spoken by the speaker.
    Type: Grant
    Filed: July 1, 2019
    Date of Patent: January 26, 2021
    Assignee: Nuance Communications, Inc.
    Inventors: Puming Zhan, Xinwei Li
  • Patent number: 10891318
    Abstract: A method for temporal logic fusion can include steps of: receiving a plurality of inputs for a plurality of behavior classes, the inputs consisting of single- or multi-dimensional states sampled over time; computing a distance metric pairwise among the inputs, the computing being performed using dynamic time warping; mapping the high-dimensional input signals into a 2-dimensional (2-D) space using t-distributed Stochastic Neighbor Embedding, the pairwise computation from the computing step being used as the distance metric required to perform this mapping; clustering the high-dimensional input signals in the 2-D space via a k-means clustering algorithm; generating a signal temporal logic (STL) expression that distinguishes between a cluster in a behavior class and all high-dimensional input signals not in that behavior class; and repeating the generating step for each cluster in that behavior class.
    Type: Grant
    Filed: February 22, 2019
    Date of Patent: January 12, 2021
    Assignee: United States of America as represented by the Secretary of the Navy
    Inventor: Andrew K. Winn
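The pairwise distance metric named in the abstract is dynamic time warping. A minimal self-contained version for 1-D sequences (the patent applies it to multi-dimensional sampled states) looks like this:

```python
# Classic dynamic-time-warping distance between two 1-D sequences, the
# pairwise metric the abstract feeds into t-SNE as a precomputed distance.

def dtw_distance(a, b):
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j] = minimal alignment cost of a[:i] against b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]
```

In the pipeline the abstract describes, the resulting pairwise matrix would be handed to t-SNE (e.g. `metric="precomputed"` in scikit-learn) before k-means clustering in the 2-D space.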
  • Patent number: 10885905
    Abstract: A method includes, for each model of multiple models, evaluating a model prediction accuracy based on a dataset of a user over a first time duration. The dataset includes a sequence of actions with corresponding contexts based on electronic device interactions. Each model is trained to predict a next action at a time point within the first time duration, based on a first behavior sequence over a first time period from the dataset before the time point, a second behavior sequence over a second time period from the dataset before the time point, and context at the time point. A model is selected from the multiple models based on its model prediction accuracy for the user based on a domain. An action to be initiated at a later time using an electronic device of the user is recommended using the selected model during a second time duration.
    Type: Grant
    Filed: December 28, 2018
    Date of Patent: January 5, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Vijay Srinivasan, Hongxia Jin
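The core selection step, picking the model whose next-action predictions score best on a user's history, can be sketched as follows. The model names and toy predictors here are illustrative, not from the patent:

```python
# Hypothetical sketch: score each candidate model's next-action predictions
# over a per-user history window and select the most accurate model.

def select_model(models, history):
    """models: name -> predict(context) callable;
    history: list of (context, actual_action) pairs."""
    def accuracy(predict):
        hits = sum(1 for ctx, action in history if predict(ctx) == action)
        return hits / len(history)
    return max(models, key=lambda name: accuracy(models[name]))

# toy stand-ins for trained models
models = {
    "short_term": lambda ctx: "open_app",  # always predicts one action
    "long_term": lambda ctx: ctx,          # echoes the context
}
history = [("play_music", "play_music"), ("open_app", "open_app")]
```

The selected model would then drive recommendations during the second time duration.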
  • Patent number: 10878806
    Abstract: A system and computer-executable program code for accelerated rescoring with recurrent neural net language models (RNNLMs) on hybrid CPU/GPU computer systems utilizes delayed, frame-wise (or layer-wise) batch dispatch of RNNLM scoring tasks to the GPU(s), while performing substantially all other tasks on the CPU(s).
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: December 29, 2020
    Assignee: Medallia, Inc.
    Inventor: Haozheng Li
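The delayed, frame-wise batch dispatch the abstract describes can be illustrated with a queue that collects scoring requests on the CPU side and evaluates them in one batched call, standing in for a single GPU launch per frame. This is a hypothetical sketch of the dispatch pattern only, not the patented system:

```python
# Hypothetical sketch of delayed batch dispatch: per-hypothesis RNNLM scoring
# requests accumulate during a frame and are evaluated in one batched call.

class BatchedScorer:
    def __init__(self, score_batch_fn):
        self.score_batch_fn = score_batch_fn  # batched "GPU" scorer
        self.pending = []                     # queued (hypothesis_id, word)

    def request(self, hyp_id, word):
        """CPU side: record a scoring task instead of scoring immediately."""
        self.pending.append((hyp_id, word))

    def flush(self):
        """End of frame: one batched evaluation for all pending requests."""
        words = [w for _, w in self.pending]
        scores = self.score_batch_fn(words)
        results = {h: s for (h, _), s in zip(self.pending, scores)}
        self.pending = []
        return results

# toy batched "RNNLM": longer words get lower log-probability
scorer = BatchedScorer(lambda ws: [-float(len(w)) for w in ws])
scorer.request(0, "the")
scorer.request(1, "them")
out = scorer.flush()
```

Batching this way amortizes the per-call overhead that dominates when each hypothesis is scored individually on a GPU.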
  • Patent number: 10878336
    Abstract: Technologies for detecting minority events are disclosed. By performing a guided hierarchical classification algorithm with a decision tree structure and grouping the minority class(es) in with some of the majority classes, large majority classes may be separated from a minority class without requiring good detection of the minority events by themselves. The decision tree structure may be used only for the purpose of identifying if the data sample in question is a member of a minority class. If it is determined that it is not, a primary classification algorithm may be used. With this approach, the guided hierarchical classification algorithm need not perform as well as the primary classification algorithm for the majority events, but may provide improved detection for minority events.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: December 29, 2020
    Assignee: Intel Corporation
    Inventors: Varvara Kollia, Ramune Nagisetty
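The two-stage structure the abstract describes, a coarse gate that only decides whether a sample might belong to a minority class, with the primary classifier handling everything else, can be sketched with toy models (the thresholds and models below are illustrative assumptions):

```python
# Hypothetical sketch: a guided gate first flags samples that *might* fall in
# the minority-plus-lookalike group; only flagged samples bypass the primary
# majority-class classifier.

def classify(sample, gate, minority_model, primary_model):
    if gate(sample):                  # sample is in the minority-plus group
        return minority_model(sample)
    return primary_model(sample)      # ordinary majority-class path

# toy stand-ins: values above 10 are "rare" events
gate = lambda x: x > 5                # coarse, errs on the inclusive side
minority_model = lambda x: "rare" if x > 10 else "common"
primary_model = lambda x: "common"
```

The gate does not need to match the primary classifier's accuracy on majority events; it only needs to avoid losing minority samples, which is why grouping minority classes with similar majority classes is tolerable.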
  • Patent number: 10878318
    Abstract: Computer-implemented techniques can include obtaining, by a client computing device, a digital media item and a request for a processing task on the digital item and determining a set of operating parameters based on (i) available computing resources at the client computing device and (ii) a condition of a network. Based on the set of operating parameters, the client computing device or a server computing device can select one of a plurality of artificial neural networks (ANNs), each ANN defining which portions of the processing task are to be performed by the client and server computing devices. The client and server computing devices can coordinate processing of the processing task according to the selected ANN. The client computing device can also obtain final processing results corresponding to a final evaluation of the processing task and generate an output based on the final processing results.
    Type: Grant
    Filed: March 28, 2016
    Date of Patent: December 29, 2020
    Assignee: Google LLC
    Inventors: Matthew Sharifi, Jakob Nicolaus Foerster
  • Patent number: 10872122
    Abstract: A knowledge management system includes: a default knowledge system including: a knowledge system and a knowledge database in communication with the knowledge system; and a knowledge store in communication with the default knowledge system and including: a taxonomy amendment, an annotation amendment, a canonicalization amendment, an ecosystem amendment, a term amendment, and a phrase amendment.
    Type: Grant
    Filed: January 30, 2018
    Date of Patent: December 22, 2020
    Assignee: GOVERNMENT OF THE UNITED STATES OF AMERICA, AS REPRESENTED BY THE SECRETARY OF COMMERCE
    Inventors: John Elliott, Talapady N. Bhat, Ursula R. Kattner, Carelyn E. Campbell, Ram D. Sriram, Eswaran Subrahmanian, Jacob Collard, Ira Monarch
  • Patent number: 10861441
    Abstract: A method of attention-based end-to-end (E2E) automatic speech recognition (ASR) training, includes performing cross-entropy training of a model, based on one or more input features of a speech signal, performing beam searching of the model of which the cross-entropy training is performed, to generate an n-best hypotheses list of output hypotheses, and determining a one-best hypothesis among the generated n-best hypotheses list. The method further includes determining a character-based gradient and a word-based gradient, based on the model of which the cross-entropy training is performed and a loss function in which a distance between a reference sequence and the determined one-best hypothesis is maximized, and performing backpropagation of the determined character-based gradient and the determined word-based gradient to the model, to update the model.
    Type: Grant
    Filed: February 14, 2019
    Date of Patent: December 8, 2020
    Assignee: TENCENT AMERICA LLC
    Inventors: Peidong Wang, Jia Cui, Chao Weng, Dong Yu
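The loss the abstract describes is built on a distance between the reference sequence and the one-best hypothesis. At the word level that distance is ordinarily the Levenshtein edit distance; a minimal version (illustrative, not the patented training procedure) is:

```python
# Word-level Levenshtein distance between a reference and a hypothesis, the
# kind of sequence distance such a discriminative loss is built on.

def word_edit_distance(ref, hyp):
    n, m = len(ref), len(hyp)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i  # delete all remaining reference words
    for j in range(m + 1):
        d[0][j] = j  # insert all remaining hypothesis words
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution
    return d[n][m]
```

A character-level variant of the same recurrence over character sequences would supply the character-based signal the abstract pairs with the word-based one.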
  • Patent number: 10839792
    Abstract: A method (and structure and computer product) for learning Out-of-Vocabulary (OOV) words in an Automatic Speech Recognition (ASR) system includes using an Acoustic Word Embedding Recurrent Neural Network (AWE RNN) to receive a character sequence for a new OOV word for the ASR system, the RNN providing an Acoustic Word Embedding (AWE) vector as an output thereof. The AWE vector output from the AWE RNN is provided as an input into an Acoustic Word Embedding-to-Acoustic-to-Word Neural Network (AWE→A2W NN) trained to provide an OOV word weight value from the AWE vector. The OOV word weight is inserted into a listing of Acoustic-to-Word (A2W) word embeddings used by the ASR system to output recognized words from an input of speech acoustic features, wherein the OOV word weight is inserted into the A2W word embeddings list relative to existing weights in the A2W word embeddings list.
    Type: Grant
    Filed: February 5, 2019
    Date of Patent: November 17, 2020
    Assignees: INTERNATIONAL BUSINESS MACHINES CORPORATION, TOYOTA TECHNOLOGICAL INSTITUTE AT CHICAGO
    Inventors: Kartik Audhkhasi, Karen Livescu, Michael Picheny, Shane Settle
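The final step, inserting the new word's weight row into the A2W embedding list "relative to existing weights", is not spelled out in the abstract; one plausible reading is rescaling the new row to match the norm of existing rows so its scores stay comparable. That rescaling choice is an assumption of this sketch:

```python
# Hypothetical sketch: map an AWE-derived vector to an output-layer weight
# row and append it to the A2W embedding list, rescaled (an assumption, not
# stated in the abstract) to the average norm of the existing rows.

def insert_oov_weight(a2w_weights, vocab, new_word, awe_vector):
    norms = [sum(v * v for v in row) ** 0.5 for row in a2w_weights]
    target = sum(norms) / len(norms)           # average existing row norm
    norm = sum(v * v for v in awe_vector) ** 0.5 or 1.0
    row = [v * target / norm for v in awe_vector]
    a2w_weights.append(row)
    vocab.append(new_word)
    return a2w_weights, vocab

weights = [[3.0, 4.0], [0.0, 5.0]]   # existing rows, each of norm 5
vocab = ["hello", "world"]
weights, vocab = insert_oov_weight(weights, vocab, "zyzzyva", [1.0, 0.0])
```

After insertion, the A2W model can emit the new word directly without retraining the whole output layer.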
  • Patent number: 10839790
    Abstract: Exemplary embodiments relate to improvements to neural networks for translation and other sequence-to-sequence tasks. A convolutional neural network may include multiple blocks, each having a convolution layer and gated linear units; gating may determine what information passes through to the next block level. Residual connections, which add the input of a block back to its output, may be applied around each block. Further, an attention may be applied to determine which word is most relevant to translate next. By applying repeated passes of the attention to multiple layers of the decoder, the decoder is able to work on the entire structure of a sentence at once (with no temporal dependency). In addition to better accuracy, this configuration is better at capturing long-range dependencies, better models the hierarchical syntax structure of a sentence, and is highly parallelizable and thus faster to run on hardware.
    Type: Grant
    Filed: December 20, 2017
    Date of Patent: November 17, 2020
    Assignee: FACEBOOK, INC.
    Inventors: Jonas Gehring, Michael Auli, Yann Nicolas Dauphin, David G. Grangier, Dzianis Yarats
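The gated linear unit at the heart of each block splits the convolution output into two halves along the channel dimension and lets one half, squashed through a sigmoid, gate the other. A minimal sketch over a flat channel vector:

```python
# Minimal gated linear unit: split the channels in half, use sigmoid(b) as a
# gate on a, so the network controls what information flows to the next block.

import math

def glu(x):
    """x: list of 2k channel values -> k gated outputs a_i * sigmoid(b_i)."""
    k = len(x) // 2
    a, b = x[:k], x[k:]
    return [ai * (1.0 / (1.0 + math.exp(-bi))) for ai, bi in zip(a, b)]
```

In the architecture the abstract describes, this gating is applied after each convolution, with residual connections adding the block input back to this output.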
  • Patent number: 10839288
    Abstract: According to an embodiment, a training device trains a neural network that outputs a posterior probability that an input signal belongs to a particular class. An output layer of the neural network includes N units respectively corresponding to classes and one additional unit. The training device includes a propagator, a probability calculator, and an updater. The propagator supplies a sample signal to the neural network and acquires (N+1) input values for each unit at the output layer. The probability calculator supplies the input values to a function to generate a probability vector including (N+1) probability values respectively corresponding to the units at the output layer. The updater updates a parameter included in the neural network in such a manner to reduce an error between a teacher vector including (N+1) target values and the probability vector. A target value corresponding to the additional unit is a predetermined constant value.
    Type: Grant
    Filed: September 6, 2016
    Date of Patent: November 17, 2020
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Yu Nasu
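The output layer described here has N class units plus one additional unit whose teacher target is a fixed constant. A sketch of the (N+1)-way probability vector and a matching teacher vector follows; the choice to shrink the one-hot mass so the target still sums to 1 is this sketch's assumption, not stated in the abstract:

```python
# Hypothetical sketch: softmax over N class units plus one additional unit
# whose teacher target is a predetermined constant.

import math

def probability_vector(logits):
    """Numerically stable softmax over the (N+1) output-layer inputs."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def teacher_vector(true_class, n_classes, extra_target=0.1):
    """(N+1)-dim target: scaled one-hot over the N classes plus the constant."""
    t = [0.0] * (n_classes + 1)
    t[true_class] = 1.0 - extra_target  # assumption: keep the vector summing to 1
    t[-1] = extra_target
    return t
```

Training would then minimize the error between `teacher_vector(...)` and `probability_vector(...)` and update the network parameters by backpropagation.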
  • Patent number: 10832129
    Abstract: A method for transferring acoustic knowledge of a trained acoustic model (AM) to a neural network (NN) includes reading, into memory, the NN and the AM, the AM being trained with target domain data, and a set of training data including a set of phoneme data, the set of training data being data obtained from a domain different from a target domain for the target domain data, inputting training data from the set of training data into the AM, calculating one or more posterior probabilities of context-dependent states corresponding to phonemes in a phoneme class of a phoneme to which each frame in the training data belongs, and generating a posterior probability vector from the one or more posterior probabilities, as a soft label for the NN, and inputting the training data into the NN and updating the NN, using the soft label.
    Type: Grant
    Filed: October 7, 2016
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Takashi Fukuda, Masayuki A. Suzuki, Ryuki Tachibana
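The knowledge-transfer step uses the trained acoustic model's posterior probability vector as a soft label for the student network. The standard way to score the student against such a soft label is cross-entropy; a minimal sketch:

```python
# Minimal knowledge-distillation scoring: the teacher AM's posterior vector
# is the soft label, and the student is penalized by cross-entropy against it.

import math

def soft_label_cross_entropy(teacher_posteriors, student_posteriors):
    eps = 1e-12  # guard against log(0)
    return -sum(t * math.log(s + eps)
                for t, s in zip(teacher_posteriors, student_posteriors))
```

Each training frame contributes one such term, and the student NN is updated to reduce the total, pulling its posteriors toward the teacher's even though the training data comes from a different domain.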
  • Patent number: 10817608
    Abstract: Disclosed is a computer-implemented method for malware detection that analyzes a file on a per-packet basis. The method receives a packet of one or more packets associated with a file, converts the binary content associated with the packet into a digital representation, and tokenizes plain-text content associated with the packet. The method extracts one or more n-gram features, an entropy feature, and a domain feature from the converted content of the packet and applies a trained machine learning model to the one or more features extracted from the packet. The output of the machine learning model is a probability of maliciousness associated with the received packet. If the probability of maliciousness is above a threshold value, the method determines that the file associated with the received packet is malicious.
    Type: Grant
    Filed: April 5, 2018
    Date of Patent: October 27, 2020
    Assignee: Zscaler, Inc.
    Inventors: Huihsin Tseng, Hao Xu, Jian L. Zhen
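Two of the per-packet features the abstract names are easy to make concrete: Shannon entropy of the packet bytes (high entropy often indicates packed or encrypted payloads) and byte n-gram counts. A self-contained sketch, illustrative rather than the patented feature set:

```python
# Hypothetical sketch of two per-packet features: Shannon entropy of the
# bytes and a bag of byte n-grams.

import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, ranging from 0.0 to 8.0."""
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Counts of overlapping byte n-grams in the packet."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))
```

These feature values, along with a domain feature, would be fed to the trained model to produce the per-packet probability of maliciousness.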
  • Patent number: 10803872
    Abstract: An information processing apparatus includes: a speech obtainer which obtains speech of a user; a first controller which, when the first controller recognizes that the speech obtained by the speech obtainer is a first activation word, outputs a speech signal corresponding to the first activation word; and a second controller. In the first speech transmission process, in which the speech signal of the speech obtained by the speech obtainer is transmitted to the VPA cloud server, the first controller determines whether to output a speech signal corresponding to a second activation word to the second controller based on a predetermined priority level when the first controller recognizes that the speech obtained by the speech obtainer indicates the second activation word for causing the second controller to start a second speech transmission process.
    Type: Grant
    Filed: February 2, 2018
    Date of Patent: October 13, 2020
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Masayuki Kozuka, Tomoki Ogawa, Yoshihiro Mori
  • Patent number: 10789949
    Abstract: An audio device with at least one microphone adapted to receive sound from a sound field and create an output, and a processing system that is responsive to the output of the microphone. The processing system is configured to use a signal processing algorithm to detect a wakeup word, and modify the signal processing algorithm that is used to detect the wakeup word if the sound field changes.
    Type: Grant
    Filed: June 20, 2017
    Date of Patent: September 29, 2020
    Assignee: Bose Corporation
    Inventors: Ricardo Carreras, Alaganandan Ganeshkumar
  • Patent number: 10782726
    Abstract: In one embodiment, a computer program product for optimizing core utilization in a neurosynaptic network includes a computer readable storage medium having program instructions embodied therewith, where the computer readable storage medium is not a transitory signal per se, and where the program instructions are executable by a processor to cause the processor to perform a method including identifying, by the processor, one or more unused portions of a neurosynaptic network, and for each of the one or more unused portions of the neurosynaptic network, disconnecting, by the processor, the unused portion from the neurosynaptic network.
    Type: Grant
    Filed: April 2, 2019
    Date of Patent: September 22, 2020
    Assignee: International Business Machines Corporation
    Inventors: Arnon Amir, Pallab Datta, Nimrod Megiddo, Dharmendra S. Modha
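The pruning idea, disconnecting portions of the network that contribute nothing downstream, can be illustrated as reachability analysis on the network graph: any neuron with no path to an output is unused. A hypothetical sketch over a generic edge list, not the patented neurosynaptic-core procedure:

```python
# Hypothetical sketch: find nodes with no path to any output and disconnect
# them, analogous to removing unused portions of a neurosynaptic network.

def used_nodes(edges, outputs):
    """edges: (src, dst) pairs; returns the set of nodes that reach an output."""
    reverse = {}
    for src, dst in edges:
        reverse.setdefault(dst, []).append(src)
    seen = set(outputs)
    stack = list(outputs)
    while stack:                       # walk the graph backwards from outputs
        node = stack.pop()
        for src in reverse.get(node, []):
            if src not in seen:
                seen.add(src)
                stack.append(src)
    return seen

def prune(edges, outputs):
    keep = used_nodes(edges, outputs)
    return [(s, d) for s, d in edges if s in keep and d in keep]
```

On neuromorphic hardware, freeing the cores that hosted the disconnected portion is what improves core utilization.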
  • Patent number: 10769428
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a second model to approximate the output of a first model to classify, according to a classification scheme, image data received as input, and after the second model is trained accessing map data that specifies a plurality of geographic locations, and for each geographic location associated with an entity for each image of the one or more images that depict the entity located at the geographic location, providing the image to the second model to generate an embedding for the image, associating each of the one or more embeddings generated by the second model with the geographic location, and storing, in a database, location data specifying the geographic location, the associated one or more embeddings, and data specifying the entity, as an associated entity entry for the entity.
    Type: Grant
    Filed: August 13, 2018
    Date of Patent: September 8, 2020
    Assignee: Google LLC
    Inventors: Abhanshu Sharma, Fedir Zubach, Thomas Binder, Lukas Mach, Sammy El Ghazzal, Matthew Sharifi
  • Patent number: 10762894
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for keyword spotting. One of the methods includes training, by a keyword detection system, a convolutional neural network for keyword detection by providing a two-dimensional set of input values to the convolutional neural network, the input values including a first dimension in time and a second dimension in frequency, and performing convolutional multiplication on the two-dimensional set of input values for a filter using a frequency stride greater than one to generate a feature map.
    Type: Grant
    Filed: July 22, 2015
    Date of Patent: September 1, 2020
    Assignee: GOOGLE LLC
    Inventors: Tara N. Sainath, Maria Carolina Parada San Martin
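The distinctive detail here is a convolution whose stride in the frequency dimension is greater than one, which shrinks the frequency axis of the feature map and cuts computation. A minimal nested-loop sketch over a (time x frequency) input, illustrative rather than the patented kernel:

```python
# Hypothetical sketch: 2-D convolution over a (time x frequency) input with
# stride 1 in time but a frequency stride greater than one, so the output
# feature map is narrower along the frequency axis.

def conv2d_freq_stride(x, kernel, freq_stride=2):
    """x: T x F nested lists, kernel: kt x kf filter -> output feature map."""
    T, F = len(x), len(x[0])
    kt, kf = len(kernel), len(kernel[0])
    out = []
    for t in range(T - kt + 1):                      # time stride of 1
        row = []
        for f in range(0, F - kf + 1, freq_stride):  # frequency stride > 1
            acc = 0.0
            for i in range(kt):
                for j in range(kf):
                    acc += x[t + i][f + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out
```

With `freq_stride=2`, a width-F frequency axis collapses to roughly F/2 outputs per filter, which is the efficiency gain a strided keyword-spotting CNN exploits.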
  • Patent number: 10748524
    Abstract: A speech wakeup method, apparatus, and electronic device are disclosed in embodiments of this specification. The method includes: inputting speech data to a speech wakeup model trained with general speech data; and outputting, by the speech wakeup model, a result for determining whether to execute speech wakeup, wherein the speech wakeup model includes a Deep Neural Network (DNN) and a Connectionist Temporal Classifier (CTC).
    Type: Grant
    Filed: January 28, 2020
    Date of Patent: August 18, 2020
    Assignee: Alibaba Group Holding Limited
    Inventors: Zhiming Wang, Jun Zhou, Xiaolong Li
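A model with a Connectionist Temporal Classification output is decoded by collapsing repeated frame labels and dropping the blank symbol. That standard collapse rule, which such a wakeup model's output would pass through, is tiny:

```python
# Standard CTC collapse rule: merge consecutive repeats, then drop blanks.

def ctc_collapse(frame_labels, blank="-"):
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out
```

A wakeup decision then reduces to checking whether the collapsed label sequence matches the wakeup phrase.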
  • Patent number: 10733504
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a Q network used to select actions to be performed by an agent interacting with an environment. One of the methods includes obtaining a plurality of experience tuples and training the Q network on each of the experience tuples using the Q network and a target Q network that has the same architecture as the Q network but whose current parameter values differ from those of the Q network.
    Type: Grant
    Filed: September 9, 2016
    Date of Patent: August 4, 2020
    Assignee: DeepMind Technologies Limited
    Inventors: Hado Philip van Hasselt, Arthur Clément Guez
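The target-network idea can be sketched in a few lines: bootstrap targets are computed from a frozen copy of the online network's parameters, and that copy is only refreshed periodically, so the targets move slowly. A minimal illustration, not the patented training procedure:

```python
# Minimal sketch of the target-network mechanism: TD targets come from a
# frozen parameter copy that is synced to the online Q network periodically.

def td_target(reward, next_q_values_target, discount=0.99, done=False):
    """One-step bootstrap target computed from the *target* network's values."""
    if done:
        return reward
    return reward + discount * max(next_q_values_target)

def sync_target(q_params, target_params):
    """Periodically copy the online parameters into the target network."""
    target_params.clear()
    target_params.update(q_params)
    return target_params
```

Keeping the target network's parameters out of step with the online network is what stabilizes training; updating the copy every step would collapse the two networks back into one.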