Neural Network Patents (Class 704/202)
-
Patent number: 12094453
Abstract: A computer-implemented method of training a streaming speech recognition model that includes receiving, as input to the streaming speech recognition model, a sequence of acoustic frames. The streaming speech recognition model is configured to learn an alignment probability between the sequence of acoustic frames and an output sequence of vocabulary tokens. The vocabulary tokens include a plurality of label tokens and a blank token. At each output step, the method includes determining a first probability of emitting one of the label tokens and determining a second probability of emitting the blank token. The method also includes generating the alignment probability at a sequence level based on the first probability and the second probability. The method also includes applying a tuning parameter to the alignment probability at the sequence level to maximize the first probability of emitting one of the label tokens.
Type: Grant
Filed: September 9, 2021
Date of Patent: September 17, 2024
Assignee: Google LLC
Inventors: Jiahui Yu, Chung-cheng Chiu, Bo Li, Shuo-yiin Chang, Tara Sainath, Wei Han, Anmol Gulati, Yanzhang He, Arun Narayanan, Yonghui Wu, Ruoming Pang
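As a toy illustration of the label/blank trade-off and the sequence-level tuning parameter described above, here is a minimal numpy sketch; the names (alignment_log_prob, tuning_param) and the single-alignment simplification are illustrative assumptions, not details from the patent.

```python
import numpy as np

def alignment_log_prob(label_logp, blank_logp, tuning_param=0.0):
    """Toy sequence-level alignment score for one monotonic alignment.

    label_logp:   log-probabilities of emitting the chosen label token per output step
    blank_logp:   log-probabilities of emitting the blank token per output step
    tuning_param: emphasizes the label-emission term so training favors emitting
                  label tokens (rather than blank) as early as possible.
    """
    label_term = np.sum(label_logp)   # "first probability" (label tokens)
    blank_term = np.sum(blank_logp)   # "second probability" (blank token)
    return (1.0 + tuning_param) * label_term + blank_term

label_logp = np.log([0.7, 0.6, 0.8])
blank_logp = np.log([0.9, 0.85, 0.95])
print(alignment_log_prob(label_logp, blank_logp, tuning_param=0.1))
```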
-
Patent number: 12073306
Abstract: Systems and methods are disclosed for a centrosymmetric convolutional neural network (CSCNN), an algorithm/hardware co-design framework for CNN compression and acceleration that mitigates the effects of computational irregularity and effectively exploits computational reuse and sparsity for increased performance and energy efficiency.
Type: Grant
Filed: December 15, 2021
Date of Patent: August 27, 2024
Assignee: THE GEORGE WASHINGTON UNIVERSITY
Inventors: Jiajun Li, Ahmed Louri
-
Patent number: 12067989
Abstract: Presented are a combined learning method and device using a transformed loss function and feature enhancement based on a deep neural network for speaker recognition that is robust in a noisy environment. A combined learning method using a transformed loss function and feature enhancement based on a deep neural network, according to one embodiment, can comprise the steps of: learning a feature enhancement model based on a deep neural network; learning a speaker feature vector extraction model based on the deep neural network; connecting an output layer of the feature enhancement model with an input layer of the speaker feature vector extraction model; and treating the connected feature enhancement model and speaker feature vector extraction model as one model and performing combined learning for additional training.
Type: Grant
Filed: March 30, 2020
Date of Patent: August 20, 2024
Assignee: IUCF-HYU (INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY)
Inventors: Joon-Hyuk Chang, Joonyoung Yang
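A minimal PyTorch sketch of the connection and joint-training steps (enhancement output layer feeding the speaker-model input layer, then fine-tuning the chain as one model); the layer sizes, speaker-classification head, and loss are placeholders of my own, not details from the patent.

```python
import torch
import torch.nn as nn

# Stand-ins for the two separately pre-trained models (dimensions are illustrative).
enhancer = nn.Sequential(nn.Linear(40, 40), nn.ReLU(), nn.Linear(40, 40))     # feature enhancement
spk_model = nn.Sequential(nn.Linear(40, 128), nn.ReLU(), nn.Linear(128, 64))  # speaker embedding
classifier = nn.Linear(64, 10)                                                # toy speaker head

# Connect the enhancer's output layer to the speaker model's input layer and
# treat the chain as one model for additional (combined) training.
joint = nn.Sequential(enhancer, spk_model)
params = list(joint.parameters()) + list(classifier.parameters())
optimizer = torch.optim.Adam(params, lr=1e-4)

noisy_feats = torch.randn(8, 40)           # batch of noisy acoustic features
speaker_ids = torch.randint(0, 10, (8,))   # toy speaker labels

loss = nn.functional.cross_entropy(classifier(joint(noisy_feats)), speaker_ids)
loss.backward()
optimizer.step()
```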
-
Patent number: 12056604
Abstract: Layers of a deep neural network (DNN) are partitioned into stages using a profile of the DNN. Each of the stages includes one or more of the layers of the DNN. The partitioning of the layers of the DNN into stages is optimized in various ways including optimizing the partitioning to minimize training time, to minimize data communication between worker computing devices used to train the DNN, or to ensure that the worker computing devices perform an approximately equal amount of the processing for training the DNN. The stages are assigned to the worker computing devices. The worker computing devices process batches of training data using a scheduling policy that causes the workers to alternate between forward processing of the batches of the DNN training data and backward processing of the batches of the DNN training data. The stages can be configured for model parallel processing or data parallel processing.
Type: Grant
Filed: June 29, 2018
Date of Patent: August 6, 2024
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Vivek Seshadri, Amar Phanishayee, Deepak Narayanan, Aaron Harlap, Nikhil Devanur Rangarajan
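One way to read the balanced-partitioning goal is as splitting the profiled per-layer times into contiguous stages that minimize the slowest stage; the dynamic program below is a sketch under that reading, not the optimizer from the patent, and the layer times are made up.

```python
from functools import lru_cache

def partition_stages(layer_times, num_stages):
    """Split consecutive DNN layers into `num_stages` stages, minimizing the slowest stage."""
    n = len(layer_times)
    prefix = [0.0]
    for t in layer_times:
        prefix.append(prefix[-1] + t)

    @lru_cache(maxsize=None)
    def best(i, k):
        """Minimal max-stage-time when layers[i:] are split into k stages."""
        if k == 1:
            return prefix[n] - prefix[i]
        return min(max(prefix[j] - prefix[i], best(j, k - 1))
                   for j in range(i + 1, n - k + 2))

    return best(0, num_stages)

# Profiled per-layer times (seconds per batch); one stage per worker device.
print(partition_stages([2.0, 1.0, 3.0, 2.0, 2.0], num_stages=2))  # -> 6.0
```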
-
Patent number: 12032653
Abstract: An apparatus and method are described for distributed and cooperative computation in artificial neural networks. For example, one embodiment of an apparatus comprises: an input/output (I/O) interface; a plurality of processing units communicatively coupled to the I/O interface to receive data for input neurons and synaptic weights associated with each of the input neurons, each of the plurality of processing units to process at least a portion of the data for the input neurons and synaptic weights to generate partial results; and an interconnect communicatively coupling the plurality of processing units, each of the processing units to share the partial results with one or more other processing units over the interconnect, the other processing units using the partial results to generate additional partial results or final results. The processing units may share data including input neurons and weights over the shared input bus.
Type: Grant
Filed: May 3, 2021
Date of Patent: July 9, 2024
Assignee: Intel Corporation
Inventors: Frederico C. Pratas, Ayose J. Falcon, Marc Lupon, Fernando Latorre, Pedro Lopez, Enric Herrero Abellanas, Georgios Tournavitis
-
Patent number: 11990057
Abstract: Briefly, example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to facilitate and/or support one or more operations and/or techniques for electronic infrastructure for digital content delivery and/or online assessment management, such as implemented, at least in part, via one or more computing and/or communication networks and/or protocols.
Type: Grant
Filed: February 14, 2020
Date of Patent: May 21, 2024
Assignee: ARH TECHNOLOGIES, LLC
Inventors: Alan R. Hollander, Micky McCuen
-
Patent number: 11809992
Abstract: Neural networks with similar architectures may be compressed using shared compression profiles. A request to compress a trained neural network may be received and an architecture of the neural network identified. The identified architecture may be compared with the different network architectures mapped to compression profiles to select a compression profile for the neural network. The compression profile may be applied to remove features of the neural network to generate a compressed version of the neural network.
Type: Grant
Filed: March 31, 2020
Date of Patent: November 7, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Gurumurthy Swaminathan, Ragav Venkatesan, Xiong Zhou, Runfei Luo, Vineet Khare
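A hypothetical sketch of the profile lookup and application: a table maps architecture signatures to per-layer pruning ratios, and applying a profile shrinks the listed layers. The architecture names, layer names, and ratios are invented for illustration.

```python
# Hypothetical store mapping an identified architecture to a shared compression profile
# (here expressed as per-layer fractions of features to remove).
COMPRESSION_PROFILES = {
    "resnet50":     {"layer4": 0.5, "fc": 0.3},
    "mobilenet_v2": {"features.18": 0.25},
}

def select_profile(architecture_name):
    """Compare the identified architecture against the mapped architectures."""
    return COMPRESSION_PROFILES.get(architecture_name)

def apply_profile(layer_widths, profile):
    """Remove the profiled fraction of features from each listed layer."""
    return {name: int(width * (1.0 - profile.get(name, 0.0)))
            for name, width in layer_widths.items()}

profile = select_profile("resnet50")
print(apply_profile({"layer4": 2048, "fc": 1000}, profile))  # {'layer4': 1024, 'fc': 700}
```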
-
Patent number: 11798562
Abstract: A speaker verification method includes receiving audio data corresponding to an utterance and processing the audio data to generate an evaluation attentive d-vector (ad-vector) representing voice characteristics of the utterance, the evaluation ad-vector including n_e style classes, each including a respective value vector concatenated with a corresponding routing vector. The method also includes generating, using a self-attention mechanism, at least one multi-condition attention score that indicates a likelihood that the evaluation ad-vector matches a respective reference ad-vector associated with a respective user. The method also includes identifying the speaker of the utterance as the respective user associated with the respective reference ad-vector based on the multi-condition attention score.
Type: Grant
Filed: May 16, 2021
Date of Patent: October 24, 2023
Assignee: Google LLC
Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Yiling Huang, Mert Saglam
-
Patent number: 11734551
Abstract: A data storage method for speech-related deep neural network (DNN) operations, comprising the following steps: 1. determining the configuration parameters by a user; 2. configuring a peripheral storage access interface; 3. configuring a multi-transmitting interface of the feature storage array; 4. enabling the CPU to store to-be-calculated data in a storage space between the feature storage space start address and the feature storage space end address of the peripheral storage device; 5. after data storage, enabling the CPU to check the state of the peripheral storage access interface and the multi-transmitting interface of the feature storage array; 6. upon receiving a transportation completion signal of the peripheral storage access interface by the CPU, enabling the multi-transmitting interface of the feature storage array; 7. upon receiving a transportation completion signal of the multi-transmitting interface of the feature storage array by the CPU, repeating step 6.
Type: Grant
Filed: December 10, 2021
Date of Patent: August 22, 2023
Assignee: CHIPINTELLI TECHNOLOGY CO., LTD
Inventors: Zhaoqiang Qiu, Lai Zhang, Fujun Wang, Wei Tian, Yingbin Yang, Yangyang Pei
-
Patent number: 11670299
Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
Type: Grant
Filed: May 17, 2021
Date of Patent: June 6, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
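A numpy sketch of the two post-processing paths described above (softmax plus smoothing and spike detection for the wakeword scores; a sigmoid for the acoustic-event scores), with random values standing in for model outputs computed on shared LFBE features. The window size and thresholds are arbitrary.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def smooth(p, win=5):
    """Moving-average smoothing of per-frame posteriors."""
    return np.convolve(p, np.ones(win) / win, mode="same")

def detect_spikes(p, threshold=0.8):
    return np.flatnonzero(p > threshold)

frames = 100
wakeword_logits = np.random.randn(frames, 2)   # [not-wakeword, wakeword] per frame
event_logits = np.random.randn(frames)         # single acoustic-event score per frame

wakeword_post = smooth(softmax(wakeword_logits)[:, 1])     # softmax -> smoothing -> spikes
event_post = 1.0 / (1.0 + np.exp(-event_logits))           # sigmoid before the event classifier

print(detect_spikes(wakeword_post), int((event_post > 0.5).sum()))
```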
-
Patent number: 11636343
Abstract: Training a neural network (NN) may include training a NN N, and for S, a version of N to be sparsified (e.g. a copy of N), removing NN elements from S to create a sparsified version of S, and training S using outputs from N (e.g. “distillation”). A boosting or reintroduction phase may follow sparsification: training a NN may include for a trained NN N and S, a sparsified version of N, re-introducing NN elements previously removed from S, and training S using outputs from N. The boosting phase need not use a NN sparsified by “distillation.” Training and sparsification, or training and reintroduction, may be performed iteratively or over repetitions.
Type: Grant
Filed: September 26, 2019
Date of Patent: April 25, 2023
Assignee: Neuralmagic Inc.
Inventor: Dan Alistarh
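A small PyTorch sketch of the two phases as read from the abstract: magnitude-based removal of elements from S, then reintroduction of a fraction of the removed elements before further training against N's outputs. The sparsity level, the random choice of which elements to bring back, and the tensor shapes are illustrative assumptions.

```python
import torch

weights = torch.randn(64, 64)                 # weights of S, a copy of the trained network N
mask = torch.ones_like(weights, dtype=torch.bool)

# Sparsification phase: remove the smallest-magnitude elements from S.
k = int(0.8 * weights.numel())
threshold = weights.abs().flatten().kthvalue(k).values
mask &= weights.abs() > threshold
sparse_weights = weights * mask               # S would now be trained using N's outputs

# Boosting / reintroduction phase: bring back some previously removed elements
# (chosen at random here) before training S against N's outputs again.
removed = (~mask).nonzero(as_tuple=False)
bring_back = removed[torch.randperm(len(removed))[: len(removed) // 10]]
mask[bring_back[:, 0], bring_back[:, 1]] = True

print(f"active fraction after reintroduction: {mask.float().mean():.2f}")
```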
-
Patent number: 11620493
Abstract: Various embodiments are provided for intelligent selection of time series models by one or more processors in a computing system. Time series data may be received from a user, one or more computing devices, sensors, or a combination thereof. One or more optimal time series models may be selected upon using and/or evaluating one or more recurrent neural network models that are trained or pre-trained using simulated time series data or historical time series data, or a combination thereof, for one or more predictive analytical tasks relating to the received time series data.
Type: Grant
Filed: October 7, 2019
Date of Patent: April 4, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Beat Buesser, Bei Chen, Kelsey Dipietro
-
Patent number: 11526751
Abstract: A device may receive historical data and real-time data associated with a troubleshooting service, identify, using a machine learning model, an optimal resolution based on the historical data and the real-time data, and identify, using a graph analytics model, an optimal path of actions based on the optimal resolution. The machine learning model may be trained to identify one of the set of historical issues associated with the unresolved issue, and identify the optimal resolution based on one of the set of historical resolutions associated with the one of the set of historical issues. The graph analytics model may be trained to generate a set of paths of actions based on the historical data, and identify the optimal path based on respective numbers of actions associated with the set of paths. The device may identify an optimal action based on the optimal path and the prior action.
Type: Grant
Filed: November 25, 2019
Date of Patent: December 13, 2022
Assignee: Verizon Patent and Licensing Inc.
Inventors: Sumit Singh, Balagangadhara Thilak Adiboina, Adithya Umakanth, Ganesh Narasimman, Sambasiva R Bhatta, Anurag Pant
-
Patent number: 11521592
Abstract: WaveFlow is a small-footprint generative flow for raw audio, which may be directly trained with maximum likelihood. WaveFlow handles the long-range structure of waveform with a dilated two-dimensional (2D) convolutional architecture, while modeling the local variations using expressive autoregressive functions. WaveFlow may provide a unified view of likelihood-based models for raw audio, including WaveNet and WaveGlow, which may be considered special cases. It generates high-fidelity speech, while synthesizing several orders of magnitude faster than existing systems since it uses only a few sequential steps to generate relatively long waveforms. WaveFlow significantly reduces the likelihood gap that has existed between autoregressive models and flow-based models for efficient synthesis. Its small footprint with 5.91M parameters makes it 15 times smaller than some existing models. WaveFlow can generate 22.05 kHz high-fidelity audio 42.
Type: Grant
Filed: August 5, 2020
Date of Patent: December 6, 2022
Assignee: Baidu USA LLC
Inventors: Wei Ping, Kainan Peng, Kexin Zhao, Zhao Song
-
Patent number: 11482241
Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.
Type: Grant
Filed: March 27, 2017
Date of Patent: October 25, 2022
Assignee: Nuance Communications, Inc.
Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost
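A sketch of what a stored target acoustic data profile and a selection check might look like; the AcousticProfile fields mirror the characteristics listed above, but the class, the tolerance, and the example values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class AcousticProfile:
    """Target acoustic characteristics stored as a data profile."""
    codec: str
    bit_rate_kbps: int
    sampling_rate_hz: int
    noise_level_db: float

def matches(candidate: AcousticProfile, target: AcousticProfile, noise_tol_db: float = 5.0) -> bool:
    """Decide whether an out-of-domain speech sample resembles the target profile."""
    return (candidate.codec == target.codec
            and candidate.sampling_rate_hz == target.sampling_rate_hz
            and abs(candidate.noise_level_db - target.noise_level_db) <= noise_tol_db)

target = AcousticProfile(codec="AMR-NB", bit_rate_kbps=12, sampling_rate_hz=8000, noise_level_db=-35.0)
sample = AcousticProfile(codec="AMR-NB", bit_rate_kbps=12, sampling_rate_hz=8000, noise_level_db=-32.0)
print(matches(sample, target))   # True
```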
-
Patent number: 11462326
Abstract: A method and system can be used for disease quantification modeling of an anatomical tree structure. The method may include obtaining a centerline of an anatomical tree structure and generating a graph neural network including a plurality of nodes based on a graph. Each node corresponds to a centerline point and edges are defined by the centerline, with an input of each node being a disease related feature or an image patch for the corresponding centerline point and an output of each node being a disease quantification parameter. The method also includes obtaining labeled data of one or more nodes, the number of which is less than a total number of the nodes in the graph neural network. Further, the method includes training the graph neural network by transferring information between the one or more nodes and other nodes based on the labeled data of the one or more nodes.
Type: Grant
Filed: June 19, 2020
Date of Patent: October 4, 2022
Assignee: KEYA MEDICAL TECHNOLOGY CO., LTD.
Inventors: Xin Wang, Youbing Yin, Junjie Bai, Qi Song, Kunlin Cao, Yi Lu, Feng Gao
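A toy PyTorch rendering of the idea: nodes along a centerline form a chain graph, a small message-passing network predicts one quantification value per node, and the loss is taken only on the few labeled nodes so information propagates to the rest. The two-layer architecture, feature sizes, and labels are placeholders.

```python
import torch
import torch.nn as nn

# Toy centerline graph: per-point features plus a chain adjacency defined by the centerline.
num_nodes, feat_dim = 6, 8
x = torch.randn(num_nodes, feat_dim)
adj = torch.eye(num_nodes)
for i in range(num_nodes - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
adj = adj / adj.sum(dim=1, keepdim=True)      # row-normalized propagation matrix

class CenterlineGNN(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.lin1 = nn.Linear(d, 16)
        self.lin2 = nn.Linear(16, 1)

    def forward(self, x, adj):
        h = torch.relu(self.lin1(adj @ x))    # pass information between neighboring nodes
        return self.lin2(adj @ h).squeeze(-1) # one quantification parameter per node

model = CenterlineGNN(feat_dim)
labels = torch.tensor([0.2, float("nan"), float("nan"), 0.8, float("nan"), float("nan")])
labeled = ~torch.isnan(labels)                # only a subset of nodes carries labels

pred = model(x, adj)
loss = nn.functional.mse_loss(pred[labeled], labels[labeled])
loss.backward()
print(float(loss))
```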
-
Patent number: 11462229
Abstract: This disclosure relates generally to a system and method for identifying a plurality of noises, or a combination thereof, suppressing them, and enhancing the deteriorated input signal in a dynamic manner. The system identifies noises in the audio signal and categorizes them based on a trained database of noises. A combination of a deep neural network (DNN) and artificial intelligence (AI) enables the system to self-learn, understanding and capturing noises in the environment and retraining the model to reduce noises on the next attempt. The system suppresses unwanted noise coming from the external environment with the help of AI-based algorithms, by understanding, differentiating, and enhancing human voice in a live environment. The system helps reduce unwanted noises and enhances the experience of business and public meetings, video conferences, musical events, speech broadcasts, etc., which could otherwise cause distractions and disturbances and create barriers in the conversation.
Type: Grant
Filed: March 6, 2020
Date of Patent: October 4, 2022
Assignee: TATA CONSULTANCY SERVICES LIMITED
Inventors: Robin Tommy, Reshmi Ravindranathan, Navin Infant Raj, Venkatakrishna Akula, Jithin Laiju Ravi, Anita Nanadikar, Anil Kumar Sharma, Pranav Champaklal Shah, Bhasha Prasad Khose
-
Patent number: 11410674
Abstract: The present application relates to a method and device for recognizing the state of a human body meridian by utilizing a voice recognition technology, the method comprising: receiving an input voice of a user; preprocessing the input voice; extracting a stable feature of the preprocessed input voice; primarily classifying the stable feature on the basis of a feature recognition model, and determining a basic classification pitch, wherein the basic classification pitch comprises Gong, Shang, Jue, Zhi and Yu (respectively equivalent to do, re, mi, sol and la); secondarily classifying the stable feature on the basis of the feature recognition model, and determining a secondary classification tone in the basic classification pitch; and recognizing the state of a meridian according to the secondary classification tone.
Type: Grant
Filed: October 23, 2019
Date of Patent: August 9, 2022
Inventor: Zhonghua Ci
-
Patent number: 11380114
Abstract: A method of detecting a target includes generating an image pyramid based on an image on which a detection is to be performed; classifying candidate areas in the image pyramid using a cascade neural network; and determining a target area corresponding to a target included in the image based on the plurality of candidate areas, wherein the cascade neural network includes a plurality of neural networks, and at least one neural network among the neural networks includes parallel sub-neural networks.
Type: Grant
Filed: April 15, 2020
Date of Patent: July 5, 2022
Assignee: Samsung Electronics Co., Ltd.
Inventors: Biao Wang, Chao Zhang, Changkyu Choi, Deheng Qian, Jae-Joon Han, Jingtao Xu, Hao Feng
-
Patent number: 11354542
Abstract: A mechanism is described for facilitating on-the-fly deep learning in machine learning for autonomous machines. A method of embodiments, as described herein, includes detecting an output associated with a first deep network serving as a user-independent model associated with learning of one or more neural networks at a computing device having a processor coupled to memory. The method may further include automatically generating training data for a second deep network serving as a user-dependent model, where the training data is generated based on the output. The method may further include merging the user-independent model with the user-dependent model into a single joint model.
Type: Grant
Filed: February 6, 2020
Date of Patent: June 7, 2022
Assignee: Intel Corporation
Inventor: Raanan Yonatan Yehezkel Rohekar
-
Patent number: 11348572
Abstract: A speech recognition method includes obtaining an acoustic sequence divided into a plurality of frames, and determining pronunciations in the acoustic sequence by predicting a duration of a same pronunciation in the acoustic sequence and skipping a pronunciation prediction for a frame corresponding to the duration.
Type: Grant
Filed: July 18, 2018
Date of Patent: May 31, 2022
Assignees: Samsung Electronics Co., Ltd., UNIVERSITE DE MONTREAL
Inventors: Inchul Song, Junyoung Chung, Taesup Kim, Sanghyun Yoo
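A compact sketch of the skip logic: predict a pronunciation and its duration at a frame, then jump past the frames covered by that duration instead of predicting on them. The toy predictor and its fixed three-frame duration are stand-ins for the trained model.

```python
def decode_with_skips(frames, predict):
    """`predict(frames, t)` returns (pronunciation, duration_in_frames) at frame index t."""
    results, t = [], 0
    while t < len(frames):
        pron, duration = predict(frames, t)
        results.append(pron)
        t += max(duration, 1)   # skip the prediction for frames covered by the same pronunciation
    return results

# Toy predictor: every pronunciation is claimed to span three frames.
toy_predict = lambda frames, t: (f"p{t}", 3)
print(decode_with_skips(list(range(10)), toy_predict))   # ['p0', 'p3', 'p6', 'p9']
```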
-
Patent number: 11334766
Abstract: Systems and methods are provided for training object detectors of a neural network model with a mixture of label noise and bounding box noise. According to some embodiments, a learning framework is provided which jointly optimizes object labels, bounding box coordinates, and model parameters by performing alternating noise correction and model training. In some embodiments, to disentangle label noise and bounding box noise, a two-step noise correction method is employed. In some examples, the first step performs class-agnostic bounding box correction by minimizing classifier discrepancy and maximizing region objectness. In some examples, the second step uses dual detection heads for label correction and class-specific bounding box refinement.
Type: Grant
Filed: January 31, 2020
Date of Patent: May 17, 2022
Assignee: salesforce.com, inc.
Inventors: Junnan Li, Chu Hong Hoi
-
Patent number: 11321866
Abstract: A method of controlling audio collection for an image capturing device can include receiving image data from an image capturing device; recognizing one or more objects from the image data; determining a first object having a possibility of generating audio among the one or more objects; and collecting audio from the first object by moving a microphone beamforming direction of the image capturing device to be directed toward the first object in response to a determination that the first object is an object having a possibility of generating audio.
Type: Grant
Filed: April 7, 2020
Date of Patent: May 3, 2022
Assignee: LG ELECTRONICS INC.
Inventors: Taehyun Kim, Ji Chan Maeng
-
Patent number: 11295751
Abstract: An apparatus and a method include receiving an input audio signal to be processed by a multi-band synchronized neural vocoder. The input audio signal is separated into a plurality of frequency bands. A plurality of audio signals corresponding to the plurality of frequency bands is obtained. Each of the audio signals is downsampled, and processed by the multi-band synchronized neural vocoder. An audio output signal is generated.
Type: Grant
Filed: September 20, 2019
Date of Patent: April 5, 2022
Assignee: TENCENT AMERICA LLC
Inventors: Chengzhu Yu, Meng Yu, Heng Lu, Dong Yu
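A numpy/scipy sketch of the band-splitting and downsampling front end (the neural vocoder itself is omitted): the signal is split into equal-width bands with Butterworth filters and each band is naively decimated by the band count. The filter order and the equal-width split are assumptions, not details from the patent.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def split_and_downsample(signal, sr, num_bands=4):
    """Separate a signal into equal-width frequency bands, then downsample each band."""
    band_width = sr / 2 / num_bands
    bands = []
    for b in range(num_bands):
        lo, hi = b * band_width, (b + 1) * band_width
        if b == 0:
            sos = butter(4, hi, btype="low", fs=sr, output="sos")
        elif b == num_bands - 1:
            sos = butter(4, lo, btype="high", fs=sr, output="sos")
        else:
            sos = butter(4, [lo, hi], btype="band", fs=sr, output="sos")
        filtered = sosfilt(sos, signal)
        bands.append(filtered[::num_bands])   # naive decimation by the number of bands
    return bands

audio = np.random.randn(16000)                # one second at 16 kHz
bands = split_and_downsample(audio, sr=16000)
print([band.shape for band in bands])         # each band is num_bands times shorter
```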
-
Patent number: 11182564
Abstract: Embodiments of this application provide a text recommendation method performed at an electronic device. The method includes: extracting feature content from a target text; processing the feature content by using at least two text analysis models to obtain at least two semantic vectors; integrating the at least two semantic vectors into an integrated semantic vector of the target text; and selecting, according to the integrated semantic vector and an integrated semantic vector of at least one to-be-recommended text, a recommended text corresponding to the target text from the at least one to-be-recommended text. Because the integrated semantic vector of the target text is obtained based on the at least two text analysis models, the integrated semantic vector has a stronger representing capability. When text recommendation is subsequently performed, the association degree between the recommended text and the target text can be increased, thereby improving recommendation accuracy.
Type: Grant
Filed: April 14, 2020
Date of Patent: November 23, 2021
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Bingfeng Li, Xin Fan, Xiaoqiang Feng, Biao Li
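A numpy sketch of the integration and selection steps: each model's semantic vector is normalized, the vectors are concatenated into one integrated vector, and the candidate with the highest cosine similarity to the target is recommended. The vector sizes and the concatenation-based integration are assumptions for illustration.

```python
import numpy as np

def integrate(vectors):
    """Integrate semantic vectors from several text-analysis models into one vector."""
    return np.concatenate([v / (np.linalg.norm(v) + 1e-9) for v in vectors])

def recommend(target_vectors, candidate_vectors):
    """Return (similarity, name) of the candidate text closest to the target text."""
    t = integrate(target_vectors)
    scored = []
    for name, vecs in candidate_vectors.items():
        c = integrate(vecs)
        scored.append((float(t @ c / (np.linalg.norm(t) * np.linalg.norm(c))), name))
    return max(scored)

target = [np.random.randn(64), np.random.randn(32)]     # e.g. a topic model and an embedding model
pool = {
    "doc_a": [np.random.randn(64), np.random.randn(32)],
    "doc_b": [np.random.randn(64), np.random.randn(32)],
}
print(recommend(target, pool))
```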
-
Patent number: 11176416
Abstract: A perception device includes: a first neural network that performs a common process associated with perception of an object and thus outputs results of the common process; a second neural network that receives an output of the first neural network and outputs results of a first perception process of perceiving the characteristics of the object with a first accuracy; and a third neural network that receives the output of the first neural network and intermediate data which is generated by the second neural network in the course of the first perception process and outputs results of a second perception process of perceiving the characteristics of the object with a second accuracy which is higher than the first accuracy.
Type: Grant
Filed: April 23, 2018
Date of Patent: November 16, 2021
Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
Inventor: Daisuke Hashimoto
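A PyTorch sketch of the three-network arrangement: a shared backbone, a coarse head, and an accurate head that consumes both the backbone output and the coarse head's intermediate data. All layer sizes and the concatenation are illustrative choices.

```python
import torch
import torch.nn as nn

class PerceptionDevice(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(128, 64), nn.ReLU())       # common process
        self.coarse_head = nn.Sequential(nn.Linear(64, 32), nn.ReLU())     # first perception process
        self.coarse_out = nn.Linear(32, 10)
        # The accurate head receives the backbone output plus the coarse head's intermediate data.
        self.accurate_head = nn.Sequential(nn.Linear(64 + 32, 64), nn.ReLU(), nn.Linear(64, 10))

    def forward(self, x):
        shared = self.backbone(x)
        intermediate = self.coarse_head(shared)
        first = self.coarse_out(intermediate)                               # lower accuracy, cheaper
        second = self.accurate_head(torch.cat([shared, intermediate], -1))  # higher accuracy
        return first, second

first, second = PerceptionDevice()(torch.randn(4, 128))
print(first.shape, second.shape)
```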
-
Patent number: 11137323
Abstract: A method and system for detecting anomalies in waveforms in an industrial plant. During a learning stage, one or more training waveforms are received from sensors monitoring a plurality of equipment in the industrial plant. The one or more training waveforms are used to generate a representative waveform and deviations of the one or more training waveforms from the representative waveform are determined. Based on the deviations, groups are created. A model may be associated with each group for building an expected waveform pattern. When test waveforms are received, based on the electrical and physical properties of the test waveforms, each test waveform is classified into one of the groups. Thereafter, each waveform is compared with the expected waveform pattern associated with the group to which the respective test waveform belongs, to detect the anomaly.
Type: Grant
Filed: November 12, 2018
Date of Patent: October 5, 2021
Assignees: KABUSHIKI KAISHA TOSHIBA, Toshiba Memory Corporation
Inventors: Sai Prem Kumar Ayyagari, Arun Kumar Kalakanti, Topon Paul, Shigeru Maya, Takeichiro Nishikawa
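A numpy sketch of the pipeline: build a representative waveform from the training waveforms, group them by deviation from it, keep a per-group expected pattern, and flag a test waveform whose distance to its group's pattern exceeds a threshold. Grouping by a single deviation quantile and the threshold value are simplifications of my own.

```python
import numpy as np

train = np.random.randn(20, 100)                   # 20 training waveforms, 100 samples each
representative = train.mean(axis=0)                # representative waveform
deviation = np.linalg.norm(train - representative, axis=1)

edges = np.quantile(deviation, [0.5])              # two groups split at the median deviation
groups = np.digitize(deviation, edges)
expected = {int(g): train[groups == g].mean(axis=0) for g in np.unique(groups)}

def is_anomaly(test_waveform, threshold=15.0):
    """Classify the test waveform into a group, then compare against that group's pattern."""
    g = int(np.digitize(np.linalg.norm(test_waveform - representative), edges))
    return float(np.linalg.norm(test_waveform - expected[g])) > threshold

print(is_anomaly(np.random.randn(100)), is_anomaly(3.0 * np.random.randn(100)))
```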
-
Patent number: 11132990
Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
Type: Grant
Filed: June 26, 2019
Date of Patent: September 28, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
-
Patent number: 11120304
Abstract: A mechanism is described for facilitating the transfer of features learned by a context independent pre-trained deep neural network to a context dependent neural network. The mechanism includes extracting a feature learned by a first deep neural network (DNN) model via the framework, wherein the first DNN model is a pre-trained DNN model for computer vision to enable context-independent classification of an object within an input video frame, and training, via the deep learning framework, a second DNN model for computer vision based on the extracted feature, the second DNN model being an update of the first DNN model, wherein training the second DNN model includes training the second DNN model based on a dataset including context-dependent data.
Type: Grant
Filed: July 15, 2020
Date of Patent: September 14, 2021
Assignee: Intel Corporation
Inventor: Raanan Yonatan Yehezkel Rohekar
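A brief PyTorch sketch of the transfer step: freeze the context-independent pre-trained DNN, extract its learned features, and train a second model on context-dependent data on top of them. The frozen-extractor/linear-head setup and the dimensions are assumptions for illustration.

```python
import torch
import torch.nn as nn

# First DNN: pre-trained, context-independent feature extractor (frozen here).
pretrained = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 128))
for p in pretrained.parameters():
    p.requires_grad = False

# Second DNN: trained on context-dependent data, starting from the extracted features.
context_head = nn.Linear(128, 5)
optimizer = torch.optim.SGD(context_head.parameters(), lr=1e-2)

frames = torch.randn(16, 512)              # context-dependent training batch
labels = torch.randint(0, 5, (16,))

features = pretrained(frames)              # features transferred from the first DNN
loss = nn.functional.cross_entropy(context_head(features), labels)
loss.backward()
optimizer.step()
print(float(loss))
```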
-
Patent number: 11120789
Abstract: The invention discloses a training method and a speech recognition method for a mixed frequency acoustic recognition model, which belongs to the technical field of speech recognition.
Type: Grant
Filed: January 26, 2018
Date of Patent: September 14, 2021
Assignee: YUTOU TECHNOLOGY (HANGZHOU) CO., LTD.
Inventor: Lichun Fan
-
Patent number: 11043218
Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
Type: Grant
Filed: June 26, 2019
Date of Patent: June 22, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
-
Patent number: 11030998
Abstract: An acoustic model training method, a speech recognition method, an apparatus, a device and a medium. The acoustic model training method comprises: performing feature extraction on a training speech signal to obtain an audio feature sequence; training the audio feature sequence by a phoneme mixed Gaussian Model-Hidden Markov Model to obtain a phoneme feature sequence; and training the phoneme feature sequence by a Deep Neural Net-Hidden Markov Model-sequence training model to obtain a target acoustic model. The acoustic model training method can effectively save time required for an acoustic model training, improve the training efficiency, and ensure the recognition efficiency.
Type: Grant
Filed: August 31, 2017
Date of Patent: June 8, 2021
Assignee: PING AN TECHNOLOGY (SHENZHEN) CO., LTD.
Inventors: Hao Liang, Jianzong Wang, Ning Cheng, Jing Xiao
-
Patent number: 11017788
Abstract: A method of building a new voice having a new timbre using a timbre vector space includes receiving timbre data filtered using a temporal receptive field. The timbre data is mapped in the timbre vector space. The timbre data is related to a plurality of different voices. Each of the plurality of different voices has respective timbre data in the timbre vector space. The method builds the new timbre using the timbre data of the plurality of different voices using a machine learning system.
Type: Grant
Filed: April 13, 2020
Date of Patent: May 25, 2021
Assignee: Modulate, Inc.
Inventors: William Carter Huffman, Michael Pappas
-
Patent number: 10991384
Abstract: A method for automatic affective state inference from speech signals and an automated affective state inference system are disclosed. In an embodiment the method includes capturing speech signals of a target speaker; extracting one or more acoustic voice parameters from the captured speech signals; calibrating voice markers on the basis of the one or more acoustic voice parameters extracted from the speech signals of the target speaker, one or more speaker-inherent reference parameters of the target speaker, and one or more inter-speaker reference parameters of a sample of reference speakers; applying at least one set of prediction rules based on appraisal criteria to the calibrated voice markers to infer two or more appraisal criteria scores relating to appraisal of affect-eliciting events with which the target speaker is confronted; and assigning one or more affective state terms to the two or more appraisal criteria scores.
Type: Grant
Filed: April 20, 2018
Date of Patent: April 27, 2021
Assignee: audEERING GMBH
Inventors: Florian Eyben, Klaus R. Scherer, Björn W. Schuller
-
Patent number: 10909996
Abstract: An autocorrelation calculation unit 21 calculates an autocorrelation R_O(i) from an input signal. A prediction coefficient calculation unit 23 performs linear prediction analysis by using a modified autocorrelation R'_O(i) obtained by multiplying a coefficient w_O(i) by the autocorrelation R_O(i). It is assumed here, for at least some orders i, that the coefficient w_O(i) corresponding to the order i is in a monotonically increasing relationship with an increase in a value that is negatively correlated with a fundamental frequency of the input signal of the current frame or a past frame.
Type: Grant
Filed: July 16, 2014
Date of Patent: February 2, 2021
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Yutaka Kamamoto, Takehiro Moriya, Noboru Harada
-
Patent number: 10867618
Abstract: Embodiments of the present disclosure provide a speech noise reduction method and a speech noise reduction device based on artificial intelligence, and a computer device. The method includes the following. A first noisy speech to be processed is received. The first noisy speech to be processed is pre-processed, to obtain the first noisy speech in a preset format. The first noisy speech in the preset format is sampled according to a sampling rate indicated by the preset format, to obtain first sampling point information of the first noisy speech. A noise reduction is performed on the first sampling point information through a deep-learning noise reduction model, to generate noise-reduced first sampling point information. A first clean speech is generated according to the noise-reduced first sampling point information.
Type: Grant
Filed: December 28, 2017
Date of Patent: December 15, 2020
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Wei Zou, Xiangang Li, Weiwei Cui, Jingyuan Hu
-
Patent number: 10861466
Abstract: Disclosed are a packet loss concealment method and apparatus using a generative adversarial network. A method for packet loss concealment in voice communication may include training a classification model based on a generative adversarial network (GAN) with respect to a voice signal including a plurality of frames, training a generative model having a contention relation with the classification model based on the GAN, estimating lost packet information based on the trained generative model with respect to the voice signal encoded by a codec, and restoring a lost packet based on the estimated packet information.
Type: Grant
Filed: August 9, 2018
Date of Patent: December 8, 2020
Assignee: INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY
Inventors: Joon-Hyuk Chang, Bong-Ki Lee
-
Patent number: 10825470
Abstract: The present disclosure provides a method and apparatus for detecting a starting point and a finishing point of a speech, a computer device and a storage medium, wherein the method comprises: obtaining speech data to be detected; segmenting the speech data into speech segments, the number of speech segments being greater than one; respectively determining speech states of respective speech segments based on a Voice Activity Detection model obtained by pre-training; and determining a starting point and a finishing point of the speech data according to the speech states. The solution of the present disclosure can be employed to improve the accuracy of the detection results.
Type: Grant
Filed: December 12, 2018
Date of Patent: November 3, 2020
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Chao Li, Weixin Zhu
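A plain-Python sketch of the segment-then-endpoint logic: classify fixed-length segments with a VAD model, then take the first and last speech segments as the starting and finishing points. The energy-threshold stand-in for the trained VAD model and the frame length are placeholders.

```python
def find_speech_endpoints(samples, frame_len, vad_model):
    """Segment the audio, classify each segment, and return (start, end) sample indices."""
    states = [vad_model(samples[i:i + frame_len])
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    speech = [i for i, is_speech in enumerate(states) if is_speech]
    if not speech:
        return None
    return speech[0] * frame_len, (speech[-1] + 1) * frame_len

# Toy stand-in for the Voice Activity Detection model: an energy threshold.
toy_vad = lambda segment: sum(x * x for x in segment) / len(segment) > 0.1

audio = [0.0] * 400 + [0.5] * 800 + [0.0] * 400
print(find_speech_endpoints(audio, frame_len=160, vad_model=toy_vad))   # (320, 1280)
```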
-
Patent number: 10796715
Abstract: Systems and methods use patient speech samples as inputs, use subjective multi-point ratings by speech-language pathologists of multiple perceptual dimensions of patient speech samples as further inputs, and extract laboratory-implemented features from the patient speech samples. A predictive software model learns the relationship between speech acoustics and the subjective ratings of such speech obtained from speech-language pathologists, and is configured to apply this information to evaluate new speech samples. Outputs may include objective evaluation of the plurality of perceptual dimensions for new speech samples and/or evaluation of disease onset, disease progression, or disease treatment efficacy for a condition involving dysarthria as a symptom, utilizing the new speech samples.
Type: Grant
Filed: September 1, 2017
Date of Patent: October 6, 2020
Assignee: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
Inventors: Visar Berisha, Ming Tu, Alan Wisler, Julie Liss
-
Patent number: 10714118
Abstract: In one embodiment, a method includes accessing a voice signal from a first user; compressing the voice signal using a compression portion of an artificial neural network trained to compress the first user's voice; and sending the compressed voice signal to a second client computing device.
Type: Grant
Filed: December 30, 2016
Date of Patent: July 14, 2020
Assignee: Facebook, Inc.
Inventor: Pasha Sadri
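A toy PyTorch autoencoder making the split explicit: only the compression portion (the encoder) runs on the sender's device, and the compact codes are what would be transmitted to the second device. The frame size and code size are invented for illustration.

```python
import torch
import torch.nn as nn

class VoiceAutoencoder(nn.Module):
    def __init__(self, frame_size=256, code_size=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(frame_size, 64), nn.ReLU(), nn.Linear(64, code_size))
        self.decoder = nn.Sequential(nn.Linear(code_size, 64), nn.ReLU(), nn.Linear(64, frame_size))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = VoiceAutoencoder()
voice_frames = torch.randn(10, 256)        # frames of the first user's voice signal

# Sender side: run only the compression portion and send the codes.
codes = model.encoder(voice_frames)
print(tuple(voice_frames.shape), "->", tuple(codes.shape))   # (10, 256) -> (10, 32)
```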
-
Patent number: 10692513
Abstract: The invention provides an audio encoder including a combination of a linear predictive coding filter having a plurality of linear predictive coding coefficients and a time-frequency converter, wherein the combination is configured to filter and to convert a frame of the audio signal into a frequency domain in order to output a spectrum based on the frame and on the linear predictive coding coefficients; a low frequency emphasizer configured to calculate a processed spectrum based on the spectrum, wherein spectral lines of the processed spectrum representing a lower frequency than a reference spectral line are emphasized; and a control device configured to control the calculation of the processed spectrum by the low frequency emphasizer depending on the linear predictive coding coefficients of the linear predictive coding filter.
Type: Grant
Filed: April 18, 2018
Date of Patent: June 23, 2020
Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
Inventors: Stefan Doehla, Bernhard Grill, Christian Helmrich, Nikolaus Rettelbach
-
Patent number: 10657955
Abstract: Described herein are systems and methods to identify and address sources of bias in an end-to-end speech model. In one or more embodiments, the end-to-end model may be a recurrent neural network with two 2D-convolutional input layers, followed by multiple bidirectional recurrent layers and one fully connected layer before a softmax layer. In one or more embodiments, the network is trained end-to-end using the CTC loss function to directly predict sequences of characters from log spectrograms of audio. With optimized recurrent layers and training together with alignment information, some unwanted bias induced by using purely forward only recurrences may be removed in a deployed model.
Type: Grant
Filed: January 30, 2018
Date of Patent: May 19, 2020
Assignee: Baidu USA LLC
Inventors: Eric Battenberg, Rewon Child, Adam Coates, Christopher Fougner, Yashesh Gaur, Jiaji Huang, Heewoo Jun, Ajay Kannan, Markus Kliegl, Atul Kumar, Hairong Liu, Vinay Rao, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu
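A compact PyTorch rendering of the described architecture (two 2D-convolutional input layers, bidirectional recurrent layers, one fully connected layer producing per-frame character log-probabilities for a CTC loss). Channel counts, hidden sizes, and the 29-character vocabulary are assumptions, and the recurrent-layer optimizations and alignment-based training discussed in the patent are not reproduced.

```python
import torch
import torch.nn as nn

class SpeechModel(nn.Module):
    def __init__(self, n_mels=80, vocab_size=29):
        super().__init__()
        self.conv = nn.Sequential(                        # two 2D-convolutional input layers
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.rnn = nn.GRU(32 * n_mels, 256, num_layers=2,
                          bidirectional=True, batch_first=True)   # bidirectional recurrent layers
        self.fc = nn.Linear(2 * 256, vocab_size)          # fully connected layer before the softmax

    def forward(self, log_spec):                          # (batch, time, n_mels) log spectrograms
        x = self.conv(log_spec.unsqueeze(1))              # (batch, 32, time, n_mels)
        x = x.permute(0, 2, 1, 3).flatten(2)              # (batch, time, 32 * n_mels)
        x, _ = self.rnn(x)
        return self.fc(x).log_softmax(dim=-1)             # feed these to nn.CTCLoss during training

out = SpeechModel()(torch.randn(2, 120, 80))
print(out.shape)                                          # torch.Size([2, 120, 29])
```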
-
Patent number: 10490198
Abstract: A sensor device may include a computing device in communication with multiple microphones. A neural network executing on the computing device may receive audio signals from each microphone. One microphone signal may serve as a reference signal. The neural network may extract differences in signal characteristics of the other microphone signals as compared to the reference signal. The neural network may combine these signal differences into a lossy compressed signal. The sensor device may transmit the lossy compressed signal and the lossless reference signal to a remote neural network executing in a cloud computing environment for decompression and sound recognition analysis.
Type: Grant
Filed: December 18, 2017
Date of Patent: November 26, 2019
Assignee: GOOGLE LLC
Inventors: Chanwoo Kim, Rajeev Conrad Nongpiur, Tara Sainath
-
Patent number: 10453074
Abstract: A user may respond to a request of another user by entering text, such as a customer service representative responding to a customer. Suggestions of resources may be provided to the responding user to assist the responding user in providing a response. For example, a resource may provide information to the responding user or allow the responding user to perform an action. Previous messages between the two users and other information may be used to select a resource. A conversation feature vector may be determined from previous messages, and feature vectors may be determined from the resources. The conversation feature vector and the feature vectors determined from the resource may be used to select a resource to suggest to the responding user.
Type: Grant
Filed: September 1, 2016
Date of Patent: October 22, 2019
Assignee: ASAPP, INC.
Inventors: Gustavo Sapoznik, Shawn Henry
-
Patent number: 10381017
Abstract: The present disclosure provides a method and a device for eliminating background sound, and a terminal device. The method includes: obtaining an initial audio data set; performing background sound fusion processing on the initial audio data set to obtain training sample data; performing neural network training based on the training sample data and the initial audio data set to generate an initial neural network model for eliminating background sound; and performing background sound elimination on audio data to be processed based on the initial neural network model for eliminating background sound.
Type: Grant
Filed: July 25, 2018
Date of Patent: August 13, 2019
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Xuewei Zhang, Xiangang Li
-
Patent number: 10346859
Abstract: A method is disclosed for enabling a network location to provide an ordering process for data relevant to connected network devices' activities. The method includes assembling the data, utilizing the activity data, and associating the data, such that information is derived to enable a desired expansion of at least one designated activity. Another method is disclosed for managing object assignment broadcast operations for a network location based on a network device's previous activities. This second method includes tracing a network device's conduct to determine that a network device prefers a particular class of content. The method also includes tagging a network device's profile with the respective observation and deciding by a network location as to the classification factor for a network device to be targeted for an object assignment broadcast.
Type: Grant
Filed: January 2, 2018
Date of Patent: July 9, 2019
Assignee: LIVEPERSON, INC.
Inventors: Haggai Shachar, Shahar Nechmad
-
Patent number: 10269355
Abstract: According to an embodiment, a data processing device generates result data which represents a result of performing predetermined processing on series data. The device includes an upper-level processor and a lower-level processor. The upper-level processor attaches order information to data blocks constituting the series data. The lower-level processor performs lower-level processing on the data blocks having the order information attached thereto, and attaches common order information, which is in common with the data blocks, to values obtained as a result of the lower-level processing. The upper-level processor integrates the values, which have the common order information attached thereto, based on the common order information and performs upper-level processing to generate the result data.
Type: Grant
Filed: March 15, 2016
Date of Patent: April 23, 2019
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventors: Shoko Miyamori, Takashi Masuko, Mitsuyoshi Tachimori, Kouji Ueno, Manabu Nagao
-
Patent number: 10237649
Abstract: Methods and apparatus relating to microphone devices and signal processing techniques are provided. In an example, a microphone device can detect sound, as well as enhance an ability to perceive at least a general direction from which the sound arrives at the microphone device. In an example, a case of the microphone device has an external surface which at least partially defines funnel-shaped surfaces. Each funnel-shaped surface is configured to direct the sound to a respective microphone diaphragm to produce an auralized multi-microphone output. The funnel-shaped surfaces are configured to cause direction-dependent variations in spectral notches and frequency response of the sound as received by the microphone diaphragms. A neural network can device-shape the auralized multi-microphone output to create a binaural output. The binaural output can be auralized with respect to a human listener.
Type: Grant
Filed: December 18, 2017
Date of Patent: March 19, 2019
Assignee: Google LLC
Inventor: Rajeev Conrad Nongpiur
-
Patent number: 10217365
Abstract: The present disclosure relates to a method for determining whether an object is within a target area, a parking management device, a parking management system and an electronic device. The method comprises the following steps: acquiring an intensity value of a wireless signal that the object receives from a signal transmitting apparatus which is provided on the site of the target area; and determining whether the object is within the target area based on the intensity value of the wireless signal. The method for determining whether the object is within the target area can be applied to guide a user to park a vehicle within a parking area.
Type: Grant
Filed: October 27, 2017
Date of Patent: February 26, 2019
Assignee: BEIJING MOBIKE TECHNOLOGY CO., LTD.
Inventors: Chaochao Chen, Zirong Guo, Yujie Yang
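At its simplest, the intensity-based decision reduces to a threshold on the received signal strength; a minimal sketch with an invented threshold value:

```python
def is_within_target_area(rssi_dbm: float, threshold_dbm: float = -65.0) -> bool:
    """Treat a wireless signal stronger than the threshold as 'inside the target area'."""
    return rssi_dbm >= threshold_dbm

print(is_within_target_area(-58.0), is_within_target_area(-80.0))   # True False
```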
-
Patent number: 10194203
Abstract: A multimodal and real-time method for filtering sensitive content, receiving as input a digital video stream, the method including: segmenting the digital video into video fragments along the video timeline; extracting features containing significant information from the digital video input on sensitive media; reducing the semantic difference between each of the low-level video features and the high-level sensitive concept; classifying the video fragments, generating a high-level label (positive or negative) with a confidence score for each fragment representation; performing high-level fusion to properly match the possible high-level labels and confidence scores for each fragment; and predicting the sensitive time by combining the labels of the fragments along the video timeline, indicating the moments when the content becomes sensitive.
Type: Grant
Filed: June 30, 2016
Date of Patent: January 29, 2019
Assignees: SAMSUNG ELETRÔNICA DA AMAZÔNIA LTDA., UNIVERSIDADE ESTADUAL DE CAMPINAS
Inventors: Sandra Avila, Daniel Moreira, Mauricio Perez, Daniel Moraes, Vanessa Testoni, Siome Goldenstein, Eduardo Valle, Anderson Rocha