Neural Network Patents (Class 704/202)
  • Patent number: 11809992
    Abstract: Neural networks with similar architectures may be compressed using shared compression profiles. A request to compress a trained neural network may be received and an architecture of the neural network identified. The identified architecture may be compared with the different network architectures mapped to compression profiles to select a compression profile for the neural network. The compression profile may be applied to remove features of the neural network to generate a compressed version of the neural network.
    Type: Grant
    Filed: March 31, 2020
    Date of Patent: November 7, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Gurumurthy Swaminathan, Ragav Venkatesan, Xiong Zhou, Runfei Luo, Vineet Khare
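    The profile lookup described in patent 11809992 above can be illustrated with a short sketch. This is not Amazon's implementation: the architecture signature, the profile table, and the magnitude-pruning step used to "remove features" are all assumptions chosen for illustration.
    ```python
    # Illustrative sketch (not the patented implementation): map a network's
    # architecture signature to a stored compression profile, then apply it
    # by magnitude-pruning each layer at the profile's sparsity level.
    import numpy as np

    # Hypothetical profile table: architecture signature -> per-layer sparsity.
    COMPRESSION_PROFILES = {
        ("dense-512", "dense-256", "dense-10"): [0.5, 0.7, 0.0],
        ("conv-64", "conv-128", "dense-10"):    [0.3, 0.6, 0.0],
    }

    def architecture_signature(layers):
        """Reduce a list of (kind, width, weights) layer descriptors to a key."""
        return tuple(f"{kind}-{width}" for kind, width, _ in layers)

    def select_profile(layers):
        """Look up the compression profile mapped to this architecture."""
        return COMPRESSION_PROFILES.get(architecture_signature(layers))

    def apply_profile(layers, profile):
        """Remove features by zeroing the smallest-magnitude weights per layer."""
        compressed = []
        for (kind, width, weights), sparsity in zip(layers, profile):
            w = np.array(weights, dtype=float)
            if sparsity > 0:
                cutoff = np.quantile(np.abs(w), sparsity)
                w[np.abs(w) < cutoff] = 0.0
            compressed.append((kind, width, w))
        return compressed

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        net = [("dense", 512, rng.normal(size=(16, 512))),
               ("dense", 256, rng.normal(size=(512, 256))),
               ("dense", 10, rng.normal(size=(256, 10)))]
        profile = select_profile(net)
        for (_, width, w) in apply_profile(net, profile):
            print(width, "zero fraction:", round(float((w == 0).mean()), 2))
    ```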
  • Patent number: 11798562
    Abstract: A speaker verification method includes receiving audio data corresponding to an utterance and processing the audio data to generate an evaluation attentive d-vector (ad-vector) representing voice characteristics of the utterance, where the evaluation ad-vector includes n_e style classes, each including a respective value vector concatenated with a corresponding routing vector. The method also includes generating, using a self-attention mechanism, at least one multi-condition attention score that indicates a likelihood that the evaluation ad-vector matches a respective reference ad-vector associated with a respective user. The method also includes identifying the speaker of the utterance as the respective user associated with the respective reference ad-vector based on the multi-condition attention score.
    Type: Grant
    Filed: May 16, 2021
    Date of Patent: October 24, 2023
    Assignee: Google LLC
    Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Yiling Huang, Mert Saglam
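    A rough sketch of the scoring step in patent 11798562 above, under stated assumptions: the per-style vectors are random arrays, and the multi-condition attention score is a soft-attention-weighted cosine similarity rather than Google's trained scoring function.
    ```python
    # Illustrative sketch only: score an evaluation "ad-vector" (a set of
    # per-style value vectors) against enrolled reference ad-vectors with a
    # soft-attention weighted cosine similarity, then pick the best speaker.
    # The shapes and the scoring rule are assumptions, not the patented method.
    import numpy as np

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    def attention_score(evaluation, reference):
        """evaluation/reference: arrays of shape (n_styles, dim)."""
        sims = np.array([[cosine(e, r) for r in reference] for e in evaluation])
        weights = np.exp(sims) / np.exp(sims).sum()          # soft attention
        return float((weights * sims).sum())                 # weighted similarity

    def identify_speaker(evaluation, enrolled):
        """enrolled: dict mapping user id -> reference ad-vector."""
        scores = {user: attention_score(evaluation, ref)
                  for user, ref in enrolled.items()}
        return max(scores, key=scores.get), scores

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        alice = rng.normal(size=(4, 8))
        bob = rng.normal(size=(4, 8))
        test = alice + 0.1 * rng.normal(size=(4, 8))         # utterance close to alice
        user, scores = identify_speaker(test, {"alice": alice, "bob": bob})
        print(user, {k: round(v, 3) for k, v in scores.items()})
    ```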
  • Patent number: 11734551
    Abstract: A data storage method for speech-related deep neural network (DNN) operations, characterized by comprising the following steps: (1) determining the configuration parameters by a user; (2) configuring a peripheral storage access interface; (3) configuring a multi-transmitting interface of a feature storage array; (4) enabling the CPU to store to-be-calculated data in a storage space between the feature storage space start address and the feature storage space end address of the peripheral storage device; (5) after data storage, enabling the CPU to check the state of the peripheral storage access interface and the multi-transmitting interface of the feature storage array; (6) upon receiving a transportation completion signal of the peripheral storage access interface by the CPU, enabling the multi-transmitting interface of the feature storage array; and (7) upon receiving a transportation completion signal of the multi-transmitting interface of the feature storage array by the CPU, repeating step 6.
    Type: Grant
    Filed: December 10, 2021
    Date of Patent: August 22, 2023
    Assignee: CHIPINTELLI TECHNOLOGY CO., LTD
    Inventors: Zhaoqiang Qiu, Lai Zhang, Fujun Wang, Wei Tian, Yingbin Yang, Yangyang Pei
  • Patent number: 11670299
    Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
    Type: Grant
    Filed: May 17, 2021
    Date of Patent: June 6, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
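    A minimal sketch of the two post-processing paths in patent 11670299 above (also patents 11132990 and 11043218 below), assuming stand-in linear models on shared per-frame features; the smoothing window and thresholds are illustrative, not values from the patent.
    ```python
    # Rough sketch of the two post-processing paths applied to shared per-frame
    # acoustic features: softmax + smoothing + spike test for the wakeword head,
    # sigmoid + a simple classifier for the acoustic-event head. The models are
    # stand-in linear layers and all thresholds are assumptions.
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def wakeword_spikes(frame_logits, window=5, threshold=0.8):
        """Softmax over (non-wake, wake), moving-average smoothing, spike test."""
        post = softmax(frame_logits, axis=1)[:, 1]            # P(wakeword) per frame
        kernel = np.ones(window) / window
        smoothed = np.convolve(post, kernel, mode="same")     # smooth the posterior
        return np.flatnonzero(smoothed > threshold)           # frames with a spike

    def event_decision(frame_logits, threshold=0.5):
        """Sigmoid per frame, then a simple any-frame classifier."""
        probs = sigmoid(frame_logits)
        return bool((probs > threshold).any()), probs

    if __name__ == "__main__":
        rng = np.random.default_rng(2)
        lfbe = rng.normal(size=(100, 64))                     # 100 frames of features
        w_wake = rng.normal(size=(64, 2))
        w_event = rng.normal(size=64)
        spikes = wakeword_spikes(lfbe @ w_wake)
        detected, _ = event_decision(lfbe @ w_event)
        print("wakeword frames:", spikes[:5], "event detected:", detected)
    ```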
  • Patent number: 11636343
    Abstract: Training a neural network (NN) may include training a NN N, and for S, a version of N to be sparsified (e.g. a copy of N), removing NN elements from S to create a sparsified version of S, and training S using outputs from N (e.g. “distillation”). A boosting or reintroduction phase may follow sparsification: training a NN may include for a trained NN N and S, a sparsified version of N, re-introducing NN elements previously removed from S, and training S using outputs from N. The boosting phase need not use a NN sparsified by “distillation.” Training and sparsification, or training and reintroduction, may be performed iteratively or over repetitions.
    Type: Grant
    Filed: September 26, 2019
    Date of Patent: April 25, 2023
    Assignee: Neuralmagic Inc.
    Inventor: Dan Alistarh
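    The sparsify-then-reintroduce cycle of patent 11636343 above can be sketched as follows; the distillation loss against the dense teacher N is omitted, and the pruning criterion and reintroduction fraction are assumptions.
    ```python
    # Minimal sketch of the sparsify-then-reintroduce idea (the "distillation"
    # training of S against the dense teacher N is not shown).
    import numpy as np

    def sparsify(weights, sparsity):
        """Remove (zero) the smallest-magnitude weights; return mask of removed."""
        cutoff = np.quantile(np.abs(weights), sparsity)
        removed = np.abs(weights) < cutoff
        sparse = np.where(removed, 0.0, weights)
        return sparse, removed

    def reintroduce(sparse, removed, teacher_weights, fraction=0.2):
        """Boosting phase: bring back a fraction of the previously removed
        elements, initialized from the dense teacher's values."""
        idx = np.flatnonzero(removed.ravel())
        k = int(fraction * idx.size)
        chosen = np.random.default_rng(0).choice(idx, size=k, replace=False)
        boosted = sparse.copy().ravel()
        boosted[chosen] = teacher_weights.ravel()[chosen]
        return boosted.reshape(sparse.shape)

    if __name__ == "__main__":
        teacher = np.random.default_rng(3).normal(size=(8, 8))      # trained N
        student, removed = sparsify(teacher.copy(), sparsity=0.75)  # sparsified S
        boosted = reintroduce(student, removed, teacher)
        print("zeros after pruning:", int((student == 0).sum()),
              "after reintroduction:", int((boosted == 0).sum()))
    ```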
  • Patent number: 11620493
    Abstract: Various embodiments are provided for intelligent selection of time series models by one or more processors in a computing system. Time series data may be received from a user, one or more computing devices, sensors, or a combination thereof. One or more optimal time series models may be selected by using and/or evaluating one or more recurrent neural network models that are trained or pre-trained using simulated time series data, historical time series data, or a combination thereof for one or more predictive analytical tasks relating to the received time series data.
    Type: Grant
    Filed: October 7, 2019
    Date of Patent: April 4, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Beat Buesser, Bei Chen, Kelsey Dipietro
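    A toy sketch of the selection idea in patent 11620493 above: candidate forecasters are scored on a hold-out split of the received series and the best one is returned. The simple baseline candidates and the mean-squared-error rule stand in for the patent's trained recurrent selector network.
    ```python
    # Illustrative sketch: pick an "optimal" forecaster for a received series by
    # scoring a few candidate models on a hold-out split. The candidates and the
    # scoring rule are assumptions standing in for the trained RNN selector.
    import numpy as np

    def naive_last(train, horizon):                # repeat last observed value
        return np.repeat(train[-1], horizon)

    def seasonal_naive(train, horizon, period=12):  # repeat last season
        return np.tile(train[-period:], horizon // period + 1)[:horizon]

    def drift(train, horizon):                      # extrapolate the average slope
        slope = (train[-1] - train[0]) / (len(train) - 1)
        return train[-1] + slope * np.arange(1, horizon + 1)

    def select_model(series, horizon=12):
        train, valid = series[:-horizon], series[-horizon:]
        candidates = {"naive": naive_last, "seasonal": seasonal_naive, "drift": drift}
        errors = {name: float(np.mean((f(train, horizon) - valid) ** 2))
                  for name, f in candidates.items()}
        best = min(errors, key=errors.get)
        return best, errors

    if __name__ == "__main__":
        t = np.arange(120)
        series = 10 + 0.1 * t + 5 * np.sin(2 * np.pi * t / 12)  # trend + seasonality
        print(select_model(series))
    ```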
  • Patent number: 11526751
    Abstract: A device may receive historical data and real-time data associated with a troubleshooting service, identify, using a machine learning model, an optimal resolution based on the historical data and the real-time data, and identify, using a graph analytics model, an optimal path of actions based on the optimal resolution. The machine learning model may be trained to identify one of the set of historical issues associated with the unresolved issue, and identify the optimal resolution based on one of the set of historical resolutions associated with the one of the set of historical issues. The graph analytics model may be trained to generate a set of paths of actions based on the historical data, and identify the optimal path based on respective numbers of actions associated with the set of paths. The device may identify an optimal action based on the optimal path and the prior action.
    Type: Grant
    Filed: November 25, 2019
    Date of Patent: December 13, 2022
    Assignee: Verizon Patent and Licensing Inc.
    Inventors: Sumit Singh, Balagangadhara Thilak Adiboina, Adithya Umakanth, Ganesh Narasimman, Sambasiva R Bhatta, Anurag Pant
  • Patent number: 11521592
    Abstract: WaveFlow is a small-footprint generative flow for raw audio, which may be directly trained with maximum likelihood. WaveFlow handles the long-range structure of waveform with a dilated two-dimensional (2D) convolutional architecture, while modeling the local variations using expressive autoregressive functions. WaveFlow may provide a unified view of likelihood-based models for raw audio, including WaveNet and WaveGlow, which may be considered special cases. It generates high-fidelity speech, while synthesizing several orders of magnitude faster than existing systems since it uses only a few sequential steps to generate relatively long waveforms. WaveFlow significantly reduces the likelihood gap that has existed between autoregressive models and flow-based models for efficient synthesis. Its small footprint with 5.91M parameters makes it 15 times smaller than some existing models. WaveFlow can generate 22.05 kHz high-fidelity audio 42.
    Type: Grant
    Filed: August 5, 2020
    Date of Patent: December 6, 2022
    Assignee: Baidu USA LLC
    Inventors: Wei Ping, Kainan Peng, Kexin Zhao, Zhao Song
  • Patent number: 11482241
    Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.
    Type: Grant
    Filed: March 27, 2017
    Date of Patent: October 25, 2022
    Assignee: Nuance Communications, Inc.
    Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost
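    A small sketch of building a target acoustic data profile as in patent 11482241 above, assuming the sampling frequency is known; only the noise level and active bandwidth are estimated here, and CODEC/bit-rate detection is not shown.
    ```python
    # Sketch of building a target acoustic data profile from a speech sample:
    # noise level is taken from the quietest frames and active bandwidth from
    # the cumulative spectrum. Frame size and percentiles are assumptions.
    import numpy as np

    def acoustic_profile(samples, sample_rate, frame=400):
        frames = samples[: len(samples) // frame * frame].reshape(-1, frame)
        rms = np.sqrt((frames ** 2).mean(axis=1) + 1e-12)
        noise_level_db = 20 * np.log10(np.percentile(rms, 10))  # quietest 10% ~ floor

        spectrum = np.abs(np.fft.rfft(samples)) ** 2
        freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
        cumulative = np.cumsum(spectrum) / spectrum.sum()
        active_bandwidth_hz = float(freqs[np.searchsorted(cumulative, 0.99)])  # 99% energy

        return {"sampling_frequency_hz": sample_rate,
                "noise_level_db": round(float(noise_level_db), 1),
                "active_bandwidth_hz": round(active_bandwidth_hz, 1)}

    if __name__ == "__main__":
        sr = 16000
        t = np.arange(sr) / sr
        speech_like = (np.sin(2 * np.pi * 300 * t)
                       + 0.01 * np.random.default_rng(4).normal(size=sr))
        print(acoustic_profile(speech_like, sr))
    ```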
  • Patent number: 11462229
    Abstract: This disclosure relates generally to a system and method for identifying a plurality of noises, or a combination thereof, suppressing them, and enhancing the deteriorated input signal in a dynamic manner. The system identifies noises in the audio signal and categorizes them based on a trained database of noises. A combination of a deep neural network (DNN) and artificial intelligence (AI) enables the system to self-learn, understanding and capturing noises in the environment and retraining the model to reduce those noises on subsequent attempts. The system suppresses unwanted noise coming from the external environment with the help of AI-based algorithms by understanding, differentiating, and enhancing the human voice in a live environment. It helps reduce unwanted noises that could cause distractions, disturbances, and barriers in conversation, improving the experience of business and public meetings, video conferences, musical events, speech broadcasts, and the like.
    Type: Grant
    Filed: March 6, 2020
    Date of Patent: October 4, 2022
    Assignee: TATA CONSULTANCY SERVICES LIMITED
    Inventors: Robin Tommy, Reshmi Ravindranathan, Navin Infant Raj, Venkatakrishna Akula, Jithin Laiju Ravi, Anita Nanadikar, Anil Kumar Sharma, Pranav Champaklal Shah, Bhasha Prasad Khose
  • Patent number: 11462326
    Abstract: A method and system can be used for disease quantification modeling of an anatomical tree structure. The method may include obtaining a centerline of an anatomical tree structure and generating a graph neural network including a plurality of nodes based on a graph. Each node corresponds to a centerline point and edges are defined by the centerline, with an input of each node being a disease related feature or an image patch for the corresponding centerline point and an output of each node being a disease quantification parameter. The method also includes obtaining labeled data of one or more nodes, the number of which is less than a total number of the nodes in the graph neural network. Further, the method includes training the graph neural network by transferring information between the one or more nodes and other nodes based on the labeled data of the one or more nodes.
    Type: Grant
    Filed: June 19, 2020
    Date of Patent: October 4, 2022
    Assignee: KEYA MEDICAL TECHNOLOGY CO., LTD.
    Inventors: Xin Wang, Youbing Yin, Junjie Bai, Qi Song, Kunlin Cao, Yi Lu, Feng Gao
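    A sketch of spreading sparse labels along a centerline graph as in patent 11462326 above; simple neighbor averaging on a chain graph stands in for training the graph neural network with partially labeled nodes.
    ```python
    # Sketch only: a centerline is treated as a chain graph, a quantification
    # value is known at a few nodes, and iterative neighbor averaging (label
    # propagation) spreads information to the unlabeled nodes. This stands in
    # for the patent's trained graph neural network.
    import numpy as np

    def propagate(n_nodes, edges, labeled, iterations=200):
        values = np.zeros(n_nodes)
        known = np.zeros(n_nodes, dtype=bool)
        for node, value in labeled.items():
            values[node], known[node] = value, True
        neighbors = {i: [] for i in range(n_nodes)}
        for a, b in edges:
            neighbors[a].append(b)
            neighbors[b].append(a)
        for _ in range(iterations):
            new = values.copy()
            for i in range(n_nodes):
                if not known[i]:                      # keep labeled nodes fixed
                    new[i] = np.mean([values[j] for j in neighbors[i]])
            values = new
        return values

    if __name__ == "__main__":
        n = 10                                        # 10 centerline points in a chain
        edges = [(i, i + 1) for i in range(n - 1)]
        labeled = {0: 0.0, 9: 1.0}                    # quantification known at two nodes
        print(np.round(propagate(n, edges, labeled), 2))
    ```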
  • Patent number: 11410674
    Abstract: The present application relates to a method and device for recognizing the state of a human body meridian by utilizing a voice recognition technology, the method comprising: receiving an input voice of a user; preprocessing the input voice; extracting a stable feature of the preprocessed input voice; primarily classifying the stable feature on the basis of a feature recognition model, and determining a basic classification pitch, wherein the basic classification pitch comprises Gong, Shang, Jue, Zhi and Yu (respectively equivalent to do, re, mi, sol and la); secondarily classifying the stable feature on the basis of the feature recognition model, and determining a secondary classification tone in the basic classification pitch; and recognizing the state of a meridian according to the secondary classification tone.
    Type: Grant
    Filed: October 23, 2019
    Date of Patent: August 9, 2022
    Inventor: Zhonghua Ci
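    A sketch of the primary classification step of patent 11410674 above, assuming an autocorrelation F0 estimator and a reference key of C; the mapping of pitch class to Gong/Shang/Jue/Zhi/Yu is illustrative only.
    ```python
    # Illustrative sketch: estimate the fundamental frequency of a voice segment
    # and map its pitch class to the nearest of the five tones Gong/Shang/Jue/
    # Zhi/Yu (do/re/mi/sol/la). The F0 estimator and reference key are assumptions.
    import numpy as np

    TONES = {"Gong (do)": 0, "Shang (re)": 2, "Jue (mi)": 4, "Zhi (sol)": 7, "Yu (la)": 9}

    def estimate_f0(signal, sample_rate, fmin=80, fmax=400):
        corr = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
        lo, hi = int(sample_rate / fmax), int(sample_rate / fmin)
        lag = lo + int(np.argmax(corr[lo:hi]))
        return sample_rate / lag

    def classify_tone(f0, reference_c=261.63):
        semitone = 12 * np.log2(f0 / reference_c)               # distance from C4
        pitch_class = semitone % 12
        return min(TONES, key=lambda t: min(abs(pitch_class - TONES[t]),
                                            12 - abs(pitch_class - TONES[t])))

    if __name__ == "__main__":
        sr, n = 16000, 2048
        t = np.arange(n) / sr
        voice = np.sin(2 * np.pi * 196.0 * t)                   # ~G3, a "sol"
        print(classify_tone(estimate_f0(voice, sr)))
    ```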
  • Patent number: 11380114
    Abstract: A method of detecting a target includes generating an image pyramid based on an image on which a detection is to be performed; classifying candidate areas in the image pyramid using a cascade neural network; and determining a target area corresponding to a target included in the image based on the plurality of candidate areas, wherein the cascade neural network includes a plurality of neural networks, and at least one neural network among the neural networks includes parallel sub-neural networks.
    Type: Grant
    Filed: April 15, 2020
    Date of Patent: July 5, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Biao Wang, Chao Zhang, Changkyu Choi, Deheng Qian, Jae-Joon Han, Jingtao Xu, Hao Feng
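    A minimal sketch of the image-pyramid step of patent 11380114 above; the cascade of sub-neural networks that classifies candidate areas is not shown, and the scale factor and stopping size are assumptions.
    ```python
    # Sketch of the first step: build an image pyramid by repeatedly downscaling
    # the input image with a box filter. The cascade network is not shown.
    import numpy as np

    def downscale(image, factor=2):
        """Box-filter downscaling of a grayscale image by an integer factor."""
        h = (image.shape[0] // factor) * factor
        w = (image.shape[1] // factor) * factor
        trimmed = image[:h, :w]
        return trimmed.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

    def image_pyramid(image, min_size=32, factor=2):
        levels = [image]
        while min(levels[-1].shape) // factor >= min_size:
            levels.append(downscale(levels[-1], factor))
        return levels

    if __name__ == "__main__":
        img = np.random.default_rng(5).random((480, 640))
        print([level.shape for level in image_pyramid(img)])
    ```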
  • Patent number: 11354542
    Abstract: A mechanism is described for facilitating on-the-fly deep learning in machine learning for autonomous machines. A method of embodiments, as described herein, includes detecting an output associated with a first deep network serving as a user-independent model associated with learning of one or more neural networks at a computing device having a processor coupled to memory. The method may further include automatically generating training data for a second deep network serving as a user-dependent model, where the training data is generated based on the output. The method may further include merging the user-independent model with the user-dependent model into a single joint model.
    Type: Grant
    Filed: February 6, 2020
    Date of Patent: June 7, 2022
    Assignee: Intel Corporation
    Inventor: Raanan Yonatan Yehezkel Rohekar
  • Patent number: 11348572
    Abstract: A speech recognition method includes obtaining an acoustic sequence divided into a plurality of frames, and determining pronunciations in the acoustic sequence by predicting a duration of a same pronunciation in the acoustic sequence and skipping a pronunciation prediction for a frame corresponding to the duration.
    Type: Grant
    Filed: July 18, 2018
    Date of Patent: May 31, 2022
    Assignees: Samsung Electronics Co., Ltd., UNIVERSITE DE MONTREAL
    Inventors: Inchul Song, Junyoung Chung, Taesup Kim, Sanghyun Yoo
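    The frame-skipping idea of patent 11348572 above, sketched with a stand-in predictor; only the control flow (predict a duration, skip the covered frames) reflects the abstract.
    ```python
    # Sketch of the skipping idea: at each frame the model predicts both the
    # pronunciation and how many upcoming frames share it, and the decoder
    # advances past those frames without running the predictor on them.
    # The stand-in predictor here is random; the structure is what matters.
    import numpy as np

    def predict(frame, rng):
        """Stand-in for the acoustic model: returns (pronunciation, duration)."""
        pronunciation = "aeiou"[int(abs(frame.sum())) % 5]
        duration = int(rng.integers(1, 4))          # frames sharing this pronunciation
        return pronunciation, duration

    def decode(frames, rng):
        outputs, evaluated, i = [], 0, 0
        while i < len(frames):
            pron, dur = predict(frames[i], rng)
            evaluated += 1
            outputs.append(pron)
            i += dur                                 # skip frames covered by the duration
        return outputs, evaluated

    if __name__ == "__main__":
        rng = np.random.default_rng(6)
        frames = rng.normal(size=(50, 40))           # 50 acoustic frames
        prons, evaluated = decode(frames, rng)
        print("frames:", len(frames), "model calls:", evaluated, "output:", "".join(prons))
    ```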
  • Patent number: 11334766
    Abstract: Systems and methods are provided for training object detectors of a neural network model with a mixture of label noise and bounding box noise. According to some embodiments, a learning framework is provided which jointly optimizes object labels, bounding box coordinates, and model parameters by performing alternating noise correction and model training. In some embodiments, to disentangle label noise and bounding box noise, a two-step noise correction method is employed. In some examples, the first step performs class-agnostic bounding box correction by minimizing classifier discrepancy and maximizing region objectness. In some examples, the second step uses dual detection heads for label correction and class-specific bounding box refinement.
    Type: Grant
    Filed: January 31, 2020
    Date of Patent: May 17, 2022
    Assignee: salesforce.com, inc.
    Inventors: Junnan Li, Chu Hong Hoi
  • Patent number: 11321866
    Abstract: A method of controlling audio collection for an image capturing device can include receiving image data from an image capturing device; recognizing one or more objects from the image data; determining a first object having a possibility of generating audio among the one or more objects; and collecting audio from the first object by moving a microphone beamforming direction of the image capturing device to be directed toward the first object in response to a determination that the first object is an object having a possibility of generating audio.
    Type: Grant
    Filed: April 7, 2020
    Date of Patent: May 3, 2022
    Assignee: LG ELECTRONICS INC.
    Inventors: Taehyun Kim, Ji Chan Maeng
  • Patent number: 11295751
    Abstract: An apparatus and a method include receiving an input audio signal to be processed by a multi-band synchronized neural vocoder. The input audio signal is separated into a plurality of frequency bands. A plurality of audio signals corresponding to the plurality of frequency bands is obtained. Each of the audio signals is downsampled, and processed by the multi-band synchronized neural vocoder. An audio output signal is generated.
    Type: Grant
    Filed: September 20, 2019
    Date of Patent: April 5, 2022
    Assignee: TENCENT AMERICA LLC
    Inventors: Chengzhu Yu, Meng Yu, Heng Lu, Dong Yu
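    A numpy-only sketch of the band-splitting front end of patent 11295751 above: a crude rectangular split in the frequency domain, with each band inverse-transformed at a quarter of the original rate; the synchronized per-band neural vocoder is not shown.
    ```python
    # Minimal sketch: the waveform's spectrum is split into contiguous frequency
    # bands, and each band is shifted to baseband and inverse-transformed at
    # 1/n_bands of the original rate. This rectangular split stands in for a
    # proper analysis filter bank.
    import numpy as np

    def split_into_bands(signal, n_bands=4):
        n = len(signal) - (len(signal) % (2 * n_bands))     # make length divisible
        spectrum = np.fft.rfft(signal[:n])
        per_band = n // (2 * n_bands)                        # bins per band
        bands = []
        for k in range(n_bands):
            band_bins = spectrum[k * per_band:(k + 1) * per_band + 1]
            bands.append(np.fft.irfft(band_bins, n=n // n_bands))  # downsampled band
        return bands

    if __name__ == "__main__":
        sr = 16000
        t = np.arange(sr) / sr
        x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)
        bands = split_into_bands(x, n_bands=4)
        print([len(b) for b in bands])                       # each band at sr / 4
        print([round(float(np.sqrt((b ** 2).mean())), 3) for b in bands])  # energy per band
    ```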
  • Patent number: 11182564
    Abstract: Embodiments of this application provide a text recommendation method performed at an electronic device. The method includes: extracting feature content from a target text; processing the feature content by using at least two text analysis models to obtain at least two semantic vectors; integrating the at least two semantic vectors into an integrated semantic vector of the target text; and selecting, according to the integrated semantic vector and an integrated semantic vector of at least one to-be-recommended text, a recommended text corresponding to the target text from the at least one to-be-recommended text. Because the integrated semantic vector of the target text is obtained based on the at least two text analysis models, the integrated semantic vector has a stronger representing capability. When text recommendation is subsequently performed, the association degree between the recommended text and the target text can be increased, thereby improving recommendation accuracy.
    Type: Grant
    Filed: April 14, 2020
    Date of Patent: November 23, 2021
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Bingfeng Li, Xin Fan, Xiaoqiang Feng, Biao Li
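    A sketch of the integration-and-ranking flow of patent 11182564 above, with two toy hashing embeddings standing in for the text analysis models and concatenation as the integration step.
    ```python
    # Sketch of the recommendation flow: two stand-in "text analysis models"
    # each map a text to a vector, the vectors are concatenated into an
    # integrated semantic vector, and candidates are ranked by cosine
    # similarity to the target. The toy embeddings are assumptions.
    import numpy as np

    def bag_of_chars(text, dim=64):
        v = np.zeros(dim)
        for ch in text.lower():
            v[hash(ch) % dim] += 1
        return v

    def bag_of_words(text, dim=64):
        v = np.zeros(dim)
        for word in text.lower().split():
            v[hash(word) % dim] += 1
        return v

    def integrated_vector(text):
        parts = [bag_of_chars(text), bag_of_words(text)]     # at least two models
        v = np.concatenate(parts)
        return v / (np.linalg.norm(v) + 1e-9)

    def recommend(target, candidates):
        tv = integrated_vector(target)
        scores = {c: float(tv @ integrated_vector(c)) for c in candidates}
        return max(scores, key=scores.get), scores

    if __name__ == "__main__":
        target = "neural network compression for speech models"
        candidates = ["pruning deep neural networks", "gardening tips for spring",
                      "speech model compression with profiles"]
        best, _ = recommend(target, candidates)
        print(best)
    ```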
  • Patent number: 11176416
    Abstract: A perception device includes: a first neural network that performs a common process associated with perception of an object and thus outputs results of the common process; a second neural network that receives an output of the first neural network and outputs results of a first perception process of perceiving the characteristics of the object with a first accuracy; and a third neural network that receives the output of the first neural network and intermediate data which is generated by the second neural network in the course of the first perception process and outputs results of a second perception process of perceiving the characteristics of the object with a second accuracy which is higher than the first accuracy.
    Type: Grant
    Filed: April 23, 2018
    Date of Patent: November 16, 2021
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventor: Daisuke Hashimoto
  • Patent number: 11137323
    Abstract: A method and system for detecting anomalies in waveforms in an industrial plant. During a learning stage, one or more training waveforms are received from sensors monitoring a plurality of equipment in the industrial plant. The one or more training waveforms are used to generate a representative waveform and deviations of the one or more training waveforms from the representative waveform are determined. Based on the deviations, groups are created. A model may be associated with each group for building an expected waveform pattern. When test waveforms are received, based on the electrical and physical properties of the test waveforms, each test waveform is classified into one of the groups. Thereafter, each waveform is compared with the expected waveform pattern associated with the group to which the respective test waveform belongs, to detect the anomaly.
    Type: Grant
    Filed: November 12, 2018
    Date of Patent: October 5, 2021
    Assignees: KABUSHIKI KAISHA TOSHIBA, Toshiba Memory Corporation
    Inventors: Sai Prem Kumar Ayyagari, Arun Kumar Kalakanti, Topon Paul, Shigeru Maya, Takeichiro Nishikawa
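    A sketch of the anomaly-detection flow of patent 11137323 above; the representative waveform is a per-sample mean, the two groups are split at the median deviation, and the detection threshold is an assumption.
    ```python
    # Sketch: build a representative waveform from training waveforms, group them
    # by deviation from it, keep a per-group expected pattern, and flag a test
    # waveform whose distance to its group's pattern exceeds a threshold.
    import numpy as np

    def learn(training):
        representative = training.mean(axis=0)
        deviations = np.linalg.norm(training - representative, axis=1)
        cut = np.median(deviations)
        groups = {"low": training[deviations <= cut].mean(axis=0),
                  "high": training[deviations > cut].mean(axis=0)}
        return representative, cut, groups

    def is_anomaly(test, representative, cut, groups, threshold=1.0):
        group = "low" if np.linalg.norm(test - representative) <= cut else "high"
        expected = groups[group]
        distance = np.linalg.norm(test - expected) / np.sqrt(len(test))
        return distance > threshold, group, round(float(distance), 3)

    if __name__ == "__main__":
        rng = np.random.default_rng(7)
        t = np.linspace(0, 1, 200)
        training = np.array([np.sin(2 * np.pi * 5 * t) + 0.1 * rng.normal(size=t.size)
                             for _ in range(50)])
        rep, cut, groups = learn(training)
        normal = np.sin(2 * np.pi * 5 * t)
        broken = np.sin(2 * np.pi * 5 * t) + 5.0 * (t > 0.5)   # step fault halfway through
        print(is_anomaly(normal, rep, cut, groups))
        print(is_anomaly(broken, rep, cut, groups))
    ```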
  • Patent number: 11132990
    Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
    Type: Grant
    Filed: June 26, 2019
    Date of Patent: September 28, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
  • Patent number: 11120789
    Abstract: The invention discloses a training method and a speech recognition method for a mixed frequency acoustic recognition model, which belongs to the technical field of speech recognition.
    Type: Grant
    Filed: January 26, 2018
    Date of Patent: September 14, 2021
    Assignee: YUTOU TECHNOLOGY (HANGZHOU) CO., LTD.
    Inventor: Lichun Fan
  • Patent number: 11120304
    Abstract: A mechanism is described for facilitating the transfer of features learned by a context independent pre-trained deep neural network to a context dependent neural network. The mechanism includes extracting a feature learned by a first deep neural network (DNN) model via the framework, wherein the first DNN model is a pre-trained DNN model for computer vision to enable context-independent classification of an object within an input video frame and training, via the deep learning framework, a second DNN model for computer vision based on the extracted feature, the second DNN model an update of the first DNN model, wherein training the second DNN model includes training the second DNN model based on a dataset including context-dependent data.
    Type: Grant
    Filed: July 15, 2020
    Date of Patent: September 14, 2021
    Assignee: Intel Corporation
    Inventor: Raanan Yonatan Yehezkel Rohekar
  • Patent number: 11043218
    Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
    Type: Grant
    Filed: June 26, 2019
    Date of Patent: June 22, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
  • Patent number: 11030998
    Abstract: Disclosed are an acoustic model training method, a speech recognition method, an apparatus, a device, and a medium. The acoustic model training method comprises: performing feature extraction on a training speech signal to obtain an audio feature sequence; training the audio feature sequence by a phoneme mixed Gaussian Model-Hidden Markov Model to obtain a phoneme feature sequence; and training the phoneme feature sequence by a Deep Neural Net-Hidden Markov Model-sequence training model to obtain a target acoustic model. The acoustic model training method can effectively save the time required for acoustic model training, improve training efficiency, and ensure recognition efficiency.
    Type: Grant
    Filed: August 31, 2017
    Date of Patent: June 8, 2021
    Assignee: PING AN TECHNOLOGY (SHENZHEN) CO., LTD.
    Inventors: Hao Liang, Jianzong Wang, Ning Cheng, Jing Xiao
  • Patent number: 11017788
    Abstract: A method of building a new voice having a new timbre using a timbre vector space includes receiving timbre data filtered using a temporal receptive field. The timbre data is mapped in the timbre vector space. The timbre data is related to a plurality of different voices. Each of the plurality of different voices has respective timbre data in the timbre vector space. The method builds the new timbre using the timbre data of the plurality of different voices using a machine learning system.
    Type: Grant
    Filed: April 13, 2020
    Date of Patent: May 25, 2021
    Assignee: Modulate, Inc.
    Inventors: William Carter Huffman, Michael Pappas
  • Patent number: 10991384
    Abstract: A method for automatic affective state inference from speech signals and an automated affective state inference system are disclosed. In an embodiment, the method includes capturing speech signals of a target speaker, extracting one or more acoustic voice parameters from the captured speech signals, calibrating voice markers on the basis of the one or more acoustic voice parameters that have been extracted from the speech signals of the target speaker, one or more speaker-inherent reference parameters of the target speaker, and one or more inter-speaker reference parameters of a sample of reference speakers, applying at least one set of prediction rules that are based on appraisal criteria to the calibrated voice markers for inferring two or more appraisal criteria scores relating to appraisal of affect-eliciting events with which the target speaker is confronted, and assigning one or more affective state terms to the two or more appraisal criteria scores.
    Type: Grant
    Filed: April 20, 2018
    Date of Patent: April 27, 2021
    Assignee: audEERING GMBH
    Inventors: Florian Eyben, Klaus R. Scherer, Björn W. Schuller
  • Patent number: 10909996
    Abstract: An autocorrelation calculation unit 21 calculates an autocorrelation R_O(i) from an input signal. A prediction coefficient calculation unit 23 performs linear prediction analysis by using a modified autocorrelation R'_O(i) obtained by multiplying a coefficient w_O(i) by the autocorrelation R_O(i). It is assumed here, for each order i of some orders i at least, that the coefficient w_O(i) corresponding to the order i is in a monotonically increasing relationship with an increase in a value that is negatively correlated with a fundamental frequency of the input signal of the current frame or a past frame.
    Type: Grant
    Filed: July 16, 2014
    Date of Patent: February 2, 2021
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Yutaka Kamamoto, Takehiro Moriya, Noboru Harada
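    A sketch of the lag-weighted linear prediction in patent 10909996 above: compute R_O(i), multiply by w_O(i) to get the modified autocorrelation, and run Levinson-Durbin. The Gaussian lag-window formula (wider when the fundamental frequency is lower) is an assumed stand-in for the patented weighting rule.
    ```python
    # Sketch: autocorrelation R(i), per-order weights w(i) controlled by a value
    # negatively correlated with f0 (the period in samples), then Levinson-Durbin
    # on the modified autocorrelation R'(i) = w(i) * R(i).
    import numpy as np

    def autocorrelation(x, order):
        return np.array([np.dot(x[: len(x) - i], x[i:]) for i in range(order + 1)])

    def lag_weights(order, f0_hz, sample_rate):
        # Assumed Gaussian lag window: falls off more slowly when f0 is low.
        period = sample_rate / f0_hz
        return np.exp(-0.5 * (np.arange(order + 1) / period) ** 2)

    def levinson_durbin(r, order):
        a = np.zeros(order + 1)
        a[0] = 1.0
        err = r[0]
        for i in range(1, order + 1):
            k = -(r[1:i + 1][::-1] @ a[:i]) / err
            a[:i + 1] = a[:i + 1] + k * np.r_[0.0, a[1:i][::-1], 1.0]
            err *= (1 - k * k)
        return a, err

    if __name__ == "__main__":
        sr, order = 16000, 10
        t = np.arange(640) / sr
        frame = (np.sin(2 * np.pi * 150 * t)
                 + 0.01 * np.random.default_rng(8).normal(size=t.size))
        r = autocorrelation(frame, order)
        r_mod = lag_weights(order, f0_hz=150.0, sample_rate=sr) * r
        coeffs, residual = levinson_durbin(r_mod, order)
        print(np.round(coeffs, 3), round(float(residual), 4))
    ```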
  • Patent number: 10867618
    Abstract: Embodiments of the present disclosure provide a speech noise reduction method and a speech noise reduction device based on artificial intelligence, and a computer device. The method includes the following. A first noisy speech to be processed is received. The first noisy speech to be processed is pre-processed to obtain the first noisy speech in a preset format. The first noisy speech in the preset format is sampled according to a sampling rate indicated by the preset format, to obtain first sampling point information of the first noisy speech. A noise reduction is performed on the first sampling point information through a deep-learning noise reduction model, to generate noise-reduced first sampling point information. A first clean speech is generated according to the noise-reduced first sampling point information.
    Type: Grant
    Filed: December 28, 2017
    Date of Patent: December 15, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Wei Zou, Xiangang Li, Weiwei Cui, Jingyuan Hu
  • Patent number: 10861466
    Abstract: Disclosed are a packet loss concealment method and apparatus using a generative adversarial network. A method for packet loss concealment in voice communication may include training a classification model based on a generative adversarial network (GAN) with respect to a voice signal including a plurality of frames, training a generative model having a contention relation with the classification model based on the GAN, estimating lost packet information based on the trained generative model with respect to the voice signal encoded by a codec, and restoring a lost packet based on the estimated packet information.
    Type: Grant
    Filed: August 9, 2018
    Date of Patent: December 8, 2020
    Assignee: INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY
    Inventors: Joon-Hyuk Chang, Bong-Ki Lee
  • Patent number: 10825470
    Abstract: The present disclosure provides a method and apparatus for detecting a starting point and a finishing point of a speech, a computer device and a storage medium, wherein the method comprises: obtaining speech data to be detected; segmenting the speech data into speech segments, the number of speech segments being greater than one; respectively determining speech states of respective speech segments based on a Voice Activity Detection model obtained by pre-training; determining a starting point and a finishing point of the speech data according to the speech states. The solution of the present disclosure can be employed to improve the accuracy of the detection results.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: November 3, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Chao Li, Weixin Zhu
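    A sketch of the start/finish-point logic of patent 10825470 above, with a simple energy rule standing in for the trained Voice Activity Detection model.
    ```python
    # Sketch: the audio is cut into short segments, each segment gets a
    # speech/non-speech state (energy rule standing in for the trained VAD
    # model), and the first/last speech segments give the start/finish points.
    import numpy as np

    def segment_states(samples, sample_rate, segment_ms=20, threshold=0.02):
        seg = int(sample_rate * segment_ms / 1000)
        n = len(samples) // seg
        frames = samples[: n * seg].reshape(n, seg)
        energy = np.sqrt((frames ** 2).mean(axis=1))
        return energy > threshold                       # True = speech segment

    def start_finish(samples, sample_rate, segment_ms=20):
        states = segment_states(samples, sample_rate, segment_ms)
        speech = np.flatnonzero(states)
        if speech.size == 0:
            return None, None
        seg_s = segment_ms / 1000.0
        return float(speech[0] * seg_s), float((speech[-1] + 1) * seg_s)

    if __name__ == "__main__":
        sr = 16000
        silence = np.zeros(sr // 2)                      # 0.5 s of silence
        speech = 0.3 * np.sin(2 * np.pi * 220 * np.arange(sr) / sr)  # 1 s "speech"
        audio = np.concatenate([silence, speech, silence])
        print(start_finish(audio, sr))                   # approximately (0.5, 1.5)
    ```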
  • Patent number: 10796715
    Abstract: Systems and methods use patient speech samples as inputs, use subjective multi-point ratings by speech-language pathologists of multiple perceptual dimensions of patient speech samples as further inputs, and extract laboratory-implemented features from the patient speech samples. A predictive software model learns the relationship between speech acoustics and the subjective ratings of such speech obtained from speech-language pathologists, and is configured to apply this information to evaluate new speech samples. Outputs may include objective evaluation of the plurality of perceptual dimensions for new speech samples and/or evaluation of disease onset, disease progression, or disease treatment efficacy for a condition involving dysarthria as a symptom, utilizing the new speech samples.
    Type: Grant
    Filed: September 1, 2017
    Date of Patent: October 6, 2020
    Assignee: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
    Inventors: Visar Berisha, Ming Tu, Alan Wisler, Julie Liss
  • Patent number: 10714118
    Abstract: In one embodiment, a method includes accessing a voice signal from a first user; compressing the voice signal using a compression portion of an artificial neural network trained to compress the first user's voice; and sending the compressed voice signal to a second client computing device.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: July 14, 2020
    Assignee: Facebook, Inc.
    Inventor: Pasha Sadri
  • Patent number: 10692513
    Abstract: The invention provides an audio encoder including a combination of a linear predictive coding filter having a plurality of linear predictive coding coefficients and a time-frequency converter, wherein the combination is configured to filter and to convert a frame of the audio signal into a frequency domain in order to output a spectrum based on the frame and on the linear predictive coding coefficients; a low frequency emphasizer configured to calculate a processed spectrum based on the spectrum, wherein spectral lines of the processed spectrum representing a lower frequency than a reference spectral line are emphasized; and a control device configured to control the calculation of the processed spectrum by the low frequency emphasizer depending on the linear predictive coding coefficients of the linear predictive coding filter.
    Type: Grant
    Filed: April 18, 2018
    Date of Patent: June 23, 2020
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Stefan Doehla, Bernhard Grill, Christian Helmrich, Nikolaus Rettelbach
  • Patent number: 10657955
    Abstract: Described herein are systems and methods to identify and address sources of bias in an end-to-end speech model. In one or more embodiments, the end-to-end model may be a recurrent neural network with two 2D-convolutional input layers, followed by multiple bidirectional recurrent layers and one fully connected layer before a softmax layer. In one or more embodiments, the network is trained end-to-end using the CTC loss function to directly predict sequences of characters from log spectrograms of audio. With optimized recurrent layers and training together with alignment information, some unwanted bias induced by using purely forward only recurrences may be removed in a deployed model.
    Type: Grant
    Filed: January 30, 2018
    Date of Patent: May 19, 2020
    Assignee: Baidu USA LLC
    Inventors: Eric Battenberg, Rewon Child, Adam Coates, Christopher Fougner, Yashesh Gaur, Jiaji Huang, Heewoo Jun, Ajay Kannan, Markus Kliegl, Atul Kumar, Hairong Liu, Vinay Rao, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu
  • Patent number: 10490198
    Abstract: A sensor device may include a computing device in communication with multiple microphones. A neural network executing on the computing device may receive audio signals from each microphone. One microphone signal may serve as a reference signal. The neural network may extract differences in signal characteristics of the other microphone signals as compared to the reference signal. The neural network may combine these signal differences into a lossy compressed signal. The sensor device may transmit the lossy compressed signal and the lossless reference signal to a remote neural network executing in a cloud computing environment for decompression and sound recognition analysis.
    Type: Grant
    Filed: December 18, 2017
    Date of Patent: November 26, 2019
    Assignee: GOOGLE LLC
    Inventors: Chanwoo Kim, Rajeev Conrad Nongpiur, Tara Sainath
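    A sketch of the reference-plus-differences scheme shared by patent 10490198 above and patent 9875747 below: one channel is kept lossless, the others are stored as coarsely quantized differences; the neural network that extracts and combines the differences is replaced here by plain subtraction and quantization.
    ```python
    # Sketch: keep one microphone channel as the lossless reference and represent
    # the other channels by quantized differences from it (the lossy part). The
    # quantization step is an assumption.
    import numpy as np

    def compress(mics, reference_index=0, step=0.05):
        reference = mics[reference_index]
        diffs = np.delete(mics, reference_index, axis=0) - reference
        quantized = np.round(diffs / step).astype(np.int16)       # lossy differences
        return reference, quantized, step

    def decompress(reference, quantized, step, reference_index=0):
        diffs = quantized.astype(float) * step
        channels = list(diffs + reference)
        channels.insert(reference_index, reference)
        return np.array(channels)

    if __name__ == "__main__":
        rng = np.random.default_rng(9)
        t = np.arange(1600) / 16000
        source = np.sin(2 * np.pi * 300 * t)
        mics = np.array([source + 0.01 * rng.normal(size=t.size) for _ in range(4)])
        ref, q, step = compress(mics)
        restored = decompress(ref, q, step)
        print("max reconstruction error:", round(float(np.abs(restored - mics).max()), 3))
    ```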
  • Patent number: 10453074
    Abstract: A user may respond to a request of another user by entering text, such as a customer service representative responding to a customer. Suggestions of resources may be provided to the responding user to assist the responding user in providing a response. For example, a resource may provide information to the responding user or allow the responding user to perform an action. Previous messages between the two users and other information may be used to select a resource. A conversation feature vector may be determined from previous messages, and feature vectors may be determined from the resources. The conversation feature vector and the feature vectors determined from the resource may be used to select a resource to suggest to the responding user.
    Type: Grant
    Filed: September 1, 2016
    Date of Patent: October 22, 2019
    Assignee: ASAPP, INC.
    Inventors: Gustavo Sapoznik, Shawn Henry
  • Patent number: 10381017
    Abstract: The present disclosure provides a method and a device for eliminating background sound, and a terminal device. The method includes: obtaining an initial audio data set; performing background sound fusion processing on the initial audio data set to obtain training sample data; performing neural network training based on the training sample data and the initial audio data set to generate an initial neural network model for eliminating background sound; and performing background sound elimination on audio data to be processed based on the initial neural network model for eliminating background sound.
    Type: Grant
    Filed: July 25, 2018
    Date of Patent: August 13, 2019
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Xuewei Zhang, Xiangang Li
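    A sketch of the background sound fusion step of patent 10381017 above: clean utterances are mixed with a background clip at chosen signal-to-noise ratios to form (noisy, clean) training pairs; the elimination network itself is not shown.
    ```python
    # Sketch: mix a background clip into each clean utterance at a chosen
    # signal-to-noise ratio to build training pairs for the elimination
    # network. The SNR values are assumptions.
    import numpy as np

    def mix_at_snr(clean, background, snr_db):
        background = np.resize(background, clean.shape)            # loop/trim to length
        clean_power = (clean ** 2).mean()
        noise_power = (background ** 2).mean() + 1e-12
        scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
        return clean + scale * background

    def build_training_set(clean_set, background, snrs=(0, 5, 10)):
        return [(mix_at_snr(clean, background, snr), clean)
                for clean in clean_set for snr in snrs]

    if __name__ == "__main__":
        rng = np.random.default_rng(10)
        t = np.arange(16000) / 16000
        clean_set = [np.sin(2 * np.pi * f * t) for f in (220, 330)]
        background = rng.normal(size=8000)                          # e.g. crowd noise
        pairs = build_training_set(clean_set, background)
        print(len(pairs), "training pairs; first noisy RMS:",
              round(float(np.sqrt((pairs[0][0] ** 2).mean())), 3))
    ```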
  • Patent number: 10346859
    Abstract: A method is disclosed for enabling a network location to provide an ordering process for data relevant to connected network devices' activities. The method includes assembling the data, utilizing the activity data, and associating the data, such that information is derived to enable a desired expansion of at least one designated activity. Another method is disclosed for managing object assignment broadcast operations for a network location based on a network device's previous activities. This second method includes tracing a network device's conduct to determine that the network device prefers a particular class of content. The method also includes tagging the network device's profile with the respective observation and deciding by a network location as to the classification factor for a network device to be targeted for an object assignment broadcast.
    Type: Grant
    Filed: January 2, 2018
    Date of Patent: July 9, 2019
    Assignee: LIVEPERSON, INC.
    Inventors: Haggai Shachar, Shahar Nechmad
  • Patent number: 10269355
    Abstract: According to an embodiment, a data processing device generates result data which represents a result of performing predetermined processing on series data. The device includes an upper-level processor and a lower-level processor. The upper-level processor attaches order information to data blocks constituting the series data. The lower-level processor performs lower-level processing on the data blocks having the order information attached thereto, and attaches common order information, which is in common with the data blocks, to values obtained as a result of the lower-level processing. The upper-level processor integrates the values, which have the common order information attached thereto, based on the common order information and performs upper-level processing to generate the result data.
    Type: Grant
    Filed: March 15, 2016
    Date of Patent: April 23, 2019
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Shoko Miyamori, Takashi Masuko, Mitsuyoshi Tachimori, Kouji Ueno, Manabu Nagao
  • Patent number: 10237649
    Abstract: Methods and apparatus relating to microphone devices and signal processing techniques are provided. In an example, a microphone device can detect sound, as well as enhance an ability to perceive at least a general direction from which the sound arrives at the microphone device. In an example, a case of the microphone device has an external surface which at least partially defines funnel-shaped surfaces. Each funnel-shaped surface is configured to direct the sound to a respective microphone diaphragm to produce an auralized multi-microphone output. The funnel-shaped surfaces are configured to cause direction-dependent variations in spectral notches and frequency response of the sound as received by the microphone diaphragms. A neural network can device-shape the auralized multi-microphone output to create a binaural output. The binaural output can be auralized with respect to a human listener.
    Type: Grant
    Filed: December 18, 2017
    Date of Patent: March 19, 2019
    Assignee: Google LLC
    Inventor: Rajeev Conrad Nongpiur
  • Patent number: 10217365
    Abstract: The present disclosure relates to a method for determining whether an object is within a target area, a parking management device, a parking management system and an electronic device. The method comprises the following steps: acquiring an intensity value of a wireless signal that the object receives from a signal transmitting apparatus which is provided on the site of the target area; and determining whether the object is within the target area based on the intensity value of the wireless signal. The method for determining whether the object is within the target area can be applied to guide a user to park a vehicle within a parking area.
    Type: Grant
    Filed: October 27, 2017
    Date of Patent: February 26, 2019
    Assignee: BEIJING MOBIKE TECHNOLOGY CO., LTD.
    Inventors: Chaochao Chen, Zirong Guo, Yujie Yang
  • Patent number: 10194203
    Abstract: A multimodal and real-time method for filtering sensitive content, receiving as input a digital video stream, the method including: segmenting the digital video into video fragments along the video timeline; extracting features containing significant information from the digital video input on sensitive media; reducing the semantic difference between each of the low-level video features and the high-level sensitive concept; classifying the video fragments, generating a high-level label (positive or negative) with a confidence score for each fragment representation; performing high-level fusion to properly match the possible high-level labels and confidence scores for each fragment; and predicting the sensitive time by combining the labels of the fragments along the video timeline, indicating the moments when the content becomes sensitive.
    Type: Grant
    Filed: June 30, 2016
    Date of Patent: January 29, 2019
    Assignees: SAMSUNG ELETRÔNICA DA AMAZÔNIA LTDA., UNIVERSIDADE ESTADUAL DE CAMPINAS
    Inventors: Sandra Avila, Daniel Moreira, Mauricio Perez, Daniel Moraes, Vanessa Testoni, Siome Goldenstein, Eduardo Valle, Anderson Rocha
  • Patent number: 10095768
    Abstract: The disclosed computer-implemented method for aggregating information-asset classifications may include (1) identifying a data collection that includes two or more information assets, (2) identifying a classification for each of the information assets, (3) deriving, based at least in part on the classifications of the information assets, an aggregate classification for the data collection, and (4) associating the aggregate classification with the data collection to enable a data management system to enforce a data management policy based on the aggregate classification. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Grant
    Filed: November 14, 2014
    Date of Patent: October 9, 2018
    Assignee: Veritas Technologies LLC
    Inventor: Robert Koeten
  • Patent number: 10032461
    Abstract: An apparatus includes microphone receivers configured to receive microphone signals from a plurality of microphones. A comparator is configured to determine a speech similarity indication indicative of a similarity between the microphone signal and non-reverberant speech for each microphone signal. The determination is in response to a comparison of a property derived from the microphone signal to a reference property for non-reverberant speech. In some embodiments, the comparator is configured to determine the similarity indication by comparing to reference properties for speech samples of a set of non-reverberant speech samples. A generator is configured to generate a speech signal by combining the microphone signals in response to the similarity indications. The apparatus may be distributed over a plurality of devices each containing a microphone, and the approach may determine the most suited microphone for generating the speech signal.
    Type: Grant
    Filed: February 18, 2014
    Date of Patent: July 24, 2018
    Assignee: KONINKLIJKE PHILIPS N.V.
    Inventor: Sriram Srinivasan
  • Patent number: 10026396
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for receiving a sequence representing an utterance, the sequence comprising a plurality of audio frames; determining one or more warping factors for each audio frame in the sequence using a warping neural network; applying, for each audio frame, the one or more warping factors for the audio frame to the audio frame to generate a respective modified audio frame, wherein the applying comprises using at least one of the warping factors to scale a respective frequency of the audio frame to a new respective frequency in the respective modified audio frame; and decoding the modified audio frames using a decoding neural network, wherein the decoding neural network is configured to output a word sequence that is a transcription of the utterance.
    Type: Grant
    Filed: July 27, 2016
    Date of Patent: July 17, 2018
    Assignee: Google LLC
    Inventor: Andrew W. Senior
  • Patent number: 9953638
    Abstract: A computer-implemented method is described for front end speech processing for automatic speech recognition. A sequence of speech features which characterize an unknown speech input provided on an audio input channel and associated meta-data which characterize the audio input channel are received. The speech features are transformed with a computer process that uses a trained mapping function controlled by the meta-data, and automatic speech recognition is performed of the transformed speech features.
    Type: Grant
    Filed: June 28, 2012
    Date of Patent: April 24, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Daniel Willett, Karl Jonas Lööf, Yue Pan, Joel Pinto, Christian Gollan
  • Patent number: 9875747
    Abstract: A sensor device may include a computing device in communication with multiple microphones. A neural network executing on the computing device may receive audio signals from each microphone. One microphone signal may serve as a reference signal. The neural network may extract differences in signal characteristics of the other microphone signals as compared to the reference signal. The neural network may combine these signal differences into a lossy compressed signal. The sensor device may transmit the lossy compressed signal and the lossless reference signal to a remote neural network executing in a cloud computing environment for decompression and sound recognition analysis.
    Type: Grant
    Filed: July 15, 2016
    Date of Patent: January 23, 2018
    Assignee: GOOGLE LLC
    Inventors: Chanwoo Kim, Rajeev Conrad Nongpiur, Tara Sainath
  • Patent number: 9785313
    Abstract: A method and system for providing a distraction free reading mode with an electronic personal display is disclosed. One example accesses non-adjustable settings for a reader mode. In addition, user adjustable settings for the reader mode on the electronic personal display are also accessed. The user adjustable settings and the non-adjustable settings are then implemented when the reader mode is initiated.
    Type: Grant
    Filed: June 28, 2013
    Date of Patent: October 10, 2017
    Assignee: RAKUTEN KOBO, INC.
    Inventors: James Wu, Peter James Farmer, Michael Serbinis, Pamela Lynn Hilborn