Neural Network Patents (Class 704/202)
-
Patent number: 12094453
Abstract: A computer-implemented method of training a streaming speech recognition model that includes receiving, as input to the streaming speech recognition model, a sequence of acoustic frames. The streaming speech recognition model is configured to learn an alignment probability between the sequence of acoustic frames and an output sequence of vocabulary tokens. The vocabulary tokens include a plurality of label tokens and a blank token. At each output step, the method includes determining a first probability of emitting one of the label tokens and determining a second probability of emitting the blank token. The method also includes generating the alignment probability at a sequence level based on the first probability and the second probability. The method also includes applying a tuning parameter to the alignment probability at the sequence level to maximize the first probability of emitting one of the label tokens.
Type: Grant
Filed: September 9, 2021
Date of Patent: September 17, 2024
Assignee: Google LLC
Inventors: Jiahui Yu, Chung-cheng Chiu, Bo Li, Shuo-yiin Chang, Tara Sainath, Wei Han, Anmol Gulati, Yanzhang He, Arun Narayanan, Yonghui Wu, Ruoming Pang
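As a toy illustration of the label/blank trade-off and the sequence-level tuning parameter described above, here is a minimal numpy sketch; the names (alignment_log_prob, tuning_param) and the single-alignment simplification are illustrative assumptions, not details from the patent.

```python
import numpy as np

def alignment_log_prob(label_logp, blank_logp, tuning_param=0.0):
    """Toy sequence-level alignment score for one monotonic alignment.

    label_logp:   log-probabilities of emitting the chosen label token per output step
    blank_logp:   log-probabilities of emitting the blank token per output step
    tuning_param: emphasizes the label-emission term so training favors emitting
                  label tokens (rather than blank) as early as possible.
    """
    label_term = np.sum(label_logp)   # "first probability" (label tokens)
    blank_term = np.sum(blank_logp)   # "second probability" (blank token)
    return (1.0 + tuning_param) * label_term + blank_term

label_logp = np.log([0.7, 0.6, 0.8])
blank_logp = np.log([0.9, 0.85, 0.95])
print(alignment_log_prob(label_logp, blank_logp, tuning_param=0.1))
```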
-
Patent number: 12073306
Abstract: Systems and methods are disclosed for a centrosymmetric convolutional neural network (CSCNN), an algorithm/hardware co-design framework for CNN compression and acceleration that mitigates the effects of computational irregularity and effectively exploits computational reuse and sparsity for increased performance and energy efficiency.
Type: Grant
Filed: December 15, 2021
Date of Patent: August 27, 2024
Assignee: THE GEORGE WASHINGTON UNIVERSITY
Inventors: Jiajun Li, Ahmed Louri
-
Patent number: 12067989
Abstract: Presented are a combined learning method and device using a transformed loss function and feature enhancement based on a deep neural network for speaker recognition that is robust in a noisy environment. A combined learning method using a transformed loss function and feature enhancement based on a deep neural network, according to one embodiment, can comprise the steps of: learning a feature enhancement model based on a deep neural network; learning a speaker feature vector extraction model based on the deep neural network; connecting an output layer of the feature enhancement model with an input layer of the speaker feature vector extraction model; and treating the connected feature enhancement model and speaker feature vector extraction model as one model and performing combined learning for additional training.
Type: Grant
Filed: March 30, 2020
Date of Patent: August 20, 2024
Assignee: IUCF-HYU (INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY)
Inventors: Joon-Hyuk Chang, Joonyoung Yang
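A minimal PyTorch sketch of the connection and joint-training steps (enhancement output layer feeding the speaker-model input layer, then fine-tuning the chain as one model); the layer sizes, speaker-classification head, and loss are placeholders of my own, not details from the patent.

```python
import torch
import torch.nn as nn

# Stand-ins for the two separately pre-trained models (dimensions are illustrative).
enhancer = nn.Sequential(nn.Linear(40, 40), nn.ReLU(), nn.Linear(40, 40))     # feature enhancement
spk_model = nn.Sequential(nn.Linear(40, 128), nn.ReLU(), nn.Linear(128, 64))  # speaker embedding
classifier = nn.Linear(64, 10)                                                # toy speaker head

# Connect the enhancer's output layer to the speaker model's input layer and
# treat the chain as one model for additional (combined) training.
joint = nn.Sequential(enhancer, spk_model)
params = list(joint.parameters()) + list(classifier.parameters())
optimizer = torch.optim.Adam(params, lr=1e-4)

noisy_feats = torch.randn(8, 40)           # batch of noisy acoustic features
speaker_ids = torch.randint(0, 10, (8,))   # toy speaker labels

loss = nn.functional.cross_entropy(classifier(joint(noisy_feats)), speaker_ids)
loss.backward()
optimizer.step()
```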
-
Patent number: 12056604
Abstract: Layers of a deep neural network (DNN) are partitioned into stages using a profile of the DNN. Each of the stages includes one or more of the layers of the DNN. The partitioning of the layers of the DNN into stages is optimized in various ways including optimizing the partitioning to minimize training time, to minimize data communication between worker computing devices used to train the DNN, or to ensure that the worker computing devices perform an approximately equal amount of the processing for training the DNN. The stages are assigned to the worker computing devices. The worker computing devices process batches of training data using a scheduling policy that causes the workers to alternate between forward processing of the batches of the DNN training data and backward processing of the batches of the DNN training data. The stages can be configured for model parallel processing or data parallel processing.
Type: Grant
Filed: June 29, 2018
Date of Patent: August 6, 2024
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Vivek Seshadri, Amar Phanishayee, Deepak Narayanan, Aaron Harlap, Nikhil Devanur Rangarajan
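One way to read the balanced-partitioning goal is as splitting the profiled per-layer times into contiguous stages that minimize the slowest stage; the dynamic program below is a sketch under that reading, not the optimizer from the patent, and the layer times are made up.

```python
from functools import lru_cache

def partition_stages(layer_times, num_stages):
    """Split consecutive DNN layers into `num_stages` stages, minimizing the slowest stage."""
    n = len(layer_times)
    prefix = [0.0]
    for t in layer_times:
        prefix.append(prefix[-1] + t)

    @lru_cache(maxsize=None)
    def best(i, k):
        """Minimal max-stage-time when layers[i:] are split into k stages."""
        if k == 1:
            return prefix[n] - prefix[i]
        return min(max(prefix[j] - prefix[i], best(j, k - 1))
                   for j in range(i + 1, n - k + 2))

    return best(0, num_stages)

# Profiled per-layer times (seconds per batch); one stage per worker device.
print(partition_stages([2.0, 1.0, 3.0, 2.0, 2.0], num_stages=2))  # -> 6.0
```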
-
Patent number: 12032653
Abstract: An apparatus and method are described for distributed and cooperative computation in artificial neural networks. For example, one embodiment of an apparatus comprises: an input/output (I/O) interface; a plurality of processing units communicatively coupled to the I/O interface to receive data for input neurons and synaptic weights associated with each of the input neurons, each of the plurality of processing units to process at least a portion of the data for the input neurons and synaptic weights to generate partial results; and an interconnect communicatively coupling the plurality of processing units, each of the processing units to share the partial results with one or more other processing units over the interconnect, the other processing units using the partial results to generate additional partial results or final results. The processing units may share data including input neurons and weights over the shared input bus.
Type: Grant
Filed: May 3, 2021
Date of Patent: July 9, 2024
Assignee: Intel Corporation
Inventors: Frederico C. Pratas, Ayose J. Falcon, Marc Lupon, Fernando Latorre, Pedro Lopez, Enric Herrero Abellanas, Georgios Tournavitis
-
Patent number: 11990057
Abstract: Briefly, example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to facilitate and/or support one or more operations and/or techniques for electronic infrastructure for digital content delivery and/or online assessment management, such as implemented, at least in part, via one or more computing and/or communication networks and/or protocols.
Type: Grant
Filed: February 14, 2020
Date of Patent: May 21, 2024
Assignee: ARH TECHNOLOGIES, LLC
Inventors: Alan R. Hollander, Micky McCuen
-
Patent number: 11809992
Abstract: Neural networks with similar architectures may be compressed using shared compression profiles. A request to compress a trained neural network may be received and an architecture of the neural network identified. The identified architecture may be compared with the different network architectures mapped to compression profiles to select a compression profile for the neural network. The compression profile may be applied to remove features of the neural network to generate a compressed version of the neural network.
Type: Grant
Filed: March 31, 2020
Date of Patent: November 7, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Gurumurthy Swaminathan, Ragav Venkatesan, Xiong Zhou, Runfei Luo, Vineet Khare
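A hypothetical sketch of the profile lookup and application: a table maps architecture signatures to per-layer pruning ratios, and applying a profile shrinks the listed layers. The architecture names, layer names, and ratios are invented for illustration.

```python
# Hypothetical store mapping an identified architecture to a shared compression profile
# (here expressed as per-layer fractions of features to remove).
COMPRESSION_PROFILES = {
    "resnet50":     {"layer4": 0.5, "fc": 0.3},
    "mobilenet_v2": {"features.18": 0.25},
}

def select_profile(architecture_name):
    """Compare the identified architecture against the mapped architectures."""
    return COMPRESSION_PROFILES.get(architecture_name)

def apply_profile(layer_widths, profile):
    """Remove the profiled fraction of features from each listed layer."""
    return {name: int(width * (1.0 - profile.get(name, 0.0)))
            for name, width in layer_widths.items()}

profile = select_profile("resnet50")
print(apply_profile({"layer4": 2048, "fc": 1000}, profile))  # {'layer4': 1024, 'fc': 700}
```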
-
Patent number: 11798562
Abstract: A speaker verification method includes receiving audio data corresponding to an utterance and processing the audio data to generate an evaluation attentive d-vector (ad-vector) representing voice characteristics of the utterance, the evaluation ad-vector including n_e style classes, each including a respective value vector concatenated with a corresponding routing vector. The method also includes generating, using a self-attention mechanism, at least one multi-condition attention score that indicates a likelihood that the evaluation ad-vector matches a respective reference ad-vector associated with a respective user. The method also includes identifying the speaker of the utterance as the respective user associated with the respective reference ad-vector based on the multi-condition attention score.
Type: Grant
Filed: May 16, 2021
Date of Patent: October 24, 2023
Assignee: Google LLC
Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Yiling Huang, Mert Saglam
-
Patent number: 11734551
Abstract: A data storage method for speech-related deep neural network (DNN) operations, comprising the following steps: 1. determining the configuration parameters by a user; 2. configuring a peripheral storage access interface; 3. configuring a multi-transmitting interface of the feature storage array; 4. enabling the CPU to store to-be-calculated data in a storage space between the feature storage space start address and the feature storage space end address of the peripheral storage device; 5. after data storage, enabling the CPU to check the state of the peripheral storage access interface and the multi-transmitting interface of the feature storage array; 6. upon receiving a transportation completion signal of the peripheral storage access interface by the CPU, enabling the multi-transmitting interface of the feature storage array; 7. upon receiving a transportation completion signal of the multi-transmitting interface of the feature storage array by the CPU, repeating step 6.
Type: Grant
Filed: December 10, 2021
Date of Patent: August 22, 2023
Assignee: CHIPINTELLI TECHNOLOGY CO., LTD
Inventors: Zhaoqiang Qiu, Lai Zhang, Fujun Wang, Wei Tian, Yingbin Yang, Yangyang Pei
-
Patent number: 11670299
Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
Type: Grant
Filed: May 17, 2021
Date of Patent: June 6, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
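A numpy sketch of the two post-processing paths described above (softmax plus smoothing and spike detection for the wakeword scores; a sigmoid for the acoustic-event scores), with random values standing in for model outputs computed on shared LFBE features. The window size and thresholds are arbitrary.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def smooth(p, win=5):
    """Moving-average smoothing of per-frame posteriors."""
    return np.convolve(p, np.ones(win) / win, mode="same")

def detect_spikes(p, threshold=0.8):
    return np.flatnonzero(p > threshold)

frames = 100
wakeword_logits = np.random.randn(frames, 2)   # [not-wakeword, wakeword] per frame
event_logits = np.random.randn(frames)         # single acoustic-event score per frame

wakeword_post = smooth(softmax(wakeword_logits)[:, 1])     # softmax -> smoothing -> spikes
event_post = 1.0 / (1.0 + np.exp(-event_logits))           # sigmoid before the event classifier

print(detect_spikes(wakeword_post), int((event_post > 0.5).sum()))
```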
-
Patent number: 11636343
Abstract: Training a neural network (NN) may include training a NN N, and for S, a version of N to be sparsified (e.g. a copy of N), removing NN elements from S to create a sparsified version of S, and training S using outputs from N (e.g. “distillation”). A boosting or reintroduction phase may follow sparsification: training a NN may include for a trained NN N and S, a sparsified version of N, re-introducing NN elements previously removed from S, and training S using outputs from N. The boosting phase need not use a NN sparsified by “distillation.” Training and sparsification, or training and reintroduction, may be performed iteratively or over repetitions.
Type: Grant
Filed: September 26, 2019
Date of Patent: April 25, 2023
Assignee: Neuralmagic Inc.
Inventor: Dan Alistarh
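A small PyTorch sketch of the two phases as read from the abstract: magnitude-based removal of elements from S, then reintroduction of a fraction of the removed elements before further training against N's outputs. The sparsity level, the random choice of which elements to bring back, and the tensor shapes are illustrative assumptions.

```python
import torch

weights = torch.randn(64, 64)                 # weights of S, a copy of the trained network N
mask = torch.ones_like(weights, dtype=torch.bool)

# Sparsification phase: remove the smallest-magnitude elements from S.
k = int(0.8 * weights.numel())
threshold = weights.abs().flatten().kthvalue(k).values
mask &= weights.abs() > threshold
sparse_weights = weights * mask               # S would now be trained using N's outputs

# Boosting / reintroduction phase: bring back some previously removed elements
# (chosen at random here) before training S against N's outputs again.
removed = (~mask).nonzero(as_tuple=False)
bring_back = removed[torch.randperm(len(removed))[: len(removed) // 10]]
mask[bring_back[:, 0], bring_back[:, 1]] = True

print(f"active fraction after reintroduction: {mask.float().mean():.2f}")
```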
-
Patent number: 11620493
Abstract: Various embodiments are provided for intelligent selection of time series models by one or more processors in a computing system. Time series data may be received from a user, one or more computing devices, sensors, or a combination thereof. One or more optimal time series models may be selected upon using and/or evaluating one or more recurrent neural network models that are trained or pre-trained using simulated time series data or historical time series data, or a combination thereof, for one or more predictive analytical tasks relating to the received time series data.
Type: Grant
Filed: October 7, 2019
Date of Patent: April 4, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Beat Buesser, Bei Chen, Kelsey Dipietro
-
Patent number: 11526751
Abstract: A device may receive historical data and real-time data associated with a troubleshooting service, identify, using a machine learning model, an optimal resolution based on the historical data and the real-time data, and identify, using a graph analytics model, an optimal path of actions based on the optimal resolution. The machine learning model may be trained to identify one of the set of historical issues associated with the unresolved issue, and identify the optimal resolution based on one of the set of historical resolutions associated with the one of the set of historical issues. The graph analytics model may be trained to generate a set of paths of actions based on the historical data, and identify the optimal path based on respective numbers of actions associated with the set of paths. The device may identify an optimal action based on the optimal path and the prior action.
Type: Grant
Filed: November 25, 2019
Date of Patent: December 13, 2022
Assignee: Verizon Patent and Licensing Inc.
Inventors: Sumit Singh, Balagangadhara Thilak Adiboina, Adithya Umakanth, Ganesh Narasimman, Sambasiva R Bhatta, Anurag Pant
-
Patent number: 11521592
Abstract: WaveFlow is a small-footprint generative flow for raw audio, which may be directly trained with maximum likelihood. WaveFlow handles the long-range structure of waveform with a dilated two-dimensional (2D) convolutional architecture, while modeling the local variations using expressive autoregressive functions. WaveFlow may provide a unified view of likelihood-based models for raw audio, including WaveNet and WaveGlow, which may be considered special cases. It generates high-fidelity speech, while synthesizing several orders of magnitude faster than existing systems since it uses only a few sequential steps to generate relatively long waveforms. WaveFlow significantly reduces the likelihood gap that has existed between autoregressive models and flow-based models for efficient synthesis. Its small footprint with 5.91M parameters makes it 15 times smaller than some existing models. WaveFlow can generate 22.05 kHz high-fidelity audio 42.
Type: Grant
Filed: August 5, 2020
Date of Patent: December 6, 2022
Assignee: Baidu USA LLC
Inventors: Wei Ping, Kainan Peng, Kexin Zhao, Zhao Song
-
Patent number: 11482241
Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.
Type: Grant
Filed: March 27, 2017
Date of Patent: October 25, 2022
Assignee: Nuance Communications, Inc.
Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost
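A sketch of what a stored target acoustic data profile and a selection check might look like; the AcousticProfile fields mirror the characteristics listed above, but the class, the tolerance, and the example values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class AcousticProfile:
    """Target acoustic characteristics stored as a data profile."""
    codec: str
    bit_rate_kbps: int
    sampling_rate_hz: int
    noise_level_db: float

def matches(candidate: AcousticProfile, target: AcousticProfile, noise_tol_db: float = 5.0) -> bool:
    """Decide whether an out-of-domain speech sample resembles the target profile."""
    return (candidate.codec == target.codec
            and candidate.sampling_rate_hz == target.sampling_rate_hz
            and abs(candidate.noise_level_db - target.noise_level_db) <= noise_tol_db)

target = AcousticProfile(codec="AMR-NB", bit_rate_kbps=12, sampling_rate_hz=8000, noise_level_db=-35.0)
sample = AcousticProfile(codec="AMR-NB", bit_rate_kbps=12, sampling_rate_hz=8000, noise_level_db=-32.0)
print(matches(sample, target))   # True
```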
-
Patent number: 11462326
Abstract: A method and system can be used for disease quantification modeling of an anatomical tree structure. The method may include obtaining a centerline of an anatomical tree structure and generating a graph neural network including a plurality of nodes based on a graph. Each node corresponds to a centerline point and edges are defined by the centerline, with an input of each node being a disease related feature or an image patch for the corresponding centerline point and an output of each node being a disease quantification parameter. The method also includes obtaining labeled data of one or more nodes, the number of which is less than a total number of the nodes in the graph neural network. Further, the method includes training the graph neural network by transferring information between the one or more nodes and other nodes based on the labeled data of the one or more nodes.
Type: Grant
Filed: June 19, 2020
Date of Patent: October 4, 2022
Assignee: KEYA MEDICAL TECHNOLOGY CO., LTD.
Inventors: Xin Wang, Youbing Yin, Junjie Bai, Qi Song, Kunlin Cao, Yi Lu, Feng Gao
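A toy PyTorch rendering of the idea: nodes along a centerline form a chain graph, a small message-passing network predicts one quantification value per node, and the loss is taken only on the few labeled nodes so information propagates to the rest. The two-layer architecture, feature sizes, and labels are placeholders.

```python
import torch
import torch.nn as nn

# Toy centerline graph: per-point features plus a chain adjacency defined by the centerline.
num_nodes, feat_dim = 6, 8
x = torch.randn(num_nodes, feat_dim)
adj = torch.eye(num_nodes)
for i in range(num_nodes - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
adj = adj / adj.sum(dim=1, keepdim=True)      # row-normalized propagation matrix

class CenterlineGNN(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.lin1 = nn.Linear(d, 16)
        self.lin2 = nn.Linear(16, 1)

    def forward(self, x, adj):
        h = torch.relu(self.lin1(adj @ x))    # pass information between neighboring nodes
        return self.lin2(adj @ h).squeeze(-1) # one quantification parameter per node

model = CenterlineGNN(feat_dim)
labels = torch.tensor([0.2, float("nan"), float("nan"), 0.8, float("nan"), float("nan")])
labeled = ~torch.isnan(labels)                # only a subset of nodes carries labels

pred = model(x, adj)
loss = nn.functional.mse_loss(pred[labeled], labels[labeled])
loss.backward()
print(float(loss))
```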
-
Patent number: 11462229
Abstract: This disclosure relates generally to a system and method for identifying a plurality of noises, or a combination thereof, suppressing them, and enhancing the deteriorated input signal in a dynamic manner. The system identifies noises in the audio signal and categorizes them based on a trained database of noises. A combination of a deep neural network (DNN) and artificial intelligence (AI) enables the system to self-learn, understanding and capturing noises in the environment and retraining the model to reduce noises on the next attempt. The system suppresses unwanted noise coming from the external environment with the help of AI-based algorithms, by understanding, differentiating, and enhancing human voice in a live environment. The system helps reduce unwanted noises and enhances the experience of business and public meetings, video conferences, musical events, speech broadcasts, etc., which could otherwise cause distractions and disturbances and create barriers in the conversation.
Type: Grant
Filed: March 6, 2020
Date of Patent: October 4, 2022
Assignee: TATA CONSULTANCY SERVICES LIMITED
Inventors: Robin Tommy, Reshmi Ravindranathan, Navin Infant Raj, Venkatakrishna Akula, Jithin Laiju Ravi, Anita Nanadikar, Anil Kumar Sharma, Pranav Champaklal Shah, Bhasha Prasad Khose
-
Patent number: 11410674
Abstract: The present application relates to a method and device for recognizing the state of a human body meridian by utilizing a voice recognition technology, the method comprising: receiving an input voice of a user; preprocessing the input voice; extracting a stable feature of the preprocessed input voice; primarily classifying the stable feature on the basis of a feature recognition model, and determining a basic classification pitch, wherein the basic classification pitch comprises Gong, Shang, Jue, Zhi and Yu (respectively equivalent to do, re, mi, sol and la); secondarily classifying the stable feature on the basis of the feature recognition model, and determining a secondary classification tone in the basic classification pitch; and recognizing the state of a meridian according to the secondary classification tone.
Type: Grant
Filed: October 23, 2019
Date of Patent: August 9, 2022
Inventor: Zhonghua Ci
-
Patent number: 11380114
Abstract: A method of detecting a target includes generating an image pyramid based on an image on which a detection is to be performed; classifying candidate areas in the image pyramid using a cascade neural network; and determining a target area corresponding to a target included in the image based on the plurality of candidate areas, wherein the cascade neural network includes a plurality of neural networks, and at least one neural network among the neural networks includes parallel sub-neural networks.
Type: Grant
Filed: April 15, 2020
Date of Patent: July 5, 2022
Assignee: Samsung Electronics Co., Ltd.
Inventors: Biao Wang, Chao Zhang, Changkyu Choi, Deheng Qian, Jae-Joon Han, Jingtao Xu, Hao Feng
-
Patent number: 11354542
Abstract: A mechanism is described for facilitating on-the-fly deep learning in machine learning for autonomous machines. A method of embodiments, as described herein, includes detecting an output associated with a first deep network serving as a user-independent model associated with learning of one or more neural networks at a computing device having a processor coupled to memory. The method may further include automatically generating training data for a second deep network serving as a user-dependent model, where the training data is generated based on the output. The method may further include merging the user-independent model with the user-dependent model into a single joint model.
Type: Grant
Filed: February 6, 2020
Date of Patent: June 7, 2022
Assignee: Intel Corporation
Inventor: Raanan Yonatan Yehezkel Rohekar
-
Patent number: 11348572
Abstract: A speech recognition method includes obtaining an acoustic sequence divided into a plurality of frames, and determining pronunciations in the acoustic sequence by predicting a duration of a same pronunciation in the acoustic sequence and skipping a pronunciation prediction for a frame corresponding to the duration.
Type: Grant
Filed: July 18, 2018
Date of Patent: May 31, 2022
Assignees: Samsung Electronics Co., Ltd., UNIVERSITE DE MONTREAL
Inventors: Inchul Song, Junyoung Chung, Taesup Kim, Sanghyun Yoo
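A compact sketch of the skip logic: predict a pronunciation and its duration at a frame, then jump past the frames covered by that duration instead of predicting on them. The toy predictor and its fixed three-frame duration are stand-ins for the trained model.

```python
def decode_with_skips(frames, predict):
    """`predict(frames, t)` returns (pronunciation, duration_in_frames) at frame index t."""
    results, t = [], 0
    while t < len(frames):
        pron, duration = predict(frames, t)
        results.append(pron)
        t += max(duration, 1)   # skip the prediction for frames covered by the same pronunciation
    return results

# Toy predictor: every pronunciation is claimed to span three frames.
toy_predict = lambda frames, t: (f"p{t}", 3)
print(decode_with_skips(list(range(10)), toy_predict))   # ['p0', 'p3', 'p6', 'p9']
```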
-
Patent number: 11334766
Abstract: Systems and methods are provided for training object detectors of a neural network model with a mixture of label noise and bounding box noise. According to some embodiments, a learning framework is provided which jointly optimizes object labels, bounding box coordinates, and model parameters by performing alternating noise correction and model training. In some embodiments, to disentangle label noise and bounding box noise, a two-step noise correction method is employed. In some examples, the first step performs class-agnostic bounding box correction by minimizing classifier discrepancy and maximizing region objectness. In some examples, the second step uses dual detection heads for label correction and class-specific bounding box refinement.
Type: Grant
Filed: January 31, 2020
Date of Patent: May 17, 2022
Assignee: salesforce.com, inc.
Inventors: Junnan Li, Chu Hong Hoi
-
Patent number: 11321866
Abstract: A method of controlling audio collection for an image capturing device can include receiving image data from an image capturing device; recognizing one or more objects from the image data; determining a first object having a possibility of generating audio among the one or more objects; and collecting audio from the first object by moving a microphone beamforming direction of the image capturing device to be directed toward the first object in response to a determination that the first object is an object having a possibility of generating audio.
Type: Grant
Filed: April 7, 2020
Date of Patent: May 3, 2022
Assignee: LG ELECTRONICS INC.
Inventors: Taehyun Kim, Ji Chan Maeng
-
Patent number: 11295751
Abstract: An apparatus and a method include receiving an input audio signal to be processed by a multi-band synchronized neural vocoder. The input audio signal is separated into a plurality of frequency bands. A plurality of audio signals corresponding to the plurality of frequency bands is obtained. Each of the audio signals is downsampled, and processed by the multi-band synchronized neural vocoder. An audio output signal is generated.
Type: Grant
Filed: September 20, 2019
Date of Patent: April 5, 2022
Assignee: TENCENT AMERICA LLC
Inventors: Chengzhu Yu, Meng Yu, Heng Lu, Dong Yu
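A numpy/scipy sketch of the band-splitting and downsampling front end (the neural vocoder itself is omitted): the signal is split into equal-width bands with Butterworth filters and each band is naively decimated by the band count. The filter order and the equal-width split are assumptions, not details from the patent.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def split_and_downsample(signal, sr, num_bands=4):
    """Separate a signal into equal-width frequency bands, then downsample each band."""
    band_width = sr / 2 / num_bands
    bands = []
    for b in range(num_bands):
        lo, hi = b * band_width, (b + 1) * band_width
        if b == 0:
            sos = butter(4, hi, btype="low", fs=sr, output="sos")
        elif b == num_bands - 1:
            sos = butter(4, lo, btype="high", fs=sr, output="sos")
        else:
            sos = butter(4, [lo, hi], btype="band", fs=sr, output="sos")
        filtered = sosfilt(sos, signal)
        bands.append(filtered[::num_bands])   # naive decimation by the number of bands
    return bands

audio = np.random.randn(16000)                # one second at 16 kHz
bands = split_and_downsample(audio, sr=16000)
print([band.shape for band in bands])         # each band is num_bands times shorter
```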
-
Patent number: 11182564
Abstract: Embodiments of this application provide a text recommendation method performed at an electronic device. The method includes: extracting feature content from a target text; processing the feature content by using at least two text analysis models to obtain at least two semantic vectors; integrating the at least two semantic vectors into an integrated semantic vector of the target text; and selecting, according to the integrated semantic vector and an integrated semantic vector of at least one to-be-recommended text, a recommended text corresponding to the target text from the at least one to-be-recommended text. Because the integrated semantic vector of the target text is obtained based on the at least two text analysis models, the integrated semantic vector has a stronger representing capability. When text recommendation is subsequently performed, the association degree between the recommended text and the target text can be increased, thereby improving recommendation accuracy.
Type: Grant
Filed: April 14, 2020
Date of Patent: November 23, 2021
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Bingfeng Li, Xin Fan, Xiaoqiang Feng, Biao Li
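A numpy sketch of the integration and selection steps: each model's semantic vector is normalized, the vectors are concatenated into one integrated vector, and the candidate with the highest cosine similarity to the target is recommended. The vector sizes and the concatenation-based integration are assumptions for illustration.

```python
import numpy as np

def integrate(vectors):
    """Integrate semantic vectors from several text-analysis models into one vector."""
    return np.concatenate([v / (np.linalg.norm(v) + 1e-9) for v in vectors])

def recommend(target_vectors, candidate_vectors):
    """Return (similarity, name) of the candidate text closest to the target text."""
    t = integrate(target_vectors)
    scored = []
    for name, vecs in candidate_vectors.items():
        c = integrate(vecs)
        scored.append((float(t @ c / (np.linalg.norm(t) * np.linalg.norm(c))), name))
    return max(scored)

target = [np.random.randn(64), np.random.randn(32)]     # e.g. a topic model and an embedding model
pool = {
    "doc_a": [np.random.randn(64), np.random.randn(32)],
    "doc_b": [np.random.randn(64), np.random.randn(32)],
}
print(recommend(target, pool))
```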
-
Patent number: 11176416
Abstract: A perception device includes: a first neural network that performs a common process associated with perception of an object and thus outputs results of the common process; a second neural network that receives an output of the first neural network and outputs results of a first perception process of perceiving the characteristics of the object with a first accuracy; and a third neural network that receives the output of the first neural network and intermediate data which is generated by the second neural network in the course of the first perception process and outputs results of a second perception process of perceiving the characteristics of the object with a second accuracy which is higher than the first accuracy.
Type: Grant
Filed: April 23, 2018
Date of Patent: November 16, 2021
Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
Inventor: Daisuke Hashimoto
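A PyTorch sketch of the three-network arrangement: a shared backbone, a coarse head, and an accurate head that consumes both the backbone output and the coarse head's intermediate data. All layer sizes and the concatenation are illustrative choices.

```python
import torch
import torch.nn as nn

class PerceptionDevice(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(128, 64), nn.ReLU())       # common process
        self.coarse_head = nn.Sequential(nn.Linear(64, 32), nn.ReLU())     # first perception process
        self.coarse_out = nn.Linear(32, 10)
        # The accurate head receives the backbone output plus the coarse head's intermediate data.
        self.accurate_head = nn.Sequential(nn.Linear(64 + 32, 64), nn.ReLU(), nn.Linear(64, 10))

    def forward(self, x):
        shared = self.backbone(x)
        intermediate = self.coarse_head(shared)
        first = self.coarse_out(intermediate)                               # lower accuracy, cheaper
        second = self.accurate_head(torch.cat([shared, intermediate], -1))  # higher accuracy
        return first, second

first, second = PerceptionDevice()(torch.randn(4, 128))
print(first.shape, second.shape)
```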
-
Patent number: 11137323
Abstract: A method and system for detecting anomalies in waveforms in an industrial plant. During a learning stage, one or more training waveforms are received from sensors monitoring a plurality of equipment in the industrial plant. The one or more training waveforms are used to generate a representative waveform and deviations of the one or more training waveforms from the representative waveform are determined. Based on the deviations, groups are created. A model may be associated with each group for building an expected waveform pattern. When test waveforms are received, based on the electrical and physical properties of the test waveforms, each test waveform is classified into one of the groups. Thereafter, each waveform is compared with the expected waveform pattern associated with the group to which the respective test waveform belongs, to detect the anomaly.
Type: Grant
Filed: November 12, 2018
Date of Patent: October 5, 2021
Assignees: KABUSHIKI KAISHA TOSHIBA, Toshiba Memory Corporation
Inventors: Sai Prem Kumar Ayyagari, Arun Kumar Kalakanti, Topon Paul, Shigeru Maya, Takeichiro Nishikawa
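A numpy sketch of the pipeline: build a representative waveform from the training waveforms, group them by deviation from it, keep a per-group expected pattern, and flag a test waveform whose distance to its group's pattern exceeds a threshold. Grouping by a single deviation quantile and the threshold value are simplifications of my own.

```python
import numpy as np

train = np.random.randn(20, 100)                   # 20 training waveforms, 100 samples each
representative = train.mean(axis=0)                # representative waveform
deviation = np.linalg.norm(train - representative, axis=1)

edges = np.quantile(deviation, [0.5])              # two groups split at the median deviation
groups = np.digitize(deviation, edges)
expected = {int(g): train[groups == g].mean(axis=0) for g in np.unique(groups)}

def is_anomaly(test_waveform, threshold=15.0):
    """Classify the test waveform into a group, then compare against that group's pattern."""
    g = int(np.digitize(np.linalg.norm(test_waveform - representative), edges))
    return float(np.linalg.norm(test_waveform - expected[g])) > threshold

print(is_anomaly(np.random.randn(100)), is_anomaly(3.0 * np.random.randn(100)))
```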
-
Patent number: 11132990
Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
Type: Grant
Filed: June 26, 2019
Date of Patent: September 28, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
-
Patent number: 11120304
Abstract: A mechanism is described for facilitating the transfer of features learned by a context independent pre-trained deep neural network to a context dependent neural network. The mechanism includes extracting a feature learned by a first deep neural network (DNN) model via the framework, wherein the first DNN model is a pre-trained DNN model for computer vision to enable context-independent classification of an object within an input video frame, and training, via the deep learning framework, a second DNN model for computer vision based on the extracted feature, the second DNN model being an update of the first DNN model, wherein training the second DNN model includes training the second DNN model based on a dataset including context-dependent data.
Type: Grant
Filed: July 15, 2020
Date of Patent: September 14, 2021
Assignee: Intel Corporation
Inventor: Raanan Yonatan Yehezkel Rohekar
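A brief PyTorch sketch of the transfer step: freeze the context-independent pre-trained DNN, extract its learned features, and train a second model on context-dependent data on top of them. The frozen-extractor/linear-head setup and the dimensions are assumptions for illustration.

```python
import torch
import torch.nn as nn

# First DNN: pre-trained, context-independent feature extractor (frozen here).
pretrained = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 128))
for p in pretrained.parameters():
    p.requires_grad = False

# Second DNN: trained on context-dependent data, starting from the extracted features.
context_head = nn.Linear(128, 5)
optimizer = torch.optim.SGD(context_head.parameters(), lr=1e-2)

frames = torch.randn(16, 512)              # context-dependent training batch
labels = torch.randint(0, 5, (16,))

features = pretrained(frames)              # features transferred from the first DNN
loss = nn.functional.cross_entropy(context_head(features), labels)
loss.backward()
optimizer.step()
print(float(loss))
```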
-
Patent number: 11120789
Abstract: The invention discloses a training method and a speech recognition method for a mixed frequency acoustic recognition model, which belongs to the technical field of speech recognition.
Type: Grant
Filed: January 26, 2018
Date of Patent: September 14, 2021
Assignee: YUTOU TECHNOLOGY (HANGZHOU) CO., LTD.
Inventor: Lichun Fan
-
Patent number: 11043218
Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
Type: Grant
Filed: June 26, 2019
Date of Patent: June 22, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
-
Patent number: 11030998
Abstract: An acoustic model training method, a speech recognition method, an apparatus, a device and a medium. The acoustic model training method comprises: performing feature extraction on a training speech signal to obtain an audio feature sequence; training the audio feature sequence by a phoneme mixed Gaussian Model-Hidden Markov Model to obtain a phoneme feature sequence; and training the phoneme feature sequence by a Deep Neural Net-Hidden Markov Model-sequence training model to obtain a target acoustic model. The acoustic model training method can effectively save time required for an acoustic model training, improve the training efficiency, and ensure the recognition efficiency.
Type: Grant
Filed: August 31, 2017
Date of Patent: June 8, 2021
Assignee: PING AN TECHNOLOGY (SHENZHEN) CO., LTD.
Inventors: Hao Liang, Jianzong Wang, Ning Cheng, Jing Xiao
-
Patent number: 11017788
Abstract: A method of building a new voice having a new timbre using a timbre vector space includes receiving timbre data filtered using a temporal receptive field. The timbre data is mapped in the timbre vector space. The timbre data is related to a plurality of different voices. Each of the plurality of different voices has respective timbre data in the timbre vector space. The method builds the new timbre using the timbre data of the plurality of different voices using a machine learning system.
Type: Grant
Filed: April 13, 2020
Date of Patent: May 25, 2021
Assignee: Modulate, Inc.
Inventors: William Carter Huffman, Michael Pappas
-
Patent number: 10991384
Abstract: A method for automatic affective state inference from speech signals and an automated affective state inference system are disclosed. In an embodiment the method includes capturing speech signals of a target speaker; extracting one or more acoustic voice parameters from the captured speech signals; calibrating voice markers on the basis of the one or more acoustic voice parameters extracted from the speech signals of the target speaker, one or more speaker-inherent reference parameters of the target speaker, and one or more inter-speaker reference parameters of a sample of reference speakers; applying at least one set of prediction rules based on appraisal criteria to the calibrated voice markers to infer two or more appraisal criteria scores relating to appraisal of affect-eliciting events with which the target speaker is confronted; and assigning one or more affective state terms to the two or more appraisal criteria scores.
Type: Grant
Filed: April 20, 2018
Date of Patent: April 27, 2021
Assignee: audEERING GMBH
Inventors: Florian Eyben, Klaus R. Scherer, Björn W. Schuller
-
Patent number: 10909996
Abstract: An autocorrelation calculation unit 21 calculates an autocorrelation R_O(i) from an input signal. A prediction coefficient calculation unit 23 performs linear prediction analysis by using a modified autocorrelation R'_O(i) obtained by multiplying a coefficient w_O(i) by the autocorrelation R_O(i). It is assumed here, for at least some orders i, that the coefficient w_O(i) corresponding to the order i is in a monotonically increasing relationship with an increase in a value that is negatively correlated with a fundamental frequency of the input signal of the current frame or a past frame.
Type: Grant
Filed: July 16, 2014
Date of Patent: February 2, 2021
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Yutaka Kamamoto, Takehiro Moriya, Noboru Harada
-
Patent number: 10867618
Abstract: Embodiments of the present disclosure provide a speech noise reduction method and a speech noise reduction device based on artificial intelligence, and a computer device. The method includes the following. A first noisy speech to be processed is received. The first noisy speech to be processed is pre-processed, to obtain the first noisy speech in a preset format. The first noisy speech in the preset format is sampled according to a sampling rate indicated by the preset format, to obtain first sampling point information of the first noisy speech. A noise reduction is performed on the first sampling point information through a deep-learning noise reduction model, to generate noise-reduced first sampling point information. A first clean speech is generated according to the noise-reduced first sampling point information.
Type: Grant
Filed: December 28, 2017
Date of Patent: December 15, 2020
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Wei Zou, Xiangang Li, Weiwei Cui, Jingyuan Hu
-
Patent number: 10861466
Abstract: Disclosed are a packet loss concealment method and apparatus using a generative adversarial network. A method for packet loss concealment in voice communication may include training a classification model based on a generative adversarial network (GAN) with respect to a voice signal including a plurality of frames, training a generative model having a contention relation with the classification model based on the GAN, estimating lost packet information based on the trained generative model with respect to the voice signal encoded by a codec, and restoring a lost packet based on the estimated packet information.
Type: Grant
Filed: August 9, 2018
Date of Patent: December 8, 2020
Assignee: INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY
Inventors: Joon-Hyuk Chang, Bong-Ki Lee
-
Patent number: 10825470
Abstract: The present disclosure provides a method and apparatus for detecting a starting point and a finishing point of a speech, a computer device and a storage medium, wherein the method comprises: obtaining speech data to be detected; segmenting the speech data into speech segments, the number of speech segments being greater than one; respectively determining speech states of respective speech segments based on a Voice Activity Detection model obtained by pre-training; and determining a starting point and a finishing point of the speech data according to the speech states. The solution of the present disclosure can be employed to improve the accuracy of the detection results.
Type: Grant
Filed: December 12, 2018
Date of Patent: November 3, 2020
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Chao Li, Weixin Zhu
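A plain-Python sketch of the segment-then-endpoint logic: classify fixed-length segments with a VAD model, then take the first and last speech segments as the starting and finishing points. The energy-threshold stand-in for the trained VAD model and the frame length are placeholders.

```python
def find_speech_endpoints(samples, frame_len, vad_model):
    """Segment the audio, classify each segment, and return (start, end) sample indices."""
    states = [vad_model(samples[i:i + frame_len])
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    speech = [i for i, is_speech in enumerate(states) if is_speech]
    if not speech:
        return None
    return speech[0] * frame_len, (speech[-1] + 1) * frame_len

# Toy stand-in for the Voice Activity Detection model: an energy threshold.
toy_vad = lambda segment: sum(x * x for x in segment) / len(segment) > 0.1

audio = [0.0] * 400 + [0.5] * 800 + [0.0] * 400
print(find_speech_endpoints(audio, frame_len=160, vad_model=toy_vad))   # (320, 1280)
```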
-
Patent number: 10796715
Abstract: Systems and methods use patient speech samples as inputs, use subjective multi-point ratings by speech-language pathologists of multiple perceptual dimensions of patient speech samples as further inputs, and extract laboratory-implemented features from the patient speech samples. A predictive software model learns the relationship between speech acoustics and the subjective ratings of such speech obtained from speech-language pathologists, and is configured to apply this information to evaluate new speech samples. Outputs may include objective evaluation of the plurality of perceptual dimensions for new speech samples and/or evaluation of disease onset, disease progression, or disease treatment efficacy for a condition involving dysarthria as a symptom, utilizing the new speech samples.
Type: Grant
Filed: September 1, 2017
Date of Patent: October 6, 2020
Assignee: ARIZONA BOARD OF REGENTS ON BEHALF OF ARIZONA STATE UNIVERSITY
Inventors: Visar Berisha, Ming Tu, Alan Wisler, Julie Liss
-
Patent number: 10714118
Abstract: In one embodiment, a method includes accessing a voice signal from a first user; compressing the voice signal using a compression portion of an artificial neural network trained to compress the first user's voice; and sending the compressed voice signal to a second client computing device.
Type: Grant
Filed: December 30, 2016
Date of Patent: July 14, 2020
Assignee: Facebook, Inc.
Inventor: Pasha Sadri
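A toy PyTorch autoencoder making the split explicit: only the compression portion (the encoder) runs on the sender's device, and the compact codes are what would be transmitted to the second device. The frame size and code size are invented for illustration.

```python
import torch
import torch.nn as nn

class VoiceAutoencoder(nn.Module):
    def __init__(self, frame_size=256, code_size=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(frame_size, 64), nn.ReLU(), nn.Linear(64, code_size))
        self.decoder = nn.Sequential(nn.Linear(code_size, 64), nn.ReLU(), nn.Linear(64, frame_size))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = VoiceAutoencoder()
voice_frames = torch.randn(10, 256)        # frames of the first user's voice signal

# Sender side: run only the compression portion and send the codes.
codes = model.encoder(voice_frames)
print(tuple(voice_frames.shape), "->", tuple(codes.shape))   # (10, 256) -> (10, 32)
```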
-
Patent number: 10692513
Abstract: The invention provides an audio encoder including a combination of a linear predictive coding filter having a plurality of linear predictive coding coefficients and a time-frequency converter, wherein the combination is configured to filter and to convert a frame of the audio signal into a frequency domain in order to output a spectrum based on the frame and on the linear predictive coding coefficients; a low frequency emphasizer configured to calculate a processed spectrum based on the spectrum, wherein spectral lines of the processed spectrum representing a lower frequency than a reference spectral line are emphasized; and a control device configured to control the calculation of the processed spectrum by the low frequency emphasizer depending on the linear predictive coding coefficients of the linear predictive coding filter.
Type: Grant
Filed: April 18, 2018
Date of Patent: June 23, 2020
Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
Inventors: Stefan Doehla, Bernhard Grill, Christian Helmrich, Nikolaus Rettelbach
-
Patent number: 10657955
Abstract: Described herein are systems and methods to identify and address sources of bias in an end-to-end speech model. In one or more embodiments, the end-to-end model may be a recurrent neural network with two 2D-convolutional input layers, followed by multiple bidirectional recurrent layers and one fully connected layer before a softmax layer. In one or more embodiments, the network is trained end-to-end using the CTC loss function to directly predict sequences of characters from log spectrograms of audio. With optimized recurrent layers and training together with alignment information, some unwanted bias induced by using purely forward only recurrences may be removed in a deployed model.
Type: Grant
Filed: January 30, 2018
Date of Patent: May 19, 2020
Assignee: Baidu USA LLC
Inventors: Eric Battenberg, Rewon Child, Adam Coates, Christopher Fougner, Yashesh Gaur, Jiaji Huang, Heewoo Jun, Ajay Kannan, Markus Kliegl, Atul Kumar, Hairong Liu, Vinay Rao, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu
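A compact PyTorch rendering of the described architecture (two 2D-convolutional input layers, bidirectional recurrent layers, one fully connected layer producing per-frame character log-probabilities for a CTC loss). Channel counts, hidden sizes, and the 29-character vocabulary are assumptions, and the recurrent-layer optimizations and alignment-based training discussed in the patent are not reproduced.

```python
import torch
import torch.nn as nn

class SpeechModel(nn.Module):
    def __init__(self, n_mels=80, vocab_size=29):
        super().__init__()
        self.conv = nn.Sequential(                        # two 2D-convolutional input layers
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.rnn = nn.GRU(32 * n_mels, 256, num_layers=2,
                          bidirectional=True, batch_first=True)   # bidirectional recurrent layers
        self.fc = nn.Linear(2 * 256, vocab_size)          # fully connected layer before the softmax

    def forward(self, log_spec):                          # (batch, time, n_mels) log spectrograms
        x = self.conv(log_spec.unsqueeze(1))              # (batch, 32, time, n_mels)
        x = x.permute(0, 2, 1, 3).flatten(2)              # (batch, time, 32 * n_mels)
        x, _ = self.rnn(x)
        return self.fc(x).log_softmax(dim=-1)             # feed these to nn.CTCLoss during training

out = SpeechModel()(torch.randn(2, 120, 80))
print(out.shape)                                          # torch.Size([2, 120, 29])
```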
-
Patent number: 10490198
Abstract: A sensor device may include a computing device in communication with multiple microphones. A neural network executing on the computing device may receive audio signals from each microphone. One microphone signal may serve as a reference signal. The neural network may extract differences in signal characteristics of the other microphone signals as compared to the reference signal. The neural network may combine these signal differences into a lossy compressed signal. The sensor device may transmit the lossy compressed signal and the lossless reference signal to a remote neural network executing in a cloud computing environment for decompression and sound recognition analysis.
Type: Grant
Filed: December 18, 2017
Date of Patent: November 26, 2019
Assignee: GOOGLE LLC
Inventors: Chanwoo Kim, Rajeev Conrad Nongpiur, Tara Sainath
-
Patent number: 10453074
Abstract: A user may respond to a request of another user by entering text, such as a customer service representative responding to a customer. Suggestions of resources may be provided to the responding user to assist the responding user in providing a response. For example, a resource may provide information to the responding user or allow the responding user to perform an action. Previous messages between the two users and other information may be used to select a resource. A conversation feature vector may be determined from previous messages, and feature vectors may be determined from the resources. The conversation feature vector and the feature vectors determined from the resource may be used to select a resource to suggest to the responding user.
Type: Grant
Filed: September 1, 2016
Date of Patent: October 22, 2019
Assignee: ASAPP, INC.
Inventors: Gustavo Sapoznik, Shawn Henry
-
Patent number: 10381017
Abstract: The present disclosure provides a method and a device for eliminating background sound, and a terminal device. The method includes: obtaining an initial audio data set; performing background sound fusion processing on the initial audio data set to obtain training sample data; performing neural network training based on the training sample data and the initial audio data set to generate an initial neural network model for eliminating background sound; and performing background sound elimination on audio data to be processed based on the initial neural network model for eliminating background sound.
Type: Grant
Filed: July 25, 2018
Date of Patent: August 13, 2019
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Xuewei Zhang, Xiangang Li
-
Patent number: 10346859
Abstract: A method is disclosed for enabling a network location to provide an ordering process for data relevant to connected network devices' activities. The method includes assembling the data, utilizing the activity data, and associating the data, such that information is derived to enable a desired expansion of at least one designated activity. Another method is disclosed for managing object assignment broadcast operations for a network location based on a network device's previous activities. This second method includes tracing a network device's conduct to determine that a network device prefers a particular class of content. The method also includes tagging a network device's profile with the respective observation and deciding by a network location as to the classification factor for a network device to be targeted for an object assignment broadcast.
Type: Grant
Filed: January 2, 2018
Date of Patent: July 9, 2019
Assignee: LIVEPERSON, INC.
Inventors: Haggai Shachar, Shahar Nechmad
-
Patent number: 10269355
Abstract: According to an embodiment, a data processing device generates result data which represents a result of performing predetermined processing on series data. The device includes an upper-level processor and a lower-level processor. The upper-level processor attaches order information to data blocks constituting the series data. The lower-level processor performs lower-level processing on the data blocks having the order information attached thereto, and attaches common order information, which is in common with the data blocks, to values obtained as a result of the lower-level processing. The upper-level processor integrates the values, which have the common order information attached thereto, based on the common order information and performs upper-level processing to generate the result data.
Type: Grant
Filed: March 15, 2016
Date of Patent: April 23, 2019
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventors: Shoko Miyamori, Takashi Masuko, Mitsuyoshi Tachimori, Kouji Ueno, Manabu Nagao
-
Patent number: 10237649
Abstract: Methods and apparatus relating to microphone devices and signal processing techniques are provided. In an example, a microphone device can detect sound, as well as enhance an ability to perceive at least a general direction from which the sound arrives at the microphone device. In an example, a case of the microphone device has an external surface which at least partially defines funnel-shaped surfaces. Each funnel-shaped surface is configured to direct the sound to a respective microphone diaphragm to produce an auralized multi-microphone output. The funnel-shaped surfaces are configured to cause direction-dependent variations in spectral notches and frequency response of the sound as received by the microphone diaphragms. A neural network can device-shape the auralized multi-microphone output to create a binaural output. The binaural output can be auralized with respect to a human listener.
Type: Grant
Filed: December 18, 2017
Date of Patent: March 19, 2019
Assignee: Google LLC
Inventor: Rajeev Conrad Nongpiur
-
Patent number: 10217365
Abstract: The present disclosure relates to a method for determining whether an object is within a target area, a parking management device, a parking management system and an electronic device. The method comprises the following steps: acquiring an intensity value of a wireless signal that the object receives from a signal transmitting apparatus which is provided on the site of the target area; and determining whether the object is within the target area based on the intensity value of the wireless signal. The method for determining whether the object is within the target area can be applied to guide a user to park a vehicle within a parking area.
Type: Grant
Filed: October 27, 2017
Date of Patent: February 26, 2019
Assignee: BEIJING MOBIKE TECHNOLOGY CO., LTD.
Inventors: Chaochao Chen, Zirong Guo, Yujie Yang
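At its simplest, the intensity-based decision reduces to a threshold on the received signal strength; a minimal sketch with an invented threshold value:

```python
def is_within_target_area(rssi_dbm: float, threshold_dbm: float = -65.0) -> bool:
    """Treat a wireless signal stronger than the threshold as 'inside the target area'."""
    return rssi_dbm >= threshold_dbm

print(is_within_target_area(-58.0), is_within_target_area(-80.0))   # True False
```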
-
Patent number: 10194203
Abstract: A multimodal and real-time method for filtering sensitive content, receiving as input a digital video stream, the method including: segmenting the digital video into video fragments along the video timeline; extracting features containing significant information from the digital video input on sensitive media; reducing the semantic difference between each of the low-level video features and the high-level sensitive concept; classifying the video fragments, generating a high-level label (positive or negative) with a confidence score for each fragment representation; performing high-level fusion to properly match the possible high-level labels and confidence scores for each fragment; and predicting the sensitive time by combining the labels of the fragments along the video timeline, indicating the moments when the content becomes sensitive.
Type: Grant
Filed: June 30, 2016
Date of Patent: January 29, 2019
Assignees: SAMSUNG ELETRÔNICA DA AMAZÔNIA LTDA., UNIVERSIDADE ESTADUAL DE CAMPINAS
Inventors: Sandra Avila, Daniel Moreira, Mauricio Perez, Daniel Moraes, Vanessa Testoni, Siome Goldenstein, Eduardo Valle, Anderson Rocha