Neural Network Patents (Class 704/232)
-
Patent number: 11625769
Abstract: A compliance determination and enforcement platform is described. A plurality of factors are stored in association with each of a plurality of accounts. A factor entering module enters factors from each user account into a compliance score model. The compliance score model determines a compliance score for each one of the accounts based on the respective factors associated with the respective account. A comparator compares the compliance score for each account with a compliance reference score to determine a subset of the accounts that fail compliance and a subset of the accounts that meet compliance. A flagging unit flags the user accounts that fail compliance to indicate non-compliant accounts. A corrective action system allows for determining, for each one of the accounts that is flagged as non-compliant, whether the account is bad or good, entering the determination into a feedback system, and closing the account.
Type: Grant
Filed: September 21, 2016
Date of Patent: April 11, 2023
Assignee: Coinbase, Inc.
Inventors: Bradley J. Larson, Linda Xie, Paul Jabaay, Jeffrey B. Kern
-
Patent number: 11625572
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence from a source sequence. In one aspect, the system includes a recurrent neural network configured to, at each time step, receive an input for the time step and process the input to generate a progress score and a set of output scores; and a subsystem configured to, at each time step, generate the recurrent neural network input and provide the input to the recurrent neural network; determine, from the progress score, whether or not to emit a new output at the time step; and, in response to determining to emit a new output, select an output using the output scores and emit the selected output as the output at a next position in the output order.
Type: Grant
Filed: May 3, 2018
Date of Patent: April 11, 2023
Assignee: Google LLC
Inventors: Chung-Cheng Chiu, Navdeep Jaitly, John Dieterich Lawson, George Jay Tucker
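The emit-or-wait decode loop this abstract describes can be sketched as below. This is a minimal illustration, not the patented system: the model here is a stand-in function returning a progress score and output scores, and the threshold value is an assumption.

```python
def decode(inputs, model, threshold=0.5):
    """At each time step, emit a new output only when the progress score crosses the threshold."""
    outputs = []
    for x in inputs:
        progress, scores = model(x)      # one time step of the (mock) recurrent network
        if progress >= threshold:        # subsystem decides whether to emit now
            best = max(scores, key=scores.get)
            outputs.append(best)         # emitted output takes the next position in output order
    return outputs

# Toy stand-in model: signals "emit" whenever the input is positive, preferring label "a".
def toy_model(x):
    progress = 1.0 if x > 0 else 0.0
    return progress, {"a": 0.9, "b": 0.1}

print(decode([-1, 2, 3, -4], toy_model))  # ['a', 'a']
```

The key design point is that emission is decoupled from the time step: the network can consume several inputs before producing one output.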
-
Patent number: 11610579
Abstract: Determining slot value(s) based on received natural language input and based on descriptor(s) for the slot(s). In some implementations, natural language input is received as part of human-to-automated assistant dialog. A natural language input embedding is generated based on token(s) of the natural language input. Further, descriptor embedding(s) are generated (or received), where each of the descriptor embeddings is generated based on descriptor(s) for a corresponding slot that is assigned to a domain indicated by the dialog. The natural language input embedding and the descriptor embedding(s) are applied to layer(s) of a neural network model to determine, for each of the slot(s), which token(s) of the natural language input correspond to the slot. A command is generated that includes slot value(s) for slot(s), where the slot value(s) for one or more of slot(s) are determined based on the token(s) determined to correspond to the slot(s).
Type: Grant
Filed: June 18, 2017
Date of Patent: March 21, 2023
Assignee: GOOGLE LLC
Inventors: Ankur Bapna, Larry Paul Heck
-
Patent number: 11599768
Abstract: A method for recommending an action to a user of a user device includes receiving first user action data corresponding to a first user action and receiving second user action data corresponding to a second user action. The method also includes generating, based on the first user action data and the second user action data and using a feedforward artificial neural network, a recommendation for a next user action. The method also includes causing the recommendation for the next user action to be communicated to the user device.
Type: Grant
Filed: July 18, 2019
Date of Patent: March 7, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Kai Niu, Jiali Huang, Christopher H. Doan, Michael D. Elder
-
Patent number: 11593634
Abstract: This disclosure relates to methods, non-transitory computer readable media, and systems that asynchronously train a machine learning model across client devices that implement local versions of the model while preserving client data privacy. To train the model across devices, in some embodiments, the disclosed systems send global parameters for a global machine learning model from a server device to client devices. A subset of the client devices uses local machine learning models corresponding to the global model and client training data to modify the global parameters. Based on those modifications, the subset of client devices sends modified parameter indicators to the server device for the server device to use in adjusting the global parameters. By utilizing the modified parameter indicators (and not client training data), in certain implementations, the disclosed systems accurately train a machine learning model without exposing training data from the client device.
Type: Grant
Filed: June 19, 2018
Date of Patent: February 28, 2023
Assignee: Adobe Inc.
Inventors: Sunav Choudhary, Saurabh Kumar Mishra, Manoj Ghuhan A, Ankur Garg
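The server-side aggregation step this abstract describes can be sketched as below. This is a minimal federated-averaging-style illustration under assumed details: the "modified parameter indicators" are represented as plain parameter deltas, and the learning rate and function names are illustrative, not from the patent.

```python
def client_update(global_params, local_gradient, lr=0.1):
    """Client computes a delta from its private data; only the delta leaves the device."""
    return [-lr * g for g in local_gradient]

def server_aggregate(global_params, client_deltas):
    """Server averages the clients' deltas and adjusts the global parameters."""
    n = len(client_deltas)
    avg = [sum(d[i] for d in client_deltas) / n for i in range(len(global_params))]
    return [p + a for p, a in zip(global_params, avg)]

params = [1.0, 2.0]
deltas = [client_update(params, [0.5, -0.5]),   # client A's indicator
          client_update(params, [1.5, 0.5])]    # client B's indicator
params = server_aggregate(params, deltas)
print([round(p, 6) for p in params])  # [0.9, 2.0]
```

Note that `server_aggregate` never sees training examples, only deltas, which is the privacy property the abstract emphasizes.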
-
Patent number: 11586964
Abstract: Methods, apparatus, and processor-readable storage media for device component management using deep learning techniques are provided herein. An example computer-implemented method includes obtaining telemetry data from one or more enterprise devices; determining, for each of the one or more enterprise devices, values for multiple device attributes by processing the obtained telemetry data; generating, for each of the one or more enterprise devices, at least one prediction related to lifecycle information of at least one device component by processing the determined attribute values using one or more deep learning techniques; and performing one or more automated actions based at least in part on the at least one generated prediction.
Type: Grant
Filed: January 30, 2020
Date of Patent: February 21, 2023
Assignee: Dell Products L.P.
Inventors: Parminder Singh Sethi, Akanksha Goel, Hung T. Dinh, Sabu K. Syed, James S. Watt, Kannappan Ramu
-
Patent number: 11580978
Abstract: Provided is an in-ear device and associated computational support system that leverages machine learning to interpret sensor data descriptive of one or more in-ear phenomena during subvocalization by the user. An electronic device can receive sensor data generated by at least one sensor at least partially positioned within an ear of a user, wherein the sensor data was generated by the at least one sensor concurrently with the user subvocalizing a subvocalized utterance. The electronic device can then process the sensor data with a machine-learned subvocalization interpretation model to generate an interpretation of the subvocalized utterance as an output of the machine-learned subvocalization interpretation model.
Type: Grant
Filed: November 24, 2020
Date of Patent: February 14, 2023
Assignee: Google LLC
Inventors: Yaroslav Volovich, Ant Oztaskent, Blaise Aguera-Arcas
-
Patent number: 11580957
Abstract: Disclosed are a method for training a speech recognition model, and a method and a system for speech recognition. The disclosure relates to the field of speech recognition and includes: inputting an audio training sample into the acoustic encoder to represent acoustic features of the audio training sample in an encoded way and determine an acoustic encoded state vector; inputting a preset vocabulary into the language predictor to determine a text prediction vector; inputting the text prediction vector into the text mapping layer to obtain a text output probability distribution; calculating a first loss function according to a target text sequence corresponding to the audio training sample and the text output probability distribution; and inputting the text prediction vector and the acoustic encoded state vector into the joint network to calculate a second loss function, and performing iterative optimization according to the first loss function and the second loss function.
Type: Grant
Filed: June 9, 2022
Date of Patent: February 14, 2023
Assignee: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES
Inventors: Jianhua Tao, Zhengkun Tian, Jiangyan Yi
-
Patent number: 11580980
Abstract: A method and apparatus for generating a user intention understanding satisfaction evaluation model, a method and apparatus for evaluating a user intention understanding satisfaction, an electronic device and a storage medium are provided, relating to intelligent voice recognition and knowledge graphs.
Type: Grant
Filed: January 22, 2021
Date of Patent: February 14, 2023
Inventors: Yanyan Li, Jianguo Duan, Hui Xiong
-
Patent number: 11574253
Abstract: A computer implemented method trains distributed sets of machine learning models by training each of the distributed machine learning models on different subsets of a set of training data, performing a first layer model synchronization operation in a first layer for each set of machine learning models, wherein each model synchronization operation in the first layer generates first updates for each of the machine learning models in each respective set, updating the machine learning models based on the first updates, and performing a second layer model synchronization operation in a second layer for first supersets of the machine learning models, wherein each model synchronization in the second layer generates second updates for updating each of the machine learning models in the first supersets based on the second updates such that each machine learning model in a respective first superset is the same.
Type: Grant
Filed: August 1, 2019
Date of Patent: February 7, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Ivo José Garcia dos Santos, Mehdi Aghagolzadeh, Rihui Peng
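The two-layer synchronization this abstract describes can be sketched as below, with simple parameter averaging standing in for the patent's unspecified synchronization operation. Models are plain parameter lists; the group sizes are illustrative.

```python
def average(models):
    """Element-wise mean of a list of parameter vectors."""
    n = len(models)
    return [sum(m[i] for m in models) / n for i in range(len(models[0]))]

def sync_layer(groups):
    """First layer: replace every model in each set with that set's average."""
    return [[average(g)] * len(g) for g in groups]

# Two sets of two models each; each model is a one-parameter vector.
sets = [[[1.0], [3.0]], [[5.0], [7.0]]]
sets = sync_layer(sets)                       # per-set averages: [2.0] and [6.0]
superset = [m for s in sets for m in s]       # second layer acts on the superset
superset = [average(superset)] * len(superset)
print(superset[0])  # [4.0] -- every model in the superset is now the same
```

Averaging within sets first and across supersets second reduces communication: only one synchronized value per set has to cross the wider (slower) layer.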
-
Patent number: 11568235
Abstract: Embodiments for implementing mixed precision learning for neural networks by a processor. A neural network may be replicated into a plurality of replicated instances, and each of the plurality of replicated instances differs in the precision used for representing and determining parameters of the neural network. Data instances may be routed to one or more of the plurality of replicated instances for processing according to a data pre-processing operation.
Type: Grant
Filed: November 19, 2018
Date of Patent: January 31, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Zehra Sura, Parijat Dube, Bishwaranjan Bhattacharjee, Tong Chen
-
Patent number: 11562734
Abstract: The present disclosure relates to an automatic speech recognition system and a method thereof. The system includes a conformer encoder and a pair of ping-pong buffers. The encoder includes a plurality of encoder layers sequentially executed by one or more graphic processing units. At least one encoder layer includes a first feed forward module, a multi-head self-attention module, a convolution module, and a second feed forward module. The convolution module and the multi-head self-attention module are sandwiched between the first feed forward module and the second feed forward module. The four modules respectively include a plurality of encoder sublayers fused into one or more encoder kernels. The one or more encoder kernels respectively read from one of the pair of ping-pong buffers and write into the other of the pair of ping-pong buffers.
Type: Grant
Filed: January 4, 2021
Date of Patent: January 24, 2023
Assignee: KWAI INC.
Inventors: Yongxiong Ren, Yang Liu, Heng Liu, Lingzhi Liu, Jie Li, Kaituo Xu, Xiaorui Wang
-
Patent number: 11557292
Abstract: A system and method performs speech command verification to determine if audio data includes a representation of a speech command. A first neural network may process portions of the audio data before and after a representation of a wake trigger in the audio data. A second neural network may process the audio data using a recurrent neural network to determine if the audio data includes a representation of a wake trigger.
Type: Grant
Filed: December 9, 2020
Date of Patent: January 17, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Joseph Wang, Michael J Rodehorst, Rajath Kumar Mysore Pradeep Kumar
-
Patent number: 11551707
Abstract: Disclosed are a method for speech processing, an information device, and a computer program product. The method for speech processing, as implemented by a computer, includes: obtaining a mixed speech signal via a microphone, wherein the mixed speech signal includes a plurality of speech signals uttered by a plurality of unspecified speakers at the same time; generating a set of simulated speech signals according to the mixed speech signal by using a Generative Adversarial Network (GAN), in order to simulate the plurality of speech signals; and determining the number of the simulated speech signals in order to estimate the number of the speakers in the surroundings and providing the number as an input to an information application.
Type: Grant
Filed: August 27, 2019
Date of Patent: January 10, 2023
Assignee: RELAJET TECH (TAIWAN) CO., LTD.
Inventors: Yun-Shu Hsu, Po-Ju Chen
-
Patent number: 11514091
Abstract: Methods and systems for processing records include extracting feature vectors from words in an unstructured portion of a record. The feature vectors are weighted based on similarity to a topic vector from a structured portion of the record associated with the unstructured portion. The weighted feature vectors are classified using a machine learning model to determine respective probability vectors that assign a probability to each of a set of possible relations for each feature vector. Relations between entities are determined within the record based on the probability vectors. An action is performed responsive to the determined relations.
Type: Grant
Filed: January 7, 2019
Date of Patent: November 29, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Ke Wang, Pei Ni Liu, Wen Sun, Jing Min Xu, Songfang Huang, Yong Qin
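The topic-weighting step this abstract describes can be sketched as below, assuming cosine similarity as the similarity measure (the patent does not name one). Vectors here are plain lists standing in for learned embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors, 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def weight_features(feature_vectors, topic_vector):
    """Scale each word's feature vector by its similarity to the record's topic vector."""
    return [[cosine(v, topic_vector) * x for x in v] for v in feature_vectors]

topic = [1.0, 0.0]            # topic vector from the structured portion
words = [[2.0, 0.0],          # aligned with the topic -> weight 1.0
         [0.0, 2.0]]          # orthogonal to it       -> weight 0.0
print(weight_features(words, topic))  # [[2.0, 0.0], [0.0, 0.0]]
```

Down-weighting off-topic words before classification keeps the relation classifier focused on the vocabulary the structured portion suggests is relevant.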
-
Patent number: 11514318
Abstract: Examples described herein provide a computer-implemented method that includes training, by one or more processing devices, a first neural network for classification based on training data in accordance with a first learning objective, the first neural network producing an intermediate feature function and a final feature function as outputs. The computer-implemented method further includes training, by the one or more processing devices, a second neural network for classification based on the intermediate feature function and the final feature function and further based at least in part on target task samples in accordance with a second learning objective. Training the second neural network includes computing maximal correlation functions of each of the intermediate feature function, the final feature function, and the target task samples.
Type: Grant
Filed: April 8, 2020
Date of Patent: November 29, 2022
Assignees: INTERNATIONAL BUSINESS MACHINES CORPORATION, MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Inventors: Joshua Ka-Wing Lee, Prasanna Sattigeri, Gregory Wornell
-
Patent number: 11508396
Abstract: Systems and methods related to a voice-based system used to determine the severity of emotional distress within an audio recording of an individual are provided. In one non-limiting example, a system comprises a computing device that is configured to receive an audio sample that includes an utterance of a user. Feature extraction is performed on the audio sample to extract a plurality of acoustic emotion features using a base model. Emotion level predictions are generated for an emotion type based at least in part on the acoustic emotion features provided to an emotion specific model. An emotion classification for the audio sample is determined based on the emotion level predictions. The emotion classification comprises the emotion type and a level associated with the emotion type.
Type: Grant
Filed: December 14, 2021
Date of Patent: November 22, 2022
Assignee: TQINTELLIGENCE, INC.
Inventors: Yared Alemu, Desmond Caulley, Ashutosh A. Joshi
-
Patent number: 11507822
Abstract: Systems and methods to generate artificial intelligence models with synthetic data are disclosed. An example system includes a deep neural network (DNN) generator to generate a first DNN model using first real data. The example system includes a synthetic data generator to generate first synthetic data from the first real data, the first synthetic data to be used by the DNN generator to generate a second DNN model. The example system includes an evaluator to evaluate performance of the first and second DNN models to determine whether to generate second synthetic data. The example system includes a synthetic data aggregator to aggregate third synthetic data and fourth synthetic data from a plurality of sites to form a synthetic data set. The example system includes an artificial intelligence model deployment processor to deploy an artificial intelligence model trained and tested using the synthetic data set.
Type: Grant
Filed: October 31, 2018
Date of Patent: November 22, 2022
Assignee: General Electric Company
Inventors: Ravi Soni, Min Zhang, Gopal Avinash
-
Patent number: 11495235
Abstract: According to one embodiment, a system for creating a speaker model includes one or more processors. The processors change a part of the network parameters from an input layer to a predetermined intermediate layer based on a plurality of patterns and input a piece of speech into each of the neural networks so as to obtain a plurality of outputs from the intermediate layer. The part of the network parameters of each of the neural networks is changed based on one of the plurality of patterns. The processors create a speaker model with respect to one or more words detected from the speech based on the outputs.
Type: Grant
Filed: March 8, 2019
Date of Patent: November 8, 2022
Assignee: Kabushiki Kaisha Toshiba
Inventor: Hiroshi Fujimura
-
Patent number: 11462208
Abstract: Some techniques described herein determine a correction model for a dialog system, such that the correction model corrects output from an automatic speech recognition (ASR) subsystem in the dialog system. A method described herein includes accessing training data. A first tuple of the training data includes an utterance, where the utterance is a textual representation of speech. The method further includes using an ASR subsystem of a dialog system to convert the utterance to an output utterance. The method further includes storing the output utterance in corrective training data that is based on the training data. The method further includes training a correction model based on the corrective training data, such that the correction model is configured to correct output from the ASR subsystem during operation of the dialog system.
Type: Grant
Filed: August 13, 2020
Date of Patent: October 4, 2022
Assignee: ORACLE INTERNATIONAL CORPORATION
Inventors: Thanh Long Duong, Mark Edward Johnson
-
Patent number: 11461642
Abstract: An apparatus for processing a signal for input to a neural network, the apparatus configured to: receive a signal comprising a plurality of samples of an analog signal over time; determine at least one frame comprising a group of consecutive samples of the signal, wherein the or each frame includes a first number of samples; for each frame, determine a set of correlation values comprising a second number of correlation values, the second number less than the first number, each correlation value of the set of correlation values based on an autocorrelation of the frame at a plurality of different time lags; and provide an output based on the set of correlation values corresponding to the or each of the frames for a neural network, for one or more of classification of the analog signal by the neural network and training the neural network based on a predetermined classification.
Type: Grant
Filed: September 11, 2019
Date of Patent: October 4, 2022
Assignee: NXP B.V.
Inventors: Jose De Jesus Pineda de Gyvez, Hamed Fatemi, Emad Ayman Taleb Ibrahim
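The front end this abstract describes can be sketched as below: split the samples into frames of a first number of samples, then reduce each frame to a smaller set of autocorrelation values at chosen time lags. The frame length and lag set are illustrative choices, not values from the patent.

```python
def frames(samples, frame_len):
    """Split samples into non-overlapping frames of frame_len consecutive samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def autocorr(frame, lag):
    """Autocorrelation of one frame at a single time lag."""
    return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))

def correlation_features(samples, frame_len=8, lags=(1, 2, 3)):
    # Second number (len(lags) == 3) is less than the first number (frame_len == 8).
    return [[autocorr(f, lag) for lag in lags] for f in frames(samples, frame_len)]

signal = [1.0, -1.0] * 8  # alternating toy signal, 16 samples -> 2 frames
print(correlation_features(signal))  # [[-7.0, 6.0, -5.0], [-7.0, 6.0, -5.0]]
```

The dimensionality reduction (8 samples down to 3 correlation values per frame) is the point: the neural network sees a compact, shift-invariant summary rather than raw samples.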
-
Patent number: 11450310
Abstract: Systems and methods for spoken language understanding are described. Embodiments of the systems and methods receive audio data for a spoken language expression, encode the audio data using a multi-stage encoder comprising a basic encoder and a sequential encoder, wherein the basic encoder is trained to generate character features during a first training phase and the sequential encoder is trained to generate token features during a second training phase, and decode the token features to generate semantic information representing the spoken language expression.
Type: Grant
Filed: August 10, 2020
Date of Patent: September 20, 2022
Assignee: ADOBE INC.
Inventors: Nikita Kapoor, Jaya Dodeja, Nikaash Puri
-
Patent number: 11437050
Abstract: Techniques are described for coding audio signals. For example, using a neural network, a residual signal is generated for a sample of an audio signal based on inputs to the neural network. The residual signal is configured to excite a long-term prediction filter and/or a short-term prediction filter. Using the long-term prediction filter and/or the short-term prediction filter, a sample of a reconstructed audio signal is determined. The sample of the reconstructed audio signal is determined based on the residual signal generated using the neural network for the sample of the audio signal.
Type: Grant
Filed: December 10, 2019
Date of Patent: September 6, 2022
Assignee: QUALCOMM Incorporated
Inventors: Zisis Iason Skordilis, Vivek Rajendran, Guillaume Konrad Sautière, Daniel Jared Sinder
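The reconstruction step this abstract describes can be sketched as below, with a one-tap short-term predictor standing in for the patent's prediction filters. The residual here is computed by hand with an inverse filter; in the patent it would come from the neural network.

```python
def reconstruct(residual, a=0.5):
    """x[n] = a * x[n-1] + e[n]: the residual excites a one-tap short-term predictor."""
    x, prev = [], 0.0
    for e in residual:
        prev = a * prev + e
        x.append(prev)
    return x

def residual_from(signal, a=0.5):
    """Inverse filter: the kind of excitation a network would be trained to emit."""
    prev, out = 0.0, []
    for s in signal:
        out.append(s - a * prev)
        prev = s
    return out

original = [1.0, 0.5, 0.25]
print(reconstruct(residual_from(original)))  # [1.0, 0.5, 0.25]
```

Because the predictor carries most of the sample-to-sample structure, the residual the network must model is much smaller than the signal itself, which is what makes this split attractive for coding.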
-
Patent number: 11410656
Abstract: The system identifies one or more entities or content items among a plurality of stored information. The system generates an audio file based on a first text string that represents the entity or content item. Based on the first text string and at least one speech criterion, the system generates, using a speech-to-text module, a second text string based on the audio file. The system then compares the text strings and stores the second text string if it is not identical to the first text string. The system generates metadata that includes results from text-speech-text conversions to forecast possible misidentifications when responding to voice queries during search operations. The metadata includes alternative representations of the entity.
Type: Grant
Filed: July 31, 2019
Date of Patent: August 9, 2022
Assignee: ROVI GUIDES, INC.
Inventors: Ankur Aher, Indranil Coomar Doss, Aashish Goyal, Aman Puniyani, Kandala Reddy, Mithun Umesh
-
Patent number: 11398220
Abstract: A speech processing method executes at least one of first speech processing and second speech processing. The first speech processing identifies a language based on speech, performs signal processing according to the identified language, and transmits the speech on which the signal processing has been performed to a far-end side. The second speech processing identifies a language based on speech, receives the speech from the far-end side, and performs signal processing on the received speech according to the identified language.
Type: Grant
Filed: August 28, 2019
Date of Patent: July 26, 2022
Assignee: Yamaha Corporation
Inventor: Mikio Muramatsu
-
Patent number: 11397784
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving user-specific content, the user-specific content being associated with a user of one or more computer-implemented services; processing the user-specific content using one or more parsers to identify one or more entities and one or more relationships between entities, a parser being specific to a schema, and the one or more entities and the one or more relationships between entities being identified based on the schema; providing one or more user-specific knowledge graphs, a user-specific knowledge graph being specific to the user and including nodes and edges between nodes to define relationships between entities based on the schema; and storing the one or more user-specific knowledge graphs.
Type: Grant
Filed: August 14, 2019
Date of Patent: July 26, 2022
Assignee: GOOGLE LLC
Inventors: Pranav Khaitan, Shobha Diwakar
-
Patent number: 11393457
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using complex linear projection are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to an utterance. The method further includes generating frequency domain data using the audio data. The method further includes processing the frequency domain data using complex linear projection. The method further includes providing the processed frequency domain data to a neural network trained as an acoustic model. The method further includes generating a transcription for the utterance that is determined based at least on output that the neural network provides in response to receiving the processed frequency domain data.
Type: Grant
Filed: May 20, 2020
Date of Patent: July 19, 2022
Assignee: Google LLC
Inventors: Samuel Bengio, Mirko Visontai, Christopher Walter George Thornton, Tara N. Sainath, Ehsan Variani, Izhak Shafran, Michiel A. u. Bacchiani
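The pipeline this abstract describes (audio, then frequency-domain data, then complex linear projection, then acoustic-model features) can be sketched as below. This is an assumed reading: the projection is shown as a complex-valued matrix multiply followed by a log-magnitude, with a random matrix standing in for the learned layer the patent would use.

```python
import cmath
import math
import random

def dft(frame):
    """Naive discrete Fourier transform: audio samples -> frequency-domain data."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def complex_linear_projection(spectrum, weights):
    """One real feature per projection row: log-magnitude of a complex dot product."""
    return [math.log(abs(sum(w * s for w, s in zip(row, spectrum))) + 1e-9)
            for row in weights]

random.seed(0)
frame = [math.sin(2 * math.pi * 3 * t / 16) for t in range(16)]  # toy audio frame
spectrum = dft(frame)
weights = [[complex(random.gauss(0, 1), random.gauss(0, 1))      # stand-in for a learned layer
            for _ in range(16)] for _ in range(4)]
features = complex_linear_projection(spectrum, weights)
print(len(features))  # 4 real-valued features for the acoustic model
```

The appeal of operating on the complex spectrum directly is that the projection can learn filter-bank-like behavior instead of relying on fixed mel filters.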
-
Patent number: 11380315
Abstract: One embodiment of the present invention sets forth a technique for analyzing a transcription of a recording. The technique includes generating features representing transcriptions produced by multiple automatic speech recognition (ASR) engines from voice activity in the recording and a best transcription of the recording produced by an ensemble model from the transcriptions. The technique also includes applying a machine learning model to the features to produce a score representing an accuracy of the best transcription. The technique further includes storing the score in association with the best transcription.
Type: Grant
Filed: March 9, 2019
Date of Patent: July 5, 2022
Assignee: CISCO TECHNOLOGY, INC.
Inventors: Ahmad Abdulkader, Mohamed Gamal Mohamed Mahmoud
-
Patent number: 11380312
Abstract: A system configured to improve wakeword detection. The system may selectively rectify (e.g., attenuate) a portion of an audio signal based on energy statistics corresponding to a keyword (e.g., wakeword). For example, a device may perform echo cancellation to generate isolated audio data, may use the energy statistics to calculate signal quality metric values for a plurality of frequency bands of the isolated audio data, and may select a fixed number of frequency bands (e.g., 5-10%) associated with the lowest signal quality metric values. To detect a specific keyword, the system determines a threshold ?(f) corresponding to an expected energy value at each frequency band. During runtime, the device determines signal quality metric values by subtracting residual music from the expected energy values. Thus, the device attenuates only a portion of the total number of frequency bands that include more energy than expected based on the energy statistics of the wakeword.
Type: Grant
Filed: June 20, 2019
Date of Patent: July 5, 2022
Assignee: Amazon Technologies, Inc.
Inventor: Mohamed Mansour
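The band-selection step this abstract describes can be sketched as below: compute a signal quality metric per frequency band (expected wakeword energy minus observed energy), pick the fixed number of bands with the lowest values, and attenuate only those. The band labels, energies, and attenuation gain are toy values, not the patent's statistics.

```python
def bands_to_attenuate(observed, expected, count):
    """Select the `count` bands with the lowest signal quality metric."""
    # Lower metric = more residual energy than the wakeword statistics predict.
    metric = {band: expected[band] - e for band, e in observed.items()}
    return sorted(metric, key=metric.get)[:count]

def rectify(observed, bad_bands, gain=0.5):
    """Attenuate only the selected bands; leave the rest untouched."""
    return {band: e * gain if band in bad_bands else e
            for band, e in observed.items()}

expected = {"low": 4.0, "mid": 3.0, "high": 1.0}   # wakeword energy statistics
observed = {"low": 4.0, "mid": 9.0, "high": 1.5}   # music residue inflates "mid"
bad = bands_to_attenuate(observed, expected, count=1)
print(bad, rectify(observed, bad)["mid"])  # ['mid'] 4.5
```

Attenuating only the worst bands, rather than the whole signal, preserves the wakeword energy in the bands where detection is still reliable.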
-
Patent number: 11380348
Abstract: A method for correcting infant crying identification includes the following steps: a detecting step provides an audio unit to detect a sound around an infant to generate a plurality of audio samples. A converting step provides a processing unit to convert the audio samples to generate a plurality of audio spectrograms. An extracting step provides a common model to extract the audio spectrograms to generate a plurality of infant crying features. An incremental training step provides an incremental model to train the infant crying features to generate an identification result. A judging step provides the processing unit to judge whether the identification result is correct according to a real result of the infant. When the identification result is different from the real result, an incorrect result is generated. A correcting step provides the processing unit to correct the incremental model according to the incorrect result.
Type: Grant
Filed: August 27, 2020
Date of Patent: July 5, 2022
Assignee: NATIONAL YUNLIN UNIVERSITY OF SCIENCE AND TECHNOLOGY
Inventors: Chuan-Yu Chang, Jun-Ying Li
-
Patent number: 11373653
Abstract: A method and system for detecting speech using close sensor applications, according to some embodiments. In some embodiments, a close microphone is applied to detect sounds with higher muscle or bone transmission components. In some embodiments, a close camera is applied that collects visual information and motion that is correlated with the potential phonemes for such positions and motion. In some embodiments, myography is performed to detect muscle movement. In an earbud form factor embodiment, processing of different channels of close information is performed to improve the accuracy of the recognition.
Type: Grant
Filed: January 17, 2020
Date of Patent: June 28, 2022
Inventor: Joseph Alan Epstein
-
Patent number: 11361768
Abstract: A method includes receiving a spoken utterance that includes a plurality of words, and generating, using a neural network-based utterance classifier comprising a stack of multiple Long-Short Term Memory (LSTM) layers, a respective textual representation for each word of the plurality of words of the spoken utterance. The neural network-based utterance classifier is trained on negative training examples of spoken utterances not directed toward an automated assistant server. The method further includes determining, using the respective textual representation generated for each word of the plurality of words of the spoken utterance, that the spoken utterance is one of directed toward the automated assistant server or not directed toward the automated assistant server, and, when the spoken utterance is directed toward the automated assistant server, generating instructions that cause the automated assistant server to generate a response to the spoken utterance.
Type: Grant
Filed: July 21, 2020
Date of Patent: June 14, 2022
Assignee: Google LLC
Inventors: Nathan David Howard, Gabor Simko, Maria Carolina Parada San Martin, Ramkarthik Kalyanasundaram, Guru Prakash Arumugam, Srinivas Vasudevan
-
Patent number: 11354459
Abstract: A synthetic world interface may be used to model digital environments, sensors, and motions for the evaluation, development, and improvement of computer vision and speech algorithms. A synthetic data cloud service with a library of sensor primitives, motion generators, and environments with procedural and game-like capabilities facilitates engineering design for a manufacturable solution that has computer vision and speech capabilities. In some embodiments, a sensor platform simulator operates with a motion orchestrator, an environment orchestrator, an experiment generator, and an experiment runner to test various candidate hardware configurations and computer vision and speech algorithms in a virtual environment, advantageously speeding development and reducing cost. Thus, examples disclosed herein may relate to virtual reality (VR) or mixed reality (MR) implementations.
Type: Grant
Filed: September 21, 2018
Date of Patent: June 7, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Michael Ebstyne, Pedro Urbina Escos, Yuri Pekelny, Jonathan Chi Hang Chan, Emanuel Shalev, Alex Kipman, Mark Flick
-
Patent number: 11343632
Abstract: The invention relates to a method for broadcasting a spatialized audio stream to terminals of spectators attending a sports event. The method comprises the acquisition of a plurality of audio streams constituting a soundscape. The soundscape is analyzed by a server for the sound spatialization of the audio streams and their playback on the terminals, depending both on the localization of the audio streams and on the position of the spectators.
Type: Grant
Filed: September 29, 2020
Date of Patent: May 24, 2022
Assignee: INSTITUT MINES TELECOM
Inventors: Raphael Blouet, Slim Essid
-
Patent number: 11341242Abstract: Disclosed is a computer implemented method for malware detection that analyses a file on a per-packet basis. The method receives a packet of one or more packets associated with a file, converts binary content associated with the packet into a digital representation, and tokenizes plain text content associated with the packet. The method extracts one or more n-gram features, an entropy feature, and a domain feature from the converted content of the packet and applies a trained machine learning model to the one or more features extracted from the packet. The output of the machine learning model is a probability of maliciousness associated with the received packet. If the probability of maliciousness is above a threshold value, the method determines that the file associated with the received packet is malicious.Type: GrantFiled: October 12, 2020Date of Patent: May 24, 2022Assignee: Zscaler, Inc.Inventors: Huihsin Tseng, Hao Xu, Jian L. Zhen
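The abstract names n-gram and entropy features but gives no formulas; purely as an illustration (function names and feature choices are assumptions, not the patented method), byte-entropy and byte n-gram features of a packet payload might be computed along these lines:

```python
import math
from collections import Counter

def entropy(data: bytes) -> float:
    """Shannon entropy of a byte sequence, in bits per byte."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Counts of overlapping byte n-grams in the packet payload."""
    return Counter(tuple(data[i:i + n]) for i in range(len(data) - n + 1))

def packet_features(payload: bytes) -> dict:
    """Two of the feature families named in the abstract, as one dict."""
    return {"entropy": entropy(payload), "bigrams": byte_ngrams(payload, 2)}

features = packet_features(b"\x00\x00\xff\xff")
```

High entropy is a common indicator of packed or encrypted content, which is why an entropy feature is a plausible input to a maliciousness classifier.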
-
Patent number: 11315550Abstract: A speaker recognition device according to the present disclosure includes: an acoustic feature calculator that calculates, from utterance data indicating a voice of an obtained utterance, an acoustic feature of the voice of the utterance; a statistic calculator that calculates an utterance data statistic from the calculated acoustic feature; a speaker feature extractor that extracts a speaker feature of the speaker of the utterance data from the calculated utterance data statistic using a deep neural network (DNN); a similarity calculator that calculates a similarity between the extracted speaker feature and a pre-stored speaker feature of at least one registered speaker; and a speaker recognizer that recognizes the speaker of the utterance data based on the calculated similarity.Type: GrantFiled: November 13, 2019Date of Patent: April 26, 2022Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICAInventors: Kousuke Itakura, Ko Mizuno, Misaki Doi
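The abstract does not state which similarity measure is used; a common choice for DNN speaker embeddings is cosine similarity, so a minimal sketch of the similarity and recognition steps (names and threshold are hypothetical) might look like:

```python
import math

def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two speaker-feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def recognize(speaker_feature, registered, threshold=0.7):
    """Return the registered speaker whose stored feature is most similar
    to the extracted feature, provided the similarity exceeds threshold."""
    best_name, best_sim = None, threshold
    for name, feature in registered.items():
        sim = cosine_similarity(speaker_feature, feature)
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name
```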
-
Patent number: 11302303Abstract: A method and device for training an acoustic model are provided. The method comprises determining a plurality of tasks for training an acoustic model, obtaining resource occupancies of nodes participating in the training of the acoustic model, and distributing the tasks to the nodes according to the resource occupancies of the nodes and complexities of the tasks. By using computational resources distributed at multiple nodes, tasks for training an acoustic model are performed in parallel in a distributed manner, so as to improve training efficiency.Type: GrantFiled: September 13, 2019Date of Patent: April 12, 2022Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.Inventors: Yunfeng Li, Qingchang Hao, Yutao Gai, Chenxi Sun, Zhiping Zhou
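The abstract describes distributing training tasks to nodes according to resource occupancies and task complexities but not the scheduling policy; one simple policy consistent with that description is a greedy least-loaded assignment (a sketch under that assumption, not the patented scheduler):

```python
def assign_tasks(tasks: dict, nodes: dict) -> dict:
    """Greedily assign each task to the node with the lowest projected load.
    tasks: task name -> complexity; nodes: node name -> current resource occupancy."""
    load = dict(nodes)
    assignment = {}
    # Place the most complex tasks first so they land on the least-loaded nodes.
    for task, complexity in sorted(tasks.items(), key=lambda kv: -kv[1]):
        node = min(load, key=load.get)
        assignment[task] = node
        load[node] += complexity
    return assignment
```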
-
Patent number: 11295203Abstract: Neuron placement in a neuromorphic system to minimize cumulative delivery delay is provided. In some embodiments, a neural network description describing a plurality of neurons is read. A relative delivery delay associated with each of the plurality of neurons is determined. An ordering of the plurality of neurons is determined to optimize cumulative delivery delay over the plurality of neurons. An optimized neural network description based on the ordering of the plurality of neurons is written.Type: GrantFiled: July 27, 2016Date of Patent: April 5, 2022Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Rodrigo Alvarez-Icaza, Pallab Datta, Jeffrey A. Kusnitz
-
Patent number: 11289092Abstract: A method, system and computer program product for editing a text using speech recognition includes receiving, by a computer, a first voice input from a user comprising a first target word. The computer identifies instances of the first target word within the text and assigns a first numerical indicator to each instance of the first target word within the text. A selection is received from the user including the first numerical indicator corresponding to a starting point of a selection area. The computer receives a second voice input from the user including a second target word, identifies instances of the second target word within the text, assigns a second numerical indicator to each instance of the second target word, and receives a selection from the user including the second numerical indicator corresponding to an ending point of the selection area.Type: GrantFiled: September 25, 2019Date of Patent: March 29, 2022Assignee: International Business Machines CorporationInventors: JunXing Yang, XueJun Zhong, Wei Sun, ZhiXia Wang
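The numbering-and-selection mechanism in this abstract is concrete enough to sketch: number every instance of a spoken target word, then cut the selection from the chosen start instance to the chosen end instance (function names are illustrative, not from the patent):

```python
import re

def number_instances(text: str, target: str) -> dict:
    """Map numerical indicators (1, 2, ...) to the (start, end) span of each
    whole-word instance of target in the text."""
    return {i + 1: m.span()
            for i, m in enumerate(re.finditer(r"\b%s\b" % re.escape(target), text))}

def select_area(text, first_word, first_n, second_word, second_n) -> str:
    """Selection from the first_n-th instance of first_word (starting point)
    through the second_n-th instance of second_word (ending point)."""
    start = number_instances(text, first_word)[first_n][0]
    end = number_instances(text, second_word)[second_n][1]
    return text[start:end]
```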
-
Patent number: 11282535Abstract: Disclosed is an electronic apparatus. The electronic apparatus includes a storage for storing a plurality of filters trained in a plurality of convolutional neural networks (CNNs) respectively and a processor configured to acquire a first spectrogram corresponding to a damaged audio signal, input the first spectrogram to a CNN corresponding to each frequency band to apply the plurality of filters trained in the plurality of CNNs respectively, acquire a second spectrogram by merging output values of the CNNs to which the plurality of filters are applied, and acquire an audio signal reconstructed based on the second spectrogram.Type: GrantFiled: July 19, 2018Date of Patent: March 22, 2022Assignee: SAMSUNG ELECTRONICS CO., LTD.Inventors: Ki-Hyun Choo, Anton Porov, Jong-Hoon Jeong, Ho-Sang Sung, Eun-Mi Oh, Jong-Youb Ryu
-
Patent number: 11263516Abstract: Methods and systems for training a neural network include identifying weights in a neural network between a final hidden neuron layer and an output neuron layer that correspond to state matches between a neuron of the final hidden neuron layer and a respective neuron of the output neuron layer. The identified weights are initialized to a predetermined non-zero value, while other weights between the final hidden neuron layer and the output neuron layer are initialized to zero. The neural network is trained based on a training corpus after initialization.Type: GrantFiled: August 2, 2016Date of Patent: March 1, 2022Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventor: Gakuto Kurata
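The described initialization can be sketched directly: weights on matched (hidden, output) pairs start at a fixed non-zero value and everything else starts at zero (the function name and default value are assumptions):

```python
def init_output_weights(n_hidden, n_output, state_matches, value=0.1):
    """Weight matrix between the final hidden layer and the output layer:
    entries for (hidden, output) pairs in state_matches are set to a
    predetermined non-zero value; all other entries are zero."""
    matches = set(state_matches)
    return [[value if (h, o) in matches else 0.0 for o in range(n_output)]
            for h in range(n_hidden)]
```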
-
Patent number: 11257483Abstract: Spoken language understanding techniques include training a dynamic neural network mask relative to a static neural network using only post-deployment training data such that the mask zeroes out some of the weights of the static neural network and allows some other weights to pass through and applying a dynamic neural network corresponding to the masked static neural network to input queries to identify outputs for the queries.Type: GrantFiled: March 29, 2019Date of Patent: February 22, 2022Assignee: Intel CorporationInventors: Krzysztof Czarnowski, Munir Georges
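The masking step this abstract describes amounts to an elementwise product: zero mask entries knock out static-network weights, one entries pass them through (a minimal sketch, not the trained-mask procedure itself):

```python
def apply_mask(weights, mask):
    """Elementwise product of a static weight matrix with a binary mask:
    mask entries of 0 zero out weights, entries of 1 let them pass through."""
    return [[w * m for w, m in zip(wrow, mrow)]
            for wrow, mrow in zip(weights, mask)]
```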
-
Patent number: 11256973Abstract: A neural network embodiment comprises an input layer, an output layer and a filter layer. Each unit of the filter layer receives a filter layer input from a single preceding unit via a respective filter layer input connection. Each filter layer input connection is coupled to a different single preceding unit. The filter layer incentivizes the neural network to learn to produce a target output from the output layer for a given input to the input layer while simultaneously learning weights for each filter layer input connection. The weights learned cause the filter layer to reduce a number of filter layer units that pass respective filter layer inputs as non-zero values. When applied as an initial internal layer between an input layer and an output layer, the filter layer incentivizes the neural network to learn which neural network input features to discard to produce the target output.Type: GrantFiled: February 5, 2018Date of Patent: February 22, 2022Assignee: Nuance Communications, Inc.Inventors: Nasr Madi, Neil D. Barrett
-
Patent number: 11252152Abstract: An online system authenticates a user through a voiceprint biometric verification process. When a user needs to be authenticated, the online system generates and provides a random phrase to the user. The online system receives an audio recording of the randomly generated phrase and retrieves a previously trained voiceprint model for the user. The online system analyzes the audio recording by applying the voiceprint model to determine whether the audio recording satisfies a first criterion of whether the voice in the audio recording belongs to the user and a second criterion of whether the audio recording includes a vocalization of the randomly generated phrase. If the audio recording satisfies both criteria, the online system authenticates the user. Therefore, the user can be provided access to a new communication session in response to being authenticated.Type: GrantFiled: June 3, 2020Date of Patent: February 15, 2022Assignee: salesforce.com, inc.Inventor: Eugene Lew
-
Patent number: 11244668Abstract: A method for generating speech animation from an audio signal includes: receiving the audio signal; transforming the received audio signal into frequency-domain audio features; performing neural-network processing on the frequency-domain audio features to recognize phonemes, wherein the neural-network processing is performed using a neural network trained with a phoneme dataset comprising audio signals with corresponding ground-truth phoneme labels; and generating the speech animation from the recognized phonemes.Type: GrantFiled: May 29, 2020Date of Patent: February 8, 2022Assignee: TCL RESEARCH AMERICA INC.Inventors: Zixiao Yu, Haohong Wang
-
Patent number: 11244671Abstract: A model training method and apparatus is disclosed, where the model training method acquires first output data of a student model for first input data and second output data of a teacher model for second input data and trains the student model such that the first output data and the second output data are not distinguished from each other. The student model and the teacher model have different structures.Type: GrantFiled: August 23, 2019Date of Patent: February 8, 2022Assignee: Samsung Electronics Co., Ltd.Inventors: Hogyeong Kim, Hyohyeong Kang, Hwidong Na, Hoshik Lee
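The abstract does not give the training objective used to make the student's and teacher's outputs indistinguishable; one common way to express that goal is a matching loss that training drives toward zero, e.g. mean squared error (a sketch of the idea, not the patented method):

```python
def distillation_loss(student_out, teacher_out) -> float:
    """Mean squared difference between student and teacher outputs;
    minimizing it makes the two outputs hard to distinguish."""
    return sum((s - t) ** 2 for s, t in zip(student_out, teacher_out)) / len(student_out)
```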
-
Patent number: 11232141Abstract: A method for processing an electronic document comprising text is disclosed. The method comprises: splitting the text into at least one sentence, and for each said sentence: associating each word of the sentence with a word-vector; representing the sentence by a sentence-vector, wherein obtaining the sentence-vector comprises computing a weighted average of all word-vectors associated with the sentence; if it is determined that the sentence-vector is associated with a tag in a data set of sentence-vectors associated with tags, obtaining the tag from the data set; otherwise, obtaining a tag for the sentence-vector using a classification algorithm; and processing the sentence if the tag obtained for the sentence is associated with a predetermined label.Type: GrantFiled: September 13, 2018Date of Patent: January 25, 2022Inventors: Youness Mansar, Sira Ferradans, Jacopo Staiano
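The weighted-average step is simple enough to sketch directly; the weighting scheme itself is unspecified in the abstract, so uniform weights are used as a stand-in default:

```python
def sentence_vector(word_vectors, weights=None):
    """Weighted average of a sentence's word-vectors, one component at a time;
    uniform weights are assumed when none are given."""
    n = len(word_vectors)
    if weights is None:
        weights = [1.0 / n] * n
    total = sum(weights)
    dim = len(word_vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, word_vectors)) / total
            for d in range(dim)]
```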
-
Patent number: 11227611Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user, evaluating the speech data or a transcription of the candidate hotword using one or more predetermined criteria, generating a hotword suitability score for the candidate hotword based on the evaluation, and providing a representation of the hotword suitability score for display to the user.Type: GrantFiled: June 3, 2020Date of Patent: January 18, 2022Assignee: Google LLCInventors: Andrew E. Rubin, Johan Schalkwyk, Maria Carolin Parada San Martin
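The predetermined criteria are not enumerated in the abstract; as a purely hypothetical illustration of scoring a candidate hotword against a set of criteria, each criterion here is a callable returning a score in [0, 1], and the two example criteria (length and word count) are invented for the sketch:

```python
def hotword_suitability(candidate: str, criteria) -> float:
    """Average of per-criterion scores in [0, 1] for a candidate hotword."""
    scores = [criterion(candidate) for criterion in criteria]
    return sum(scores) / len(scores)

# Illustrative criteria only: longer and multi-word hotwords tend to be
# easier to detect reliably.
length_ok = lambda w: min(len(w) / 12.0, 1.0)
multiword = lambda w: 1.0 if " " in w else 0.0
score = hotword_suitability("ok computer", [length_ok, multiword])
```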
-
Patent number: 11227626Abstract: An audio response system can generate multimodal messages that can be dynamically updated on a viewer's client device based on the type of audio response detected. The audio responses can include keywords or continuum-based signals (e.g., levels of wind noise). A machine learning scheme can be trained to output classification data from the audio response data for content selection and dynamic display updates.Type: GrantFiled: May 21, 2019Date of Patent: January 18, 2022Assignee: Snap Inc.Inventors: Gurunandan Krishnan Gorumkonda, Shree K. Nayar
-
Patent number: 11222641Abstract: A speaker recognition device includes: a feature calculator that calculates two or more acoustic features of a voice of an utterance obtained; a similarity calculator that calculates two or more similarities, each being a similarity between one of one or more speaker-specific features of a target speaker for recognition and one of the two or more acoustic features; a combination unit that combines the two or more similarities to obtain a combined value; and a determiner that determines whether a speaker of the utterance is the target speaker based on the combined value. Here, (i) at least two of the two or more acoustic features have different properties, (ii) at least two of the two or more similarities have different properties, or (iii) at least two of the two or more acoustic features have different properties and at least two of the two or more similarities have different properties.Type: GrantFiled: September 19, 2019Date of Patent: January 11, 2022Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICAInventor: Kousuke Itakura
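The abstract specifies combining two or more similarities into a combined value and thresholding it, but not the combination rule; a weighted average is one natural choice (names, weights, and threshold are assumptions for the sketch):

```python
def combined_score(similarities, weights) -> float:
    """Weighted combination of similarity scores that may have different properties."""
    return sum(w * s for w, s in zip(weights, similarities)) / sum(weights)

def is_target_speaker(similarities, weights, threshold=0.5) -> bool:
    """Determine whether the utterance's speaker is the target speaker
    based on the combined value."""
    return combined_score(similarities, weights) >= threshold
```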