Patents Examined by Bryan S Blankenagel
-
Patent number: 11494643
Abstract: A noise data artificial intelligence learning method for identifying the source of problematic noise may include a noise data pre-conditioning method including: selecting a unit frame for the problematic noise among noises sampled over time; dividing the unit frame into N segments; analyzing the frequency characteristics of each of the N segments and extracting a frequency component of each segment by applying a Log Mel Filter; and outputting a feature parameter as one representative frame by averaging information on the N segments. Artificial intelligence learning on the feature parameters extracted over time by the noise data pre-conditioning method applies a Bidirectional RNN.
Type: Grant
Filed: November 18, 2019
Date of Patent: November 8, 2022
Assignees: Hyundai Motor Company, Kia Motors Corporation, IUCF-HYU (Industry-University Cooperation Foundation Hanyang University)
Inventors: Dong-Chul Lee, In-Soo Jung, Joon-Hyuk Chang, Kyoung-Jin Noh
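The pre-conditioning steps this abstract describes (segment the unit frame, take log-mel features per segment, average into one representative frame) can be sketched in NumPy. The segment count, sample rate, FFT size, and mel settings below are illustrative choices, not values from the patent:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Standard triangular filters, evenly spaced on the mel scale.
    pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for j in range(l, c):
            fb[i - 1, j] = (j - l) / max(c - l, 1)
        for j in range(c, r):
            fb[i - 1, j] = (r - j) / max(r - c, 1)
    return fb

def representative_frame(unit_frame, n_segments=4, sr=16000, n_fft=512, n_mels=40):
    """Split a unit frame into N segments, compute log-mel energies per
    segment, and average them into one representative feature frame."""
    mel_fb = mel_filterbank(sr, n_fft, n_mels)
    feats = []
    for seg in np.array_split(unit_frame, n_segments):
        spec = np.abs(np.fft.rfft(seg, n=n_fft)) ** 2      # power spectrum
        feats.append(np.log(mel_fb @ spec + 1e-10))         # log-mel energies
    return np.mean(feats, axis=0)                           # representative frame
```

A sequence of such representative frames would then be fed to a bidirectional RNN for learning, per the abstract.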
-
Patent number: 11482244
Abstract: A method includes receiving an overlapped audio signal that includes audio spoken by a speaker that overlaps a segment of synthesized playback audio. The method also includes encoding a sequence of characters that correspond to the synthesized playback audio into a text embedding representation. For each character in the sequence of characters, the method also includes generating a respective cancelation probability using the text embedding representation. The cancelation probability indicates a likelihood that the corresponding character is associated with the segment of the synthesized playback audio overlapped by the audio spoken by the speaker in the overlapped audio signal.
Type: Grant
Filed: March 11, 2021
Date of Patent: October 25, 2022
Assignee: Google LLC
Inventor: Quan Wang
-
Patent number: 11481563
Abstract: The present disclosure describes systems, non-transitory computer-readable media, and methods that can generate contextual identifiers indicating context for frames of a video and utilize those contextual identifiers to generate translations of text corresponding to such video frames. By analyzing a digital video file, the disclosed systems can identify video frames corresponding to a scene and a term sequence corresponding to a subset of the video frames. Based on image features of the video frames corresponding to the scene, the disclosed systems can utilize a contextual neural network to generate a contextual identifier (e.g., a contextual tag) indicating context for the video frames. Based on the contextual identifier, the disclosed systems can subsequently apply a translation neural network to generate a translation of the term sequence from a source language to a target language. In some cases, the translation neural network also generates affinity scores for the translation.
Type: Grant
Filed: November 8, 2019
Date of Patent: October 25, 2022
Assignee: Adobe Inc.
Inventors: Mahika Wason, Amol Jindal, Ajay Bedi
-
Patent number: 11462231
Abstract: A system configured to perform low input-output latency noise reduction in a frequency domain is provided. The real-time noise reduction algorithm performs frame-by-frame processing of a single-channel noisy acoustic signal to estimate a gain function. Accurate noise power estimates are achieved with the help of a minimum-statistics approach followed by a voice activity detector. The noise power and gain values are smoothed to remove any external artifacts and avoid background noise modulations. The gain values for individual frequency bands are weighted and smoothed to reduce distortion. To obtain distortionless output speech, the system performs curve fitting by separating the frequency bands into multiple groups and applying a Savitzky-Golay filter to each group. The final gain values generated by these filters are multiplied with the noisy speech signal to obtain a clean speech signal.
Type: Grant
Filed: November 18, 2020
Date of Patent: October 4, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Nikhil Shankar, Berkant Tacer
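The group-wise curve-fitting step in this abstract can be sketched with SciPy's `savgol_filter`. The group count, window length, and polynomial order below are illustrative, not values from the patent:

```python
import numpy as np
from scipy.signal import savgol_filter

def smooth_gains(gains, n_groups=4, window=9, polyorder=2):
    """Smooth per-band gain values by separating the frequency bands into
    groups and applying a Savitzky-Golay filter to each group."""
    smoothed = []
    for group in np.array_split(gains, n_groups):
        # Window length must be odd and no longer than the group.
        w = min(window, len(group) if len(group) % 2 else len(group) - 1)
        smoothed.append(savgol_filter(group, w, min(polyorder, w - 1)))
    return np.concatenate(smoothed)
```

The smoothed gains would then multiply the noisy spectrum band-by-band to produce the clean speech estimate.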
-
Patent number: 11450332
Abstract: A method enables conversion to a voice having a desired attribution. On the basis of parallel data of a sound feature vector series in a conversion-source voice signal and a latent vector series in the conversion-source voice signal, together with an attribution label indicating the attribution of the conversion-source voice signal, the method learns an encoder that estimates a latent vector series from input of a sound feature vector series and an attribution label, and a decoder that reconfigures the sound feature vector series from input of the latent vector series and the attribution label.
Type: Grant
Filed: February 20, 2019
Date of Patent: September 20, 2022
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Hirokazu Kameoka, Takuhiro Kaneko, Ko Tanaka, Nobukatsu Hojo
-
Patent number: 11416689
Abstract: The invention relates to a natural language processing system configured for receiving an input sequence ci of input words (v1, v2, . . . vN) representing a first sequence of words in a natural language of a first text and generating an output sequence of output words representing a second sequence of words in a natural language of a second text, modeled by a multinomial topic model, wherein the multinomial topic model is extended by an incorporation of language structures using a deep contextualized Long Short-Term Memory model.
Type: Grant
Filed: March 28, 2019
Date of Patent: August 16, 2022
Assignee: SIEMENS AKTIENGESELLSCHAFT
Inventors: Florian Büttner, Yatin Chaudhary, Pankaj Gupta
-
Patent number: 11416755
Abstract: The present system and method may generally include organizing the task flow of a virtual agent in a way that is controlled by a set of rules and a set of conditional probability distributions. The system and method may include receiving a user utterance including a first task, identifying the first task from the user utterance, and obtaining a set of rules related to the plurality of tasks. The set of rules may determine whether pre-tasks and/or pre-conditions are to be executed before executing the first task. The set of rules may also determine whether post-tasks and/or post-conditions are to be executed after executing the first task. The system and method may include executing the task; running a probabilistic graphical model on the plurality of tasks to determine a second task based on the first task; suggesting to the user the second task; and updating the probabilistic graphical model after a threshold number of runs.
Type: Grant
Filed: August 30, 2019
Date of Patent: August 16, 2022
Assignee: Accenture Global Solutions Limited
Inventors: Roshni Ramesh Ramnani, Shubhashis Sengupta, Moushumi Mahato, Sukanti Beer
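The rule-driven pre-task/post-task expansion this abstract describes can be sketched as a recursive plan builder. The rule table and task names below are hypothetical illustrations, not from the patent:

```python
# Hypothetical rule table: for each task, which pre-tasks must run before
# it and which post-tasks follow it (names are illustrative only).
RULES = {
    "book_flight": {"pre": ["verify_identity"], "post": ["send_confirmation"]},
    "verify_identity": {"pre": [], "post": []},
    "send_confirmation": {"pre": [], "post": []},
}

def execute_plan(task, rules, plan=None):
    """Expand a task into an ordered execution plan: pre-tasks first,
    then the task itself, then post-tasks, recursing through the rules."""
    if plan is None:
        plan = []
    for pre in rules[task]["pre"]:
        execute_plan(pre, rules, plan)
    plan.append(task)
    for post in rules[task]["post"]:
        execute_plan(post, rules, plan)
    return plan
```

In the patented system, a probabilistic graphical model would then suggest a likely second task given the executed first task; the sketch above covers only the rule-expansion step.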
-
Patent number: 11404056
Abstract: A drone system is configured to capture an audio stream that includes voice commands from an operator, to process the audio stream for identification of the voice commands, and to perform operations based on the identified voice commands. The drone system can identify a particular voice stream in the audio stream as an operator voice, and perform the command recognition with respect to the operator voice to the exclusion of other voice streams present in the audio stream. The drone can include a directional camera that is automatically and continuously focused on the operator to capture a video stream usable in disambiguation of different voice streams captured by the drone.
Type: Grant
Filed: June 30, 2017
Date of Patent: August 2, 2022
Assignee: Snap Inc.
Inventors: David Meisenholder, Steven Horowitz
-
Patent number: 11393452
Abstract: The present invention relates to methods of converting speech into other speech that sounds more natural. The method includes learning a target conversion function and a target identifier according to an optimal condition in which the target conversion function and the target identifier compete with each other. The target conversion function converts source speech into target speech. The target identifier identifies whether the converted target speech follows the same distribution as actual target speech. The method also includes learning a source conversion function and a source identifier according to an optimal condition in which the source conversion function and the source identifier compete with each other. The source conversion function converts target speech into source speech, and the source identifier identifies whether the converted source speech follows the same distribution as actual source speech.
Type: Grant
Filed: February 20, 2019
Date of Patent: July 19, 2022
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ko Tanaka, Takuhiro Kaneko, Hirokazu Kameoka, Nobukatsu Hojo
-
Patent number: 11386920
Abstract: Assistive technologies are herein provided to assist leaders in engaging one or more group participants using a combination of private data specific to a participant and public data specific to a participant. The system includes: a group bot that has public group data and private group data, a first bot for a first participant that has private data and public data associated with the first participant, and a leader bot for a leader. The leader bot is data interactive with the group bot and the first bot, and can cause the first bot to appropriately serve private data on a permissioned private device of the first participant and to serve public data on a permissioned group output device.
Type: Grant
Filed: November 15, 2019
Date of Patent: July 12, 2022
Assignee: FACET LABS, LLC
Inventors: Stuart Ogawa, Lindsay Sparks, Koichi Nishimura, Wilfred P. So, Jane W. Chen
-
Patent number: 11380345
Abstract: Transforming a voice of a speaker to a reference timbre includes converting a first portion of a source signal of the voice of the speaker into a time-frequency domain to obtain a time-frequency signal; obtaining frequency bin means of magnitudes over time of the time-frequency signal; converting the frequency bin magnitude means into a Bark domain to obtain a source frequency response curve (SR), where SR(i) corresponds to the magnitude mean of the ith frequency bin; obtaining respective gains of frequency bins of the Bark domain with respect to a reference frequency response curve (Rf); obtaining equalizer parameters using the respective gains of the frequency bins of the Bark domain; and transforming the first portion to the reference timbre using the equalizer parameters.
Type: Grant
Filed: October 15, 2020
Date of Patent: July 5, 2022
Assignee: Agora Lab, Inc.
Inventors: Jianyuan Feng, Ruixiang Hang, Linsheng Zhao, Fan Li
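The first stages of this pipeline (average STFT bin magnitudes over time, pool them into Bark bands to form the source response curve SR, and compute gains against a reference curve Rf) can be sketched in NumPy. Traunmüller's Bark approximation and the flat reference curve below are illustrative stand-ins; the patent does not specify them:

```python
import numpy as np

def hz_to_bark(f):
    # Traunmüller's approximation of the Bark scale.
    return 26.81 * f / (1960.0 + f) - 0.53

def source_response_and_gains(frames, sr, ref_curve_db):
    """Average |STFT| bin magnitudes over time, pool into 25 Bark bands to
    get a source frequency response curve SR, and compute per-band dB
    gains toward the reference curve Rf."""
    spec = np.abs(np.fft.rfft(frames, axis=1))           # |STFT| per frame
    bin_means = spec.mean(axis=0)                         # mean magnitude per bin
    freqs = np.fft.rfftfreq(frames.shape[1], 1.0 / sr)
    bands = np.clip(np.round(hz_to_bark(freqs)).astype(int), 0, 24)
    sr_curve = np.array([bin_means[bands == b].mean() if (bands == b).any()
                         else 0.0 for b in range(25)])    # SR in the Bark domain
    sr_db = 20.0 * np.log10(sr_curve + 1e-10)
    gains_db = ref_curve_db - sr_db                       # gain toward Rf
    return sr_curve, gains_db
```

The resulting per-band gains would then parameterize an equalizer applied to the source signal.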
-
Patent number: 11380347
Abstract: In some examples, adaptive speech intelligibility control for speech privacy may include determining, based on background noise at a near-end of a speaker, a noise estimate associated with speech emitted from the speaker, and comparing, by using a specified factor, the noise estimate to a speech level estimate for the speech emitted from the speaker. Adaptive speech intelligibility control for speech privacy may further include determining, based on the comparison, a gain value to be applied to the speaker to produce the speech at a specified level to maintain on-axis intelligibility with respect to the speaker, and applying the gain value to the speaker.
Type: Grant
Filed: February 1, 2017
Date of Patent: July 5, 2022
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Sunil Bharitkar, Wensen Liu, Madhu Sudan Athreya, Richard Sweet
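The compare-and-gain step this abstract describes can be sketched as a simple level rule. Interpreting the "specified factor" as a dB margin, and capping the gain, are illustrative assumptions, not details from the patent:

```python
def speaker_gain_db(noise_db, speech_db, margin_db=6.0, max_gain_db=12.0):
    """Compare the noise estimate to the speech level estimate and return
    the gain (in dB) needed to keep speech a margin above the noise,
    without exceeding a cap. Margin and cap values are illustrative."""
    shortfall = (noise_db + margin_db) - speech_db
    return min(max(shortfall, 0.0), max_gain_db)
```

For example, with 60 dB of background noise and speech at 62 dB, the rule asks for 4 dB of gain to restore the 6 dB margin.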
-
Patent number: 11366978
Abstract: A data recognition method includes: extracting a feature map from input data based on a feature extraction layer of a data recognition model; pooling component vectors from the feature map based on a pooling layer of the data recognition model; and generating an embedding vector by recombining the component vectors based on a combination layer of the data recognition model.
Type: Grant
Filed: March 7, 2019
Date of Patent: June 21, 2022
Assignee: Samsung Electronics Co., Ltd.
Inventors: Insoo Kim, Kyuhong Kim, Chang Kyu Choi
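The pool-then-recombine head this abstract describes can be sketched in NumPy. Spatial average pooling and a linear combination layer are illustrative stand-ins for the trained pooling and combination layers:

```python
import numpy as np

def embed(feature_map, combo_w):
    """Pool component vectors from a (channels, H, W) feature map by
    spatial averaging, then recombine them through a combination layer
    (a linear map here; combo_w stands in for trained weights)."""
    components = feature_map.reshape(feature_map.shape[0], -1).mean(axis=1)
    return combo_w @ components   # embedding vector
```

With an 8-channel feature map and a 16x8 combination matrix, this yields a 16-dimensional embedding vector.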
-
Patent number: 11361753
Abstract: Systems are configured for generating spectrogram data characterized by a voice timbre of a target speaker and a prosody style of a source speaker by converting a waveform of source speaker data to phonetic posteriorgram (PPG) data, extracting additional prosody features from the source speaker data, and generating a spectrogram based on the PPG data and the extracted prosody features. The systems are configured to utilize/train a machine learning model for generating spectrogram data and for training a neural text-to-speech model with the generated spectrogram data.
Type: Grant
Filed: September 24, 2020
Date of Patent: June 14, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Shifeng Pan, Lei He, Yulin Li, Sheng Zhao, Chunling Ma
-
Patent number: 11354502
Abstract: Methods, systems and computer program products for automatic extraction and testing of constraints are provided herein. A computer-implemented method includes obtaining a first set of documents describing constraints and a second set of documents describing properties of entities, building a first dictionary of entity types and a second dictionary of relations among the entity types, extracting constraint triples representing the set of constraints from the first set of documents, and extracting fact triples from the second set of documents utilizing the first dictionary and the second dictionary. The method also includes receiving a query to evaluate whether at least one of the set of constraints is satisfied, determining whether the at least one constraint is satisfied by evaluating a constraint satisfaction formula utilizing the constraint triples and the fact triples, and providing a response to the query that indicates whether the at least one constraint is satisfied.
Type: Grant
Filed: December 22, 2020
Date of Patent: June 7, 2022
Assignee: International Business Machines Corporation
Inventors: Sreyash Kenkre, Santosh R. K. Penubothula, Disha Shrivastava, Harish Guruprasad Ramaswamy, Vinayaka Pandit
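The final evaluation step (checking constraint triples against fact triples) can be sketched with plain Python sets. The triples and the simple exact-match satisfaction formula below are hypothetical illustrations; the patent's formula may be far richer:

```python
# Toy triple stores (subject, relation, object); contents are illustrative.
facts = {("serverA", "located_in", "eu"), ("serverB", "located_in", "us")}
constraints = [("serverA", "located_in", "eu")]  # e.g., a data-residency rule

def constraint_satisfied(constraint, fact_triples):
    """Evaluate one constraint triple against the extracted fact triples:
    satisfied iff a matching fact exists (exact-match formula)."""
    return constraint in fact_triples

def answer_query(constraints, fact_triples):
    # Respond to a query with the satisfaction status of each constraint.
    return {c: constraint_satisfied(c, fact_triples) for c in constraints}
```

A real system would evaluate a logical formula over the triples (quantifiers, negation, relation chains) rather than exact membership.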
-
Patent number: 11348591
Abstract: A speaker identification system and method to identify a speaker based on the speaker's voice is disclosed. In an exemplary embodiment, the speaker identification system comprises a Gaussian Mixture Model (GMM) for speaker accent and dialect identification for a given speech signal input by the speaker and an Artificial Neural Network (ANN) to identify the speaker based on the identified dialect, in which the output of the GMM is input to the ANN.
Type: Grant
Filed: September 23, 2021
Date of Patent: May 31, 2022
Assignee: King Abdulaziz University
Inventors: Muhammad Moinuddin, Ubaid M. Al-Saggaf, Shahid Munir Shah, Rizwan Ahmed Khan, Zahraa Ubaid Al-Saggaf
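The first stage (dialect identification by likelihood under per-dialect Gaussian models) can be sketched in NumPy. A single diagonal-covariance Gaussian per dialect stands in for the full GMM here, and the dialect names are hypothetical:

```python
import numpy as np

def log_gauss(x, mean, var):
    # Log-density of a diagonal-covariance Gaussian.
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def identify_dialect(features, dialect_models):
    """Pick the dialect whose Gaussian model best explains the feature
    vector -- a one-component stand-in for the patent's GMM stage; a real
    system would score multi-component mixtures."""
    scores = {d: log_gauss(features, m["mean"], m["var"])
              for d, m in dialect_models.items()}
    return max(scores, key=scores.get)
```

In the patented pipeline, the GMM stage's output then feeds an ANN that performs the final speaker identification conditioned on the identified dialect.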
-
Patent number: 11348594
Abstract: Methods, devices, non-transitory computer-readable medium, and systems are described for compressing audio data. The techniques involve obtaining a sequence of digitized samples of an audio signal, performing a transform using the sequence of digitized samples to generate a plurality of spectral lines, obtaining a group of spectral lines from the plurality of spectral lines, and quantizing the group of spectral lines to generate a group of quantized values. Quantizing the group of spectral lines to generate the group of quantized values may comprise performing a specialized rounding operation on a spectral line selected from the group of spectral lines and using the specialized rounding operation to force a group parity value, computed for the group of quantized values, to a predetermined parity value. One or more data frames based on the group of quantized values may be outputted.
Type: Grant
Filed: June 11, 2020
Date of Patent: May 31, 2022
Assignee: QUALCOMM Incorporated
Inventors: Richard Turner, Megan Lucy Taggart, Laurent Wojcieszak, Justin Hundt
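The parity-forcing quantization this abstract describes can be sketched in NumPy: quantize the group, and if the group parity (here, the parity of the sum of quantized values) misses the target, flip the rounding of the line closest to a rounding boundary. Using the sum as the group parity value is an illustrative assumption:

```python
import numpy as np

def quantize_with_parity(group, step, target_parity=0):
    """Quantize a group of spectral lines; if the parity of the sum of
    quantized values does not match the target, re-round the line whose
    value sits closest to the .5 boundary, minimizing the added error."""
    scaled = group / step
    q = np.round(scaled).astype(int)
    if q.sum() % 2 != target_parity:
        frac = scaled - np.floor(scaled)
        i = np.argmin(np.abs(frac - 0.5))      # cheapest line to flip
        q[i] += 1 if scaled[i] > q[i] else -1  # round the other way
    return q
```

Because a single +/-1 adjustment flips the sum's parity, the decoder can later recompute the group parity as an embedded check.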
-
Patent number: 11322162
Abstract: A method, a computer-readable medium, and an apparatus for resampling an audio signal are provided. The apparatus resamples the audio signal in order to preserve the audio playback quality when dealing with audio playback overrun and underrun problems. The apparatus may receive a data block of the audio signal including a first number of samples. For each sample of the first number of samples, the apparatus may slice a portion of the audio signal corresponding to the sample into a particular number of sub-samples. The apparatus may resample the data block of the audio signal into a second number of samples based on the first number of samples and the particular number of sub-samples associated with each sample of the first number of samples. The apparatus may play back the resampled data block of the audio signal via an electroacoustic device.
Type: Grant
Filed: November 1, 2017
Date of Patent: May 3, 2022
Assignee: RAZER (ASIA-PACIFIC) PTE. LTD.
Inventor: Kah Yong Lee
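The slice-into-sub-samples idea can be sketched in NumPy: build a finer sub-sample grid over the block, then pick the output samples from that grid. Linear interpolation and the sub-sample count below are illustrative choices, not the patent's method:

```python
import numpy as np

def resample_block(block, out_len, subsamples=8):
    """Conceptually slice each sample interval into a fixed number of
    sub-samples (here via linear interpolation onto a finer grid), then
    select out_len output points from that grid."""
    n = len(block)
    fine_t = np.linspace(0, n - 1, (n - 1) * subsamples + 1)  # sub-sample grid
    fine = np.interp(fine_t, np.arange(n), block)             # upsampled block
    idx = np.linspace(0, len(fine) - 1, out_len).round().astype(int)
    return fine[idx]
```

Shrinking a 100-sample block to 90 samples this way absorbs a playback overrun without dropping whole samples outright.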
-
Patent number: 11314937
Abstract: One or more computing devices, systems, and/or methods for controlling a graphical user interface to present a representation of an item in messages are provided. For example, a message may be received from a first device. The message may be analyzed to identify an item of the message. A database of known items may be analyzed to determine whether the item is in the database of known items. Responsive to determining that the item is not in the database of known items, a database of representations may be analyzed to determine a representation of the item. A graphical user interface of a second device may be controlled to present a representation of the item.
Type: Grant
Filed: December 14, 2017
Date of Patent: April 26, 2022
Assignee: YAHOO ASSETS LLC
Inventor: Eric Theodore Bax
-
Patent number: 11315586
Abstract: A speech enhancement apparatus is disclosed and comprises an adaptive noise cancellation (ANC) circuit, a blending circuit, a noise suppressor and a control module. The ANC circuit filters a reference signal to generate a noise estimate and subtracts the noise estimate from a primary signal to generate a signal estimate based on a control signal. The blending circuit blends the primary signal and the signal estimate to produce a blended signal. The noise suppressor suppresses noise in the blended signal using a first trained model to generate an enhanced signal, and processes a main spectral representation from a main microphone and M auxiliary spectral representations from M auxiliary microphones using (M+1) second trained models to generate a main score and M auxiliary scores. The ANC circuit, the noise suppressor and the trained models are combined to maximize the performance of the speech enhancement apparatus.
Type: Grant
Filed: September 30, 2020
Date of Patent: April 26, 2022
Assignee: British Cayman Islands Intelligo Technology Inc.
Inventors: Bing-Han Huang, Chun-Ming Huang, Te-Lung Kung, Hsin-Te Hwang, Yao-Chun Liu, Chen-Chu Hsu, Tsung-Liang Chen
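The ANC-plus-blending path this abstract describes can be sketched in NumPy. The FIR filtering of the reference signal and the linear blending law below are illustrative simplifications of the adaptive circuit:

```python
import numpy as np

def anc_and_blend(primary, reference, w, alpha):
    """Filter the reference signal with weights w to get a noise estimate,
    subtract it from the primary signal to get a signal estimate, then
    linearly blend primary and estimate with weight alpha in [0, 1]."""
    noise_est = np.convolve(reference, w, mode="same")    # filtered reference
    signal_est = primary - noise_est                       # ANC subtraction
    return alpha * signal_est + (1.0 - alpha) * primary    # blended signal
```

In the patented apparatus, the blend weight would be driven by the control module and the trained models' scores; here alpha is simply passed in.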