Patents Examined by Bryan S Blankenagel
  • Patent number: 11494643
    Abstract: A noise data artificial intelligence learning method for identifying the source of problematic noise may include a noise data pre-conditioning method for identifying the source of the problematic noise, including: selecting a unit frame for the problematic noise among noises sampled over time; dividing the unit frame into N segments; analyzing the frequency characteristics of each of the N segments and extracting a frequency component of each segment by applying a Log Mel filter; and outputting a feature parameter as one representative frame by averaging the information on the N segments, wherein artificial intelligence learning on the feature parameters extracted over time by the noise data pre-conditioning method applies a bidirectional RNN.
    Type: Grant
    Filed: November 18, 2019
    Date of Patent: November 8, 2022
    Assignees: Hyundai Motor Company, Kia Motors Corporation, IUCF-HYU (Industry-University Corporation Foundation Hanyang University)
    Inventors: Dong-Chul Lee, In-Soo Jung, Joon-Hyuk Chang, Kyoung-Jin Noh
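The pre-conditioning pipeline above (unit frame → N segments → log-mel features per segment → averaged representative frame) can be sketched as follows. This is a minimal NumPy illustration, not the patent's implementation; the segment count, FFT size, and filter count are assumed values.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def representative_frame(unit_frame, n_segments=8, sr=16000, n_mels=40, n_fft=512):
    # Divide the unit frame into N segments, extract log-mel features of
    # each segment, and average them into one representative frame.
    fb = mel_filterbank(n_mels, n_fft, sr)
    feats = []
    for seg in np.array_split(unit_frame, n_segments):
        spec = np.abs(np.fft.rfft(seg, n=n_fft)) ** 2
        feats.append(np.log(fb @ spec + 1e-10))
    return np.mean(feats, axis=0)
```

The averaged frames, collected over time, would then form the input sequence for the bidirectional RNN; that training step is omitted here.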
  • Patent number: 11482244
    Abstract: A method includes receiving an overlapped audio signal that includes audio spoken by a speaker that overlaps a segment of synthesized playback audio. The method also includes encoding a sequence of characters that correspond to the synthesized playback audio into a text embedding representation. For each character in the sequence of characters, the method also includes generating a respective cancelation probability using the text embedding representation. The cancelation probability indicates a likelihood that the corresponding character is associated with the segment of the synthesized playback audio overlapped by the audio spoken by the speaker in the overlapped audio signal.
    Type: Grant
    Filed: March 11, 2021
    Date of Patent: October 25, 2022
    Assignee: Google LLC
    Inventor: Quan Wang
  • Patent number: 11481563
    Abstract: The present disclosure describes systems, non-transitory computer-readable media, and methods that can generate contextual identifiers indicating context for frames of a video and utilize those contextual identifiers to generate translations of text corresponding to such video frames. By analyzing a digital video file, the disclosed systems can identify video frames corresponding to a scene and a term sequence corresponding to a subset of the video frames. Based on image features of the video frames corresponding to the scene, the disclosed systems can utilize a contextual neural network to generate a contextual identifier (e.g., a contextual tag) indicating context for the video frames. Based on the contextual identifier, the disclosed systems can subsequently apply a translation neural network to generate a translation of the term sequence from a source language to a target language. In some cases, the translation neural network also generates affinity scores for the translation.
    Type: Grant
    Filed: November 8, 2019
    Date of Patent: October 25, 2022
    Assignee: Adobe Inc.
    Inventors: Mahika Wason, Amol Jindal, Ajay Bedi
  • Patent number: 11462231
    Abstract: A system configured to perform low input-output latency noise reduction in a frequency domain is provided. The real-time noise reduction algorithm performs frame-by-frame processing of a single-channel noisy acoustic signal to estimate a gain function. Accurate noise power estimates are achieved with the help of a minimum-statistics approach followed by a voice activity detector. The noise power and gain values are smoothed to remove any external artifacts and avoid background noise modulations. The gain values for individual frequency bands are weighted and smoothed to reduce distortion. To obtain distortionless output speech, the system performs curve fitting by separating the frequency bands into multiple groups and applying a Savitzky-Golay filter to each group. The final gain values generated by these filters are multiplied with the noisy speech signal to obtain a clean speech signal.
    Type: Grant
    Filed: November 18, 2020
    Date of Patent: October 4, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Nikhil Shankar, Berkant Tacer
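The group-wise curve-fitting step above can be sketched with SciPy's Savitzky-Golay filter. This is a hedged illustration, assuming SciPy is available; the group count, window length, and polynomial order are illustrative, not the patent's values.

```python
import numpy as np
from scipy.signal import savgol_filter

def smooth_gains(gains, n_groups=4, window=9, polyorder=2):
    # Separate the per-bin gain values into groups and fit each group
    # with a Savitzky-Golay filter, then reassemble the gain curve.
    groups = np.array_split(gains, n_groups)
    fitted = [savgol_filter(g, min(window, len(g) // 2 * 2 + 1), polyorder)
              for g in groups]
    return np.concatenate(fitted)
```

The smoothed gains would then be multiplied bin-by-bin with the noisy spectrum to obtain the clean speech estimate.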
  • Patent number: 11450332
    Abstract: To enable conversion to a voice with a desired attribution, the method learns an encoder that estimates a latent vector series from an input sound feature vector series and an attribution label, on the basis of parallel data of a sound feature vector series of a conversion-source voice signal and a latent vector series of the conversion-source voice signal, together with an attribution label indicating the attribution of the conversion-source voice signal; and a decoder that reconstructs the sound feature vector series from the latent vector series and the attribution label.
    Type: Grant
    Filed: February 20, 2019
    Date of Patent: September 20, 2022
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Hirokazu Kameoka, Takuhiro Kaneko, Ko Tanaka, Nobukatsu Hojo
  • Patent number: 11416689
    Abstract: The invention relates to a natural language processing system configured for receiving an input sequence of input words (v1, v2, . . . vN) representing a first sequence of words in a natural language of a first text and generating an output sequence of output words representing a second sequence of words in a natural language of a second text, modeled by a multinomial topic model, wherein the multinomial topic model is extended by an incorporation of language structures using a deep contextualized Long Short-Term Memory model.
    Type: Grant
    Filed: March 28, 2019
    Date of Patent: August 16, 2022
    Assignee: SIEMENS AKTIENGESELLSCHAFT
    Inventors: Florian Büttner, Yatin Chaudhary, Pankaj Gupta
  • Patent number: 11416755
    Abstract: The present system and method may generally include organizing the task flow of a virtual agent in a way that is controlled by a set of rules and set of conditional probability distributions. The system and method may include receiving a user utterance including a first task, identifying the first task from the user utterance, and obtaining a set of rules related to the plurality of tasks. The set of rules may determine whether pre-tasks and/or pre-conditions are to be executed before executing the first task. The set of rules may also determine whether post-tasks and/or post-conditions are to be executed after executing the first task. The system and method may include executing the task; running a probabilistic graphical model on the plurality of tasks to determine a second task based on the first task; suggesting to the user the second task; and updating the probabilistic graphical model after a threshold number of runs.
    Type: Grant
    Filed: August 30, 2019
    Date of Patent: August 16, 2022
    Assignee: Accenture Global Solutions Limited
    Inventors: Roshni Ramesh Ramnani, Shubhashis Sengupta, Moushumi Mahato, Sukanti Beer
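The rule-driven task-flow control described above (pre-tasks before a requested task, post-tasks after it) can be sketched as a small rule table plus a recursive expansion. The task names and table shape are hypothetical, purely for illustration; the probabilistic-graphical-model suggestion step is omitted.

```python
# Hypothetical rule table: each task lists pre-tasks that must execute
# before it and post-tasks that execute after it.
RULES = {
    "book_flight": {"pre": ["verify_identity"], "post": ["send_confirmation"]},
    "verify_identity": {"pre": [], "post": []},
    "send_confirmation": {"pre": [], "post": []},
}

def execution_order(task, rules=RULES):
    # Expand a requested task into the full ordered list of tasks to run,
    # applying pre-task rules first and post-task rules last.
    order = []
    for pre in rules[task]["pre"]:
        order.extend(execution_order(pre, rules))
    order.append(task)
    for post in rules[task]["post"]:
        order.extend(execution_order(post, rules))
    return order
```

For example, a user utterance identified as "book_flight" would expand to identity verification, the booking itself, then confirmation.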
  • Patent number: 11404056
    Abstract: A drone system is configured to capture an audio stream that includes voice commands from an operator, to process the audio stream for identification of the voice commands, and to perform operations based on the identified voice commands. The drone system can identify a particular voice stream in the audio stream as an operator voice, and perform the command recognition with respect to the operator voice to the exclusion of other voice streams present in the audio stream. The drone can include a directional camera that is automatically and continuously focused on the operator to capture a video stream usable in disambiguation of different voice streams captured by the drone.
    Type: Grant
    Filed: June 30, 2017
    Date of Patent: August 2, 2022
    Assignee: Snap Inc.
    Inventors: David Meisenholder, Steven Horowitz
  • Patent number: 11393452
    Abstract: The present invention relates to methods of converting speech into other speech that sounds more natural. The method includes learning a target conversion function and a target identifier under an optimal condition in which the target conversion function and the target identifier compete with each other. The target conversion function converts source speech into target speech. The target identifier identifies whether the converted target speech follows the same distribution as actual target speech. The method also includes learning a source conversion function and a source identifier under an optimal condition in which the source conversion function and the source identifier compete with each other. The source conversion function converts target speech into source speech, and the source identifier identifies whether the converted source speech follows the same distribution as actual source speech.
    Type: Grant
    Filed: February 20, 2019
    Date of Patent: July 19, 2022
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ko Tanaka, Takuhiro Kaneko, Hirokazu Kameoka, Nobukatsu Hojo
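The competing learning conditions described above can be written as an adversarial objective. The following is a hedged sketch in CycleGAN-style notation; the symbols are assumptions, not the patent's own notation.

```latex
\mathcal{L}_{\mathrm{adv}}(G_{S \to T}, D_T) =
  \mathbb{E}_{y \sim p_T}\big[\log D_T(y)\big]
  + \mathbb{E}_{x \sim p_S}\big[\log\big(1 - D_T(G_{S \to T}(x))\big)\big]
```

Here the target conversion function \(G_{S \to T}\) minimizes the objective while the target identifier \(D_T\) maximizes it; the source pair \((G_{T \to S}, D_S)\) is trained symmetrically in the reverse direction.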
  • Patent number: 11386920
    Abstract: Assistive technologies are herein provided to assist leaders in engaging one or more group participants using a combination of private data specific to a participant and public data specific to a participant. The system includes: a group bot that has public group data and private group data, a first bot for a first participant that has private data and public data associated with the first participant, and a leader bot for a leader. The leader bot is data interactive with the group bot and the first bot, and can cause the first bot to appropriately serve private data on a permissioned private device of the first participant and to serve public data on a permissioned group output device.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: July 12, 2022
    Assignee: FACET LABS, LLC
    Inventors: Stuart Ogawa, Lindsay Sparks, Koichi Nishimura, Wilfred P. So, Jane W. Chen
  • Patent number: 11380345
    Abstract: Transforming a voice of a speaker to a reference timbre includes converting a first portion of a source signal of the voice of the speaker into a time-frequency domain to obtain a time-frequency signal; obtaining frequency bin means of magnitudes over time of the time-frequency signal; converting the frequency bin magnitude means into a Bark domain to obtain a source frequency response curve (SR), where SR(i) corresponds to magnitude mean of the ith frequency bin; obtaining respective gains of frequency bins of the Bark domain with respect to a reference frequency response curve (Rf); obtaining equalizer parameters using the respective gains of the frequency bins of the Bark domain; and transforming the first portion to the reference timbre using the equalizer parameters.
    Type: Grant
    Filed: October 15, 2020
    Date of Patent: July 5, 2022
    Assignee: Agora Lab, Inc.
    Inventors: Jianyuan Feng, Ruixiang Hang, Linsheng Zhao, Fan Li
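The source-frequency-response and per-band gain steps above can be sketched in NumPy. This is an illustrative sketch, not the patent's method: the Bark conversion uses Traunmüller's approximation, and the band count, FFT size, and sample rate are assumed values.

```python
import numpy as np

def hz_to_bark(f):
    # Traunmüller's approximation of the Bark scale.
    return 26.81 * f / (1960.0 + f) - 0.53

def source_response_curve(frames, sr=16000, n_fft=512, n_bark=24):
    # frames: (num_frames, n_fft) windowed time-domain frames.
    mags = np.abs(np.fft.rfft(frames, axis=1))   # time-frequency signal
    bin_means = mags.mean(axis=0)                # per-bin magnitude mean over time
    bark = hz_to_bark(np.fft.rfftfreq(n_fft, 1.0 / sr))
    edges = np.linspace(bark.min(), bark.max(), n_bark + 1)
    curve = np.zeros(n_bark)
    for i in range(n_bark):
        mask = (bark >= edges[i]) & (bark < edges[i + 1])
        if np.any(mask):
            curve[i] = bin_means[mask].mean()    # SR(i) for the ith band
    return curve

def band_gains(src_curve, ref_curve, eps=1e-10):
    # Gains that map the source response curve onto the reference curve Rf;
    # these would parameterize the timbre-transforming equalizer.
    return ref_curve / (src_curve + eps)
```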
  • Patent number: 11380347
    Abstract: In some examples, adaptive speech intelligibility control for speech privacy may include determining, based on background noise at a near-end of a speaker, a noise estimate associated with speech emitted from the speaker, and comparing, by using a specified factor, the noise estimate to a speech level estimate for the speech emitted from the speaker. Adaptive speech intelligibility control for speech privacy may further include determining, based on the comparison, a gain value to be applied to the speaker to produce the speech at a specified level to maintain on-axis intelligibility with respect to the speaker, and applying the gain value to the speaker.
    Type: Grant
    Filed: February 1, 2017
    Date of Patent: July 5, 2022
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Sunil Bharitkar, Wensen Liu, Madhu Sudan Athreya, Richard Sweet
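The comparison-then-gain logic above amounts to keeping the speech a specified margin above the near-end noise estimate. A minimal sketch in decibel terms; the 6 dB factor is an assumed example, not the patent's specified value.

```python
def privacy_gain_db(speech_level_db, noise_estimate_db, factor_db=6.0):
    # Target level: the noise estimate plus the specified factor.
    # If speech is louder than needed, the gain attenuates it (privacy);
    # if speech is masked by noise, the gain boosts it (intelligibility).
    target_db = noise_estimate_db + factor_db
    return target_db - speech_level_db
```

For instance, speech at 70 dB over 50 dB noise would be attenuated by 14 dB, while speech at the noise floor would be boosted by the full 6 dB margin.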
  • Patent number: 11366978
    Abstract: A data recognition method includes: extracting a feature map from input data based on a feature extraction layer of a data recognition model; pooling component vectors from the feature map based on a pooling layer of the data recognition model; and generating an embedding vector by recombining the component vectors based on a combination layer of the data recognition model.
    Type: Grant
    Filed: March 7, 2019
    Date of Patent: June 21, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Insoo Kim, Kyuhong Kim, Chang Kyu Choi
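The three stages above (feature extraction → pooling component vectors → recombination into an embedding) can be sketched with NumPy; the pooling choice (spatial averaging) and the combination matrix are illustrative assumptions, standing in for the model's learned layers.

```python
import numpy as np

def embed(feature_map, w_combine):
    # feature_map: (C, H, W) output of a feature extraction layer.
    # Pool each channel into a component vector by averaging over the
    # last spatial axis, then recombine the concatenated components
    # with a combination matrix into a single embedding vector.
    components = feature_map.mean(axis=2)   # (C, H) component vectors
    flat = components.reshape(-1)           # concatenated components
    return w_combine @ flat                 # embedding vector
```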
  • Patent number: 11361753
    Abstract: Systems are configured for generating spectrogram data characterized by a voice timbre of a target speaker and a prosody style of a source speaker by converting a waveform of source speaker data to phonetic posterior gram (PPG) data, extracting additional prosody features from the source speaker data, and generating a spectrogram based on the PPG data and the extracted prosody features. The systems are configured to utilize/train a machine learning model for generating spectrogram data and for training a neural text-to-speech model with the generated spectrogram data.
    Type: Grant
    Filed: September 24, 2020
    Date of Patent: June 14, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shifeng Pan, Lei He, Yulin Li, Sheng Zhao, Chunling Ma
  • Patent number: 11354502
    Abstract: Methods, systems and computer program products for automatic extraction and testing of constraints are provided herein. A computer-implemented method includes obtaining a first set of documents describing constraints and a second set of documents describing properties of entities, building a first dictionary of entity types and a second dictionary of relations among the entity types, extracting constraint triples representing the set of constraints from the first set of documents, and extracting fact triples from the second set of documents utilizing the first dictionary and the second dictionary. The method also includes receiving a query to evaluate whether at least one of the set of constraints is satisfied, determining whether the at least one constraint is satisfied by evaluating a constraint satisfaction formula utilizing the constraint triples and the fact triples, and providing a response to the query that indicates whether the at least one constraint is satisfied.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: June 7, 2022
    Assignee: International Business Machines Corporation
    Inventors: Sreyash Kenkre, Santosh R. K. Penubothula, Disha Shrivastava, Harish Guruprasad Ramaswamy, Vinayaka Pandit
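The triple-based evaluation above can be sketched with subject-relation-object tuples. The entities, relations, and constraint kinds below are hypothetical examples, not from the patent; they only show how fact triples are checked against constraint triples.

```python
# Hypothetical fact triples (subject, relation, object).
facts = {
    ("server-a", "has_memory_gb", 64),
    ("server-a", "located_in", "eu"),
}

# Hypothetical constraint triples: (relation, constraint kind, requirement).
constraints = [
    ("has_memory_gb", "min", 32),
    ("located_in", "one_of", {"eu", "us"}),
]

def satisfied(constraints, facts):
    # Evaluate a simple constraint satisfaction formula: every fact whose
    # relation matches a constraint must meet that constraint's requirement.
    for rel, kind, req in constraints:
        for _, r, obj in facts:
            if r != rel:
                continue
            if kind == "min" and obj < req:
                return False
            if kind == "one_of" and obj not in req:
                return False
    return True
```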
  • Patent number: 11348591
    Abstract: A speaker identification system and method to identify a speaker based on the speaker's voice is disclosed. In an exemplary embodiment, the speaker identification system comprises a Gaussian Mixture Model (GMM) for speaker accent and dialect identification for a given speech signal input by the speaker and an Artificial Neural Network (ANN) to identify the speaker based on the identified dialect, in which the output of the GMM is input to the ANN.
    Type: Grant
    Filed: September 23, 2021
    Date of Patent: May 31, 2022
    Assignee: King Abdulaziz University
    Inventors: Muhammad Moinuddin, Ubaid M. Al-Saggaf, Shahid Munir Shah, Rizwan Ahmed Khan, Zahraa Ubaid Al-Saggaf
  • Patent number: 11348594
    Abstract: Methods, devices, non-transitory computer-readable medium, and systems are described for compressing audio data. The techniques involve obtaining a sequence of digitized samples of an audio signal, performing a transform using the sequence of digitized samples, to generate a plurality of spectral lines, obtaining a group of spectral lines from the plurality of spectral lines, and quantizing the group of spectral lines to generate a group of quantized values. Quantizing the group of spectral lines to generate the group of quantized values may comprise performing a specialized rounding operation on a spectral line selected from the group of spectral lines and using the specialized rounding operation to force a group parity value, computed for the group of quantized values, to a predetermined parity value. One or more data frames based on the group of quantized values may be outputted.
    Type: Grant
    Filed: June 11, 2020
    Date of Patent: May 31, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: Richard Turner, Megan Lucy Taggart, Laurent Wojcieszak, Justin Hundt
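The parity-forcing quantization above can be sketched as follows: quantize the group normally, and if the group parity is wrong, re-round the single spectral line whose alternative rounding direction costs the least. This is an illustrative reading of the "specialized rounding operation", not the patent's exact procedure.

```python
import numpy as np

def quantize_with_parity(group, target_parity=0):
    # Quantize a group of spectral lines to integers, then force the
    # group parity (sum mod 2) to the predetermined parity value.
    q = np.round(group).astype(int)
    if q.sum() % 2 == target_parity:
        return q
    # Specialized rounding: pick the line closest to the .5 rounding
    # boundary, since flipping its rounding direction adds the least error.
    err = group - q
    i = int(np.argmax(np.abs(err)))
    q[i] += 1 if err[i] > 0 else -1   # flips the sum's parity
    return q
```

Forcing a known parity lets a decoder detect single-value corruption in each group without transmitting extra bits.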
  • Patent number: 11322162
    Abstract: A method, a computer-readable medium, and an apparatus for resampling an audio signal are provided. The apparatus resamples the audio signal in order to preserve audio playback quality when dealing with audio playback overrun and underrun problems. The apparatus may receive a data block of the audio signal including a first number of samples. For each sample of the first number of samples, the apparatus may slice a portion of the audio signal corresponding to the sample into a particular number of sub-samples. The apparatus may resample the data block of the audio signal into a second number of samples based on the first number of samples and the particular number of sub-samples associated with each sample of the first number of samples. The apparatus may play back the resampled data block of the audio signal via an electroacoustic device.
    Type: Grant
    Filed: November 1, 2017
    Date of Patent: May 3, 2022
    Assignee: RAZER (ASIA-PACIFIC) PTE. LTD.
    Inventor: Kah Yong Lee
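The slice-into-sub-samples idea above can be sketched as interpolating each sample interval onto a finer sub-sample grid and then selecting the output samples from that grid. Linear interpolation and the sub-sample count are assumptions for illustration; a real implementation would likely use a higher-order interpolator.

```python
import numpy as np

def resample_block(block, out_len, sub_per_sample=8):
    # Slice the signal between consecutive samples into sub-samples
    # (here via linear interpolation), then pick out_len evenly spaced
    # sub-samples to form the resampled block.
    n = len(block)
    fine_x = np.linspace(0, n - 1, (n - 1) * sub_per_sample + 1)
    fine = np.interp(fine_x, np.arange(n), block)  # sub-sample grid
    idx = np.linspace(0, len(fine) - 1, out_len).round().astype(int)
    return fine[idx]
```

Stretching a block this way (out_len > len(block)) covers underrun; shrinking it (out_len < len(block)) covers overrun, without dropping or duplicating raw samples outright.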
  • Patent number: 11314937
    Abstract: One or more computing devices, systems, and/or methods for controlling a graphical user interface to present a representation of an item in messages are provided. For example, a message may be received from a first device. The message may be analyzed to identify an item of the message. A database of known items may be analyzed to determine whether the item is in the database of known items. Responsive to determining that the item is not in the database of known items, a database of representations may be analyzed to determine a representation of the item. A graphical user interface of a second device may be controlled to present a representation of the item.
    Type: Grant
    Filed: December 14, 2017
    Date of Patent: April 26, 2022
    Assignee: YAHOO ASSETS LLC
    Inventor: Eric Theodore Bax
  • Patent number: 11315586
    Abstract: A speech enhancement apparatus is disclosed that comprises an adaptive noise cancellation (ANC) circuit, a blending circuit, a noise suppressor, and a control module. The ANC circuit filters a reference signal to generate a noise estimate and subtracts the noise estimate from a primary signal to generate a signal estimate based on a control signal. The blending circuit blends the primary signal and the signal estimate to produce a blended signal. The noise suppressor suppresses noise in the blended signal using a first trained model to generate an enhanced signal, and processes a main spectral representation from a main microphone and M auxiliary spectral representations from M auxiliary microphones using (M+1) second trained models to generate a main score and M auxiliary scores. The ANC circuit, the noise suppressor, and the trained models are combined to maximize the performance of the speech enhancement apparatus.
    Type: Grant
    Filed: September 30, 2020
    Date of Patent: April 26, 2022
    Assignee: British Cayman Islands Intelligo Technology Inc.
    Inventors: Bing-Han Huang, Chun-Ming Huang, Te-Lung Kung, Hsin-Te Hwang, Yao-Chun Liu, Chen-Chu Hsu, Tsung-Liang Chen
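The ANC-then-blend front end above can be sketched with a normalized LMS adaptive filter: predict the noise in the primary signal from the reference microphone, subtract it to get the signal estimate, then blend estimate and primary. The filter type, tap count, step size, and blend weight are assumptions for illustration; the trained-model noise suppressor is omitted.

```python
import numpy as np

def nlms_noise_cancel(primary, reference, mu=0.5, taps=8, eps=1e-8):
    # Adaptive (NLMS) filter: estimate the noise component of the primary
    # signal from the reference signal and subtract it.
    w = np.zeros(taps)
    est = np.zeros_like(primary)
    for n in range(taps, len(primary)):
        x = reference[n - taps:n][::-1]
        noise_hat = w @ x
        est[n] = primary[n] - noise_hat          # signal estimate
        w += mu * est[n] * x / (x @ x + eps)     # normalized LMS update
    return est

def blend(primary, estimate, alpha=0.7):
    # Blend the raw primary signal with the ANC signal estimate; alpha
    # would be set by the control module in the described apparatus.
    return alpha * estimate + (1 - alpha) * primary
```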