Pitch Patents (Class 704/207)
  • Patent number: 11900954
    Abstract: A voice processing method includes: determining a historical voice frame corresponding to a target voice frame; determining a frequency-domain characteristic of the historical voice frame; invoking a network model to predict the frequency-domain characteristic of the historical voice frame, to obtain a parameter set of the target voice frame, the parameter set including a plurality of types of parameters, the network model including a plurality of neural networks (NNs), and a number of the types of the parameters in the parameter set being determined according to a number of the NNs; and reconstructing the target voice frame according to the parameter set.
    Type: Grant
    Filed: March 24, 2022
    Date of Patent: February 13, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Wei Xiao, Meng Wang, Shidong Shang, Zurong Wu
  • Patent number: 11900938
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for handing off a user conversation between computer-implemented agents. One of the methods includes receiving, by a computer-implemented agent specific to a user device, a digital representation of speech encoding an utterance, determining, by the computer-implemented agent, that the utterance specifies a requirement to establish a communication with another computer-implemented agent, and establishing, by the computer-implemented agent, a communication between the other computer-implemented agent and the user device.
    Type: Grant
    Filed: July 18, 2022
    Date of Patent: February 13, 2024
    Assignee: GOOGLE LLC
    Inventors: Johnny Chen, Thomas L. Dean, Qiangfeng Peter Lau, Sudeep Gandhe, Gabriel Schine
  • Patent number: 11900904
    Abstract: Digital signal processing and machine learning techniques can be employed in a vocal capture and performance social network to computationally generate vocal pitch tracks from a collection of vocal performances captured against a common temporal baseline such as a backing track or an original performance by a popularizing artist. In this way, crowd-sourced pitch tracks may be generated and distributed for use in subsequent karaoke-style vocal audio captures or other applications. Large numbers of performances of a song can be used to generate a pitch track. Computationally determined pitch trackings from individual audio signal encodings of the crowd-sourced vocal performance set are aggregated and processed as an observation sequence of a trained Hidden Markov Model (HMM) or other statistical model to produce an output pitch track.
    Type: Grant
    Filed: February 14, 2022
    Date of Patent: February 13, 2024
    Assignee: Smule, Inc.
    Inventors: Stefan Sullivan, John Shimmin, Dean Schaffer, Perry R. Cook
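The aggregation-plus-HMM pipeline in the abstract above can be sketched as a Viterbi decode over quantized pitch states, where each frame's emission score counts how many crowd-sourced performances "voted" for that pitch. This is an illustrative toy under invented assumptions (the state set, transition log-probabilities, and vote smoothing are all made up here), not Smule's implementation:

```python
import math
from collections import Counter

def aggregate_viterbi_pitch(tracks, states, stay_logp=-0.1, move_logp=-2.0):
    """Fuse several noisy per-performer pitch tracks (one value per frame)
    into one output track by Viterbi decoding over quantized pitch states.
    Emission log-score of a state at a frame grows with how many
    performers sang that quantized pitch there (smoothed vote count)."""
    n_frames, n_states = len(tracks[0]), len(states)

    def emit(t, s):
        votes = Counter(track[t] for track in tracks)
        return math.log(votes.get(states[s], 0) + 0.5)  # +0.5 smoothing

    # Standard Viterbi recursion with a sticky transition model.
    delta = [emit(0, s) for s in range(n_states)]
    back = []
    for t in range(1, n_frames):
        new_delta, ptr = [], []
        for s in range(n_states):
            best_prev, best_score = 0, -math.inf
            for p in range(n_states):
                score = delta[p] + (stay_logp if p == s else move_logp)
                if score > best_score:
                    best_prev, best_score = p, score
            new_delta.append(best_score + emit(t, s))
            ptr.append(best_prev)
        delta, back = new_delta, back + [ptr]

    # Backtrace from the best final state.
    s = max(range(n_states), key=lambda i: delta[i])
    path = [s]
    for ptr in reversed(back):
        s = ptr[s]
        path.append(s)
    return [states[s] for s in reversed(path)]
```

With two performers agreeing and one outlier, the sticky transitions keep the decoded track on the majority pitch.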
  • Patent number: 11887578
    Abstract: A method and system for automatic dubbing are disclosed, comprising, responsive to receiving a selection of media content for playback on a user device by a user of the user device, processing extracted speeches of a first voice from the media content to generate replacement speeches using a set of phonemes of a second voice of the user of the user device, and replacing the extracted speeches of the first voice with the generated replacement speeches in the audio portion of the media content for playback on the user device.
    Type: Grant
    Filed: November 10, 2022
    Date of Patent: January 30, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Henry Gabryjelski, Jian Luan, Dapeng Li
  • Patent number: 11887622
    Abstract: The present disclosure generally relates to a system and method for obtaining a diagnosis of a mental health condition. An exemplary system can receive an audio input; convert the audio input into a text string; identify a speaker associated with the text string; based on at least a portion of the audio input, determine a predefined audio characteristic of a plurality of predefined audio characteristics; based on the determined audio characteristic, identify an emotion corresponding to the portion of the audio input; generate a set of structured data based on the text string, the speaker, the predefined audio characteristic, and the identified emotion; and provide an output for obtaining the diagnosis of the mental disorder or condition, wherein the output is indicative of at least a portion of the set of structured data.
    Type: Grant
    Filed: September 12, 2019
    Date of Patent: January 30, 2024
    Assignee: United States Department of Veterans Affairs
    Inventors: Qian Hu, Brian P. Marx, Patricia D. King, Seth-David Donald Dworman, Matthew E. Coarr, Keith A. Crouch, Stelios Melachrinoudis, Cheryl Clark, Terence M. Keane
  • Patent number: 11848021
    Abstract: An envelope sequence is provided that can improve approximation accuracy near peaks caused by the pitch period of an audio signal. A periodic-combined-envelope-sequence generation device according to the present invention takes, as an input audio signal, a time-domain audio digital signal in each frame, which is a predetermined time segment, and generates a periodic combined envelope sequence as an envelope sequence. The periodic-combined-envelope-sequence generation device according to the present invention comprises at least a spectral-envelope-sequence calculating part and a periodic-combined-envelope generating part. The spectral-envelope-sequence calculating part calculates a spectral envelope sequence of the input audio signal on the basis of time-domain linear prediction of the input audio signal.
    Type: Grant
    Filed: September 29, 2022
    Date of Patent: December 19, 2023
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Takehiro Moriya, Yutaka Kamamoto, Noboru Harada
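The spectral-envelope-sequence calculating part's "time-domain linear prediction" step can be illustrated with the standard autocorrelation-method LPC envelope (Levinson-Durbin recursion followed by evaluating the all-pole magnitude response). The window, prediction order, and FFT size below are arbitrary choices for the sketch, not values from the patent:

```python
import numpy as np

def lpc_envelope(frame, order=12, n_fft=512):
    """Spectral envelope of one frame via time-domain linear prediction:
    windowed autocorrelation -> Levinson-Durbin -> sqrt(err)/|A(e^jw)|."""
    x = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    # Autocorrelation at lags 0..order.
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    # Levinson-Durbin recursion for the prediction polynomial A(z).
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]
        a[i] = k
        err *= (1.0 - k * k)
    # Envelope = gain over the magnitude response of A(z).
    return np.sqrt(err) / np.abs(np.fft.rfft(a, n_fft))
```

For a sinusoid at normalized frequency 0.1, the envelope peaks near FFT bin 0.1 * 512 = 51, showing the pole pair tracking the spectral peak.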
  • Patent number: 11842719
    Abstract: A sound processing method obtains note data representative of a note; obtains an audio signal to be processed; specifies, in accordance with the note, an expression sample representative of a sound expression to be imparted to the note and an expression period, of the audio signal, to which the sound expression is to be imparted to the note; and specifies, in accordance with the expression sample and the expression period, a processing parameter relating to an expression imparting processing for imparting the sound expression to a portion corresponding to the expression period in the audio signal. The method then generates a processed audio signal by performing the expression imparting processing in accordance with the expression sample, the expression period, and the processing parameter to the audio signal.
    Type: Grant
    Filed: September 21, 2020
    Date of Patent: December 12, 2023
    Assignee: YAMAHA CORPORATION
    Inventors: Merlijn Blaauw, Jordi Bonada, Ryunosuke Daido, Yuji Hisaminato
  • Patent number: 11816577
    Abstract: Generally, the present disclosure is directed to systems and methods that generate augmented training data for machine-learned models via application of one or more augmentation techniques to audiographic images that visually represent audio signals. In particular, the present disclosure provides a number of novel augmentation operations which can be performed directly upon the audiographic image (e.g., as opposed to the raw audio data) to generate augmented training data that results in improved model performance. As an example, the audiographic images can be or include one or more spectrograms or filter bank sequences.
    Type: Grant
    Filed: September 28, 2021
    Date of Patent: November 14, 2023
    Assignee: GOOGLE LLC
    Inventors: Daniel Sung-Joon Park, Quoc Le, William Chan, Ekin Dogus Cubuk, Barret Zoph, Yu Zhang, Chung-Cheng Chiu
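A minimal sketch of the image-domain augmentation idea: masking random frequency bands and time steps directly on the spectrogram array rather than on the raw audio. The mask counts and maximum widths are hypothetical parameters, not the patent's:

```python
import numpy as np

def spec_augment(spec, rng, n_freq_masks=1, n_time_masks=1, max_f=8, max_t=10):
    """Zero out random frequency bands (rows) and time spans (columns) of a
    (freq, time) spectrogram image to create an augmented training example.
    The input array is left untouched."""
    out = spec.copy()
    n_freq, n_time = out.shape
    for _ in range(n_freq_masks):
        f = int(rng.integers(1, max_f + 1))       # band width
        f0 = int(rng.integers(0, n_freq - f + 1)) # band start
        out[f0:f0 + f, :] = 0.0
    for _ in range(n_time_masks):
        t = int(rng.integers(1, max_t + 1))       # span width
        t0 = int(rng.integers(0, n_time - t + 1)) # span start
        out[:, t0:t0 + t] = 0.0
    return out
```

Because the operation is a cheap array slice on the audiographic image, it can run inside the training data pipeline without re-touching the raw waveform.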
  • Patent number: 11763796
    Abstract: A computer-implemented method for speech synthesis, a computer device, and a non-transitory computer readable storage medium are provided. The method includes: obtaining a speech text to be synthesized; obtaining a Mel spectrum corresponding to the speech text to be synthesized according to the speech text to be synthesized; inputting the Mel spectrum into a complex neural network, and obtaining a complex spectrum corresponding to the speech text to be synthesized, wherein the complex spectrum comprises real component information and imaginary component information; and obtaining a synthetic speech corresponding to the speech text to be synthesized, according to the complex spectrum. The method can efficiently and simply complete speech synthesis.
    Type: Grant
    Filed: December 10, 2020
    Date of Patent: September 19, 2023
    Assignee: UBTECH ROBOTICS CORP LTD
    Inventors: Dongyan Huang, Leyuan Sheng, Youjun Xiong
  • Patent number: 11756530
    Abstract: Example embodiments relate to techniques for training artificial neural networks or other machine-learning encoders to accurately predict the pitch of input audio samples in a semitone or otherwise logarithmically-scaled pitch space. An example method may include generating, from a sample of audio data, two training samples by applying two different pitch shifts to the sample of audio training data. This can be done by converting the sample of audio data into the frequency domain and then shifting the transformed data. These known shifts are then compared to the predicted pitches generated by applying the two training samples to the encoder. The encoder is then updated based on the comparison, such that the relative pitch output by the encoder is improved with respect to accuracy. One or more audio samples, labeled with absolute pitch values, can then be used to calibrate the relative pitch values generated by the trained encoder.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: September 12, 2023
    Assignee: Google LLC
    Inventors: Marco Tagliasacchi, Mihajlo Velimirovic, Matthew Sharifi, Dominik Roblek, Christian Frank, Beat Gfeller
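The two-shift training-pair construction above ("converting the sample of audio data into the frequency domain and then shifting the transformed data") can be sketched by rolling rFFT bins. Bin-shifting here stands in for whatever transform-domain shift the patent actually uses, and the shift range is invented; the key point is that the known shift difference becomes the self-supervised relative-pitch target:

```python
import numpy as np

def shifted_pair(x, max_shift_bins, rng):
    """Create two training samples from one clip by applying two different
    frequency-domain shifts; return both plus the known shift difference,
    which serves as the relative-pitch training target."""
    k1, k2 = (int(k) for k in rng.integers(0, max_shift_bins + 1, size=2))

    def shift(sig, k):
        spec = np.fft.rfft(sig)
        out = np.zeros_like(spec)
        if k < len(spec):
            out[k:] = spec[:len(spec) - k]  # move all bins up by k
        return np.fft.irfft(out, n=len(sig))

    return shift(x, k1), shift(x, k2), k1 - k2
```

For a tone at bin 20, the spectral peaks of the two outputs differ by exactly the returned target, which is what the encoder's relative-pitch prediction is trained to match.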
  • Patent number: 11749290
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing long-term prediction (LTP) are described. One example of the methods includes determining a pitch gain and a pitch lag of an input audio signal for at least a predetermined number of frames. It is determined that the pitch gain of the input audio signal has exceeded a predetermined threshold and that a change of the pitch lag of the input audio signal has been within a predetermined range for at least the predetermined number of frames. In response to determining that the pitch gain of the input audio signal has exceeded the predetermined threshold and that the change of the pitch lag has been within the predetermined range for at least the predetermined number of frames, a pitch gain is set for a current frame of the input audio signal.
    Type: Grant
    Filed: July 12, 2021
    Date of Patent: September 5, 2023
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventor: Yang Gao
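The gain/lag stability test described above reduces to a simple decision rule: enable LTP for the current frame only when the pitch gain has stayed above a threshold and the pitch lag has stayed within a narrow range over the recent frames. The threshold, lag range, and frame count below are placeholders, not the patent's values:

```python
def ltp_decision(gains, lags, gain_threshold=0.7, lag_range=2, n_frames=4):
    """Return the pitch gain to use for the current frame: the latest gain
    if the last n_frames frames show a strong, lag-stable pitch, else 0.0
    (LTP effectively disabled)."""
    recent_gains = gains[-n_frames:]
    recent_lags = lags[-n_frames:]
    if len(recent_gains) < n_frames:
        return 0.0  # not enough history yet
    stable = (max(recent_lags) - min(recent_lags)) <= lag_range
    strong = all(g > gain_threshold for g in recent_gains)
    return recent_gains[-1] if (stable and strong) else 0.0
```

A single jumpy lag value or one weak-gain frame is enough to veto LTP, which matches the abstract's "for at least the predetermined number of frames" condition.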
  • Patent number: 11727922
    Abstract: A computerized system for deriving expression of intent from recorded speech includes: a text classification module comparing a transcription of recorded speech against a text classifier to generate a first set of representations of potential intents; a phonetics classification module comparing a phonetic transcription of the recorded speech against a phonetics classifier to generate a second set of representations; an audio classification module comparing an audio version of the recorded speech with an audio classifier to generate a third set of representations; and a discriminator module for receiving the first, second and third sets of the representations of potential intents and generating one derived expression of intent by processing the first, second and third sets together; where at least two of the text classification module, the phonetics classification module, and the audio classification module are asynchronous processes from one another.
    Type: Grant
    Filed: May 11, 2021
    Date of Patent: August 15, 2023
    Assignee: Verint Americas Inc.
    Inventor: Moshe Villaizan
  • Patent number: 11711648
    Abstract: Techniques are provided for audio-based detection and tracking of an acoustic source. A methodology implementing the techniques according to an embodiment includes generating acoustic signal spectra from signals provided by a microphone array, and performing beamforming on the acoustic signal spectra to generate beam signal spectra, using time-frequency masks to reduce noise. The method also includes detecting, by a deep neural network (DNN) classifier, an acoustic event, associated with the acoustic source, in the beam signal spectra. The DNN is trained on acoustic features associated with the acoustic event. The method further includes performing pattern extraction, in response to the detection, to identify time-frequency bins of the acoustic signal spectra that are associated with the acoustic event, and estimating a motion direction of the source relative to the array of microphones based on Doppler frequency shift of the acoustic event calculated from the time-frequency bins of the extracted pattern.
    Type: Grant
    Filed: March 10, 2020
    Date of Patent: July 25, 2023
    Assignee: Intel Corporation
    Inventors: Kuba Lopatka, Adam Kupryjanow, Lukasz Kurylo, Karol Duzinkiewicz, Przemyslaw Maziewski, Marek Zabkiewicz
  • Patent number: 11710492
    Abstract: Methods, systems, and devices for encoding are described. A device, which may be otherwise known as user equipment (UE), may support standards-compatible audio encoding (e.g., speech encoding) using a pre-encoded database. The device may receive a digital representation of an audio signal and identify, based on receiving the digital representation of the audio signal, a database that is pre-encoded according to a coding standard and that includes a quantity of digital representations of other audio signals. The device may encode the digital representation of the audio signal using a machine learning scheme and information from the database pre-encoded according to the coding standard. The device may generate a bitstream of the digital representation that is compatible with the coding standard based on encoding the digital representation of the audio signal, and output a representation of the bitstream.
    Type: Grant
    Filed: October 2, 2019
    Date of Patent: July 25, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Stephane Pierre Villette, Daniel Jared Sinder
  • Patent number: 11640824
    Abstract: Systems, devices, and methods transcribe words recorded in audio data. A computer-generated transcript is provided. The transcript comprises records for each word in the computer-generated transcript. At least one confirmation input is received for each record. The at least one confirmation input modifies a selected record and automatically identifies a next record for receiving a next confirmation input. A sequence of confirmation inputs may rapidly modify and validate each record in a sequence of records in the computer-generated transcript. A validated transcript is generated from the modified records and is provided from an evidence management system.
    Type: Grant
    Filed: July 15, 2020
    Date of Patent: May 2, 2023
    Assignee: Axon Enterprise, Inc.
    Inventors: Noah Spitzer-Williams, Choongyeun Cho, Thomas Crosley, Zachary Charles Goist, Daniel Michael Bellia, Vinh Hein Nguyen, Chelsea Alexander-Taylor
  • Patent number: 11636836
    Abstract: Provided is a method for processing audio including: acquiring an accompaniment audio signal and a voice signal of a current to-be-processed musical composition; determining a target reverberation intensity parameter value of the acquired accompaniment audio signal, wherein the target reverberation intensity parameter value is configured to indicate a rhythm speed, an accompaniment type, and a performance score of a singer of the current to-be-processed musical composition; and reverberating the acquired voice signal based on the target reverberation intensity parameter value.
    Type: Grant
    Filed: March 23, 2022
    Date of Patent: April 25, 2023
    Assignee: BEIJING DAJIA INTERNET INFORMATION TECHNOLOGY CO., LTD.
    Inventors: Xiguang Zheng, Chen Zhang
  • Patent number: 11621725
    Abstract: A method for partitioning of input vectors for coding is presented. The method comprises obtaining of an input vector. The input vector is segmented, in a non-recursive manner, into an integer number, NSEG, of input vector segments. A representation of a respective relative energy difference between parts of the input vector on each side of each boundary between the input vector segments is determined, in a recursive manner. The input vector segments and the representations of the relative energy differences are provided for individual coding. Partitioning units and computer programs for partitioning of input vectors for coding, as well as positional encoders, are presented.
    Type: Grant
    Filed: January 11, 2022
    Date of Patent: April 4, 2023
    Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)
    Inventors: Tomas Jansson Toftgård, Volodya Grancharov, Jonas Svedberg
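A toy version of the two stages above: non-recursive segmentation of the input vector into NSEG near-equal segments, then a recursive binary split that records a relative energy difference at each boundary. Representing the difference as half the log2 energy ratio is an assumption made for the sketch:

```python
import math

def segment_energies(vec, n_seg):
    """Non-recursively split vec into n_seg near-equal segments and return
    the energy (sum of squares) of each segment."""
    n = len(vec)
    bounds = [round(i * n / n_seg) for i in range(n_seg + 1)]
    return [sum(v * v for v in vec[bounds[i]:bounds[i + 1]])
            for i in range(n_seg)]

def recursive_energy_diffs(energies):
    """Recursively binary-split the segment list; at each split boundary,
    record a relative energy difference (log ratio of the two sides),
    then recurse into each side."""
    diffs = []

    def rec(lo, hi):
        if hi - lo < 2:
            return
        mid = (lo + hi) // 2
        left = sum(energies[lo:mid]) + 1e-12   # guard against log(0)
        right = sum(energies[mid:hi]) + 1e-12
        diffs.append(0.5 * math.log2(left / right))
        rec(lo, mid)
        rec(mid, hi)

    rec(0, len(energies))
    return diffs
```

For four equal-energy segments the recursion visits three boundaries and every recorded difference is zero, as expected for a flat vector.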
  • Patent number: 11610577
    Abstract: Methods and systems for providing a change to a voice interacting with a user are described. Information indicating a change that can be made to the voice can be received. The voice can be changed based on the information.
    Type: Grant
    Filed: November 19, 2020
    Date of Patent: March 21, 2023
    Assignee: Capital One Services, LLC
    Inventors: Anh Truong, Mark Watson, Jeremy Goodsitt, Vincent Pham, Fardin Abdi Taghi Abad, Kate Key, Austin Walters, Reza Farivar
  • Patent number: 11605377
    Abstract: The dialog device according to the present invention includes a prediction unit 254 configured to predict an utterance length attribute of a user utterance in response to a machine utterance, a selection unit 256 configured to use the utterance length attribute to select, as a feature model for usage in an end determination of the user utterance, at least one of an acoustic feature model or a lexical feature model, and an estimation unit 258 configured to estimate an end point in the user utterance using the selected model. By using this dialog device, it is possible to shorten the waiting time until a response is output to a user utterance by a machine, and to realize a more natural conversation between a user and a machine.
    Type: Grant
    Filed: March 19, 2020
    Date of Patent: March 14, 2023
    Assignee: HITACHI, LTD.
    Inventors: Amalia Istiqlali Adiba, Takeshi Homma, Takashi Sumiyoshi
  • Patent number: 11553235
    Abstract: Techniques have been developed to facilitate the livestreaming of group audiovisual performances. Audiovisual performances including vocal music are captured and coordinated with performances of other users in ways that can create compelling user and listener experiences. For example, in some cases or embodiments, duets with a host performer may be supported in a sing-with-the-artist style audiovisual livestream in which aspiring vocalists request or queue particular songs for a live radio show entertainment format. The developed techniques provide a communications latency-tolerant mechanism for synchronizing vocal performances captured at geographically-separated devices (e.g., at globally-distributed, but network-connected mobile phones or tablets or at audiovisual capture devices geographically separated from a live studio).
    Type: Grant
    Filed: June 7, 2021
    Date of Patent: January 10, 2023
    Assignee: Smule, Inc.
    Inventors: Anton Holmberg, Benjamin Hersh, Jeannie Yang, Perry R. Cook, Jeffrey C. Smith
  • Patent number: 11545153
    Abstract: Provided are a device and a method that allow a remote terminal to perform a process on the basis of a local-terminal-side user utterance. There are a local terminal and a remote terminal. The local terminal performs a process of a semantic analysis of a user utterance input into the local terminal. On the basis of a result of the semantic analysis, the local terminal determines whether or not the user utterance is a request to the remote terminal for a process. Moreover, in a case where the user utterance is a request to the remote terminal for a process, the local terminal transmits the result of the semantic analysis by a semantic-analysis part to the remote terminal. The remote terminal receives the result of the semantic analysis of the local-terminal-side user utterance, and performs a process based on the received result of the semantic analysis of the local-terminal-side user utterance.
    Type: Grant
    Filed: March 12, 2019
    Date of Patent: January 3, 2023
    Assignee: Sony Corporation
    Inventor: Keiichi Yamada
  • Patent number: 11488620
    Abstract: The present invention is a computer program product and method for increasing the playback speed of audio or other media files. The computer program product and method identify pedagogic media files and add a flag to the metadata of the media file. The flag represents the number and type of pauses or silent sections in the pedagogic media file. Based on the flag, the computer program product and method may fast forward or remove a portion of the pauses and silent sections to provide a new playback speed.
    Type: Grant
    Filed: June 12, 2019
    Date of Patent: November 1, 2022
    Assignee: International Business Machines Corporation
    Inventor: Deepa Jain
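The pause-shortening idea above can be sketched with a frame-energy silence detector that keeps only a fraction of each silent frame while passing voiced frames through unchanged. The frame size, energy threshold, and keep fraction are invented parameters for illustration:

```python
def compress_pauses(samples, frame=50, threshold=0.01, keep_fraction=0.25):
    """Shorten an audio clip by keeping only keep_fraction of each
    low-energy (silent) frame; frames above the energy threshold pass
    through unchanged."""
    out = []
    for i in range(0, len(samples), frame):
        chunk = samples[i:i + frame]
        energy = sum(s * s for s in chunk) / len(chunk)
        if energy < threshold:
            out.extend(chunk[:max(1, int(len(chunk) * keep_fraction))])
        else:
            out.extend(chunk)
    return out
```

A clip that is half silence shrinks substantially while every voiced sample survives, which is the effect the flag-driven playback speedup aims for.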
  • Patent number: 11443761
    Abstract: A technique, suitable for real-time processing, is disclosed for pitch tracking by detection of glottal excitation epochs in speech signal. It uses Hilbert envelope to enhance saliency of the glottal excitation epochs and to reduce the ripples due to the vocal tract filter. The processing comprises the steps of dynamic range compression, calculation of the Hilbert envelope, and epoch marking. The Hilbert envelope is calculated using the output of a FIR filter based Hilbert transformer and the delay-compensated signal. The epoch marking uses a dynamic peak detector with fast rise and slow fall and nonlinear smoothing to further enhance the saliency of the epochs, followed by a differentiator or a Teager energy operator, and amplitude-duration thresholding. The technique is meant for use in speech codecs, voice conversion, speech and speaker recognition, diagnosis of voice disorders, speech training aids, and other applications involving pitch estimation.
    Type: Grant
    Filed: August 3, 2019
    Date of Patent: September 13, 2022
    Inventors: Prem Chand Pandey, Hirak Dasgupta, Nataraj Kathriki Shambulingappa
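The patent computes the Hilbert envelope with an FIR Hilbert transformer and a delay-compensated signal for real-time use; as a sketch, the FFT-based analytic-signal method below produces the same envelope for a whole buffer (it is not the patent's real-time FIR structure). The envelope suppresses vocal-tract ripple and sharpens the saliency of glottal excitation epochs:

```python
import numpy as np

def hilbert_envelope(x):
    """Hilbert envelope |analytic signal| computed via the FFT method:
    zero the negative frequencies, double the positive ones, then take
    the magnitude of the inverse transform."""
    n = len(x)
    spec = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0       # Nyquist bin kept once
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spec * h))
```

For a pure cosine the analytic signal is a complex exponential, so the envelope is flat at 1, confirming that the oscillation itself is removed and only the amplitude contour remains.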
  • Patent number: 11443751
    Abstract: Innovations in phase quantization during speech encoding and phase reconstruction during speech decoding are described. For example, to encode a set of phase values, a speech encoder omits higher-frequency phase values and/or represents at least some of the phase values as a weighted sum of basis functions. Or, as another example, to decode a set of phase values, a speech decoder reconstructs at least some of the phase values using a weighted sum of basis functions and/or reconstructs lower-frequency phase values then uses at least some of the lower-frequency phase values to synthesize higher-frequency phase values. In many cases, the innovations improve the performance of a speech codec in low bitrate scenarios, even when encoded data is delivered over a network that suffers from insufficient bandwidth or transmission quality problems.
    Type: Grant
    Filed: February 12, 2021
    Date of Patent: September 13, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Soren Skak Jensen, Sriram Srinivasan, Koen Bernard Vos
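Representing a set of phase values as a weighted sum of basis functions, as in the encoding innovation above, can be sketched as a least-squares fit against a small cosine basis. The basis family and size are assumptions for the sketch, not the codec's actual choices; the point is that a few weights replace many per-harmonic phase values:

```python
import numpy as np

def fit_phase(phases, n_basis=4):
    """Approximate per-harmonic phase values as a weighted sum of cosine
    basis functions; return the weights and the reconstruction."""
    phases = np.asarray(phases, dtype=float)
    k = np.arange(len(phases))
    # Columns: cos(pi * j * k / N) for j = 0..n_basis-1.
    basis = np.stack([np.cos(np.pi * j * k / len(phases))
                      for j in range(n_basis)], axis=1)
    weights, *_ = np.linalg.lstsq(basis, phases, rcond=None)
    return weights, basis @ weights
```

A decoder holding only the weights (plus the agreed basis) can regenerate the phase set, which is how the bit savings arise in low-bitrate scenarios.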
  • Patent number: 11430461
    Abstract: A method for detecting a voice activity in an input audio signal composed of frames includes that a noise characteristic of the input signal is determined based on a received frame of the input audio signal. A voice activity detection (VAD) parameter is derived based on the noise characteristic of the input audio signal using an adaptive function. The derived VAD parameter is compared with a threshold value to provide a voice activity detection decision. The input audio signal is processed according to the voice activity detection decision.
    Type: Grant
    Filed: September 21, 2020
    Date of Patent: August 30, 2022
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventor: Zhe Wang
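An illustrative energy-based reading of the scheme above: derive a VAD parameter from the frame energy relative to an adaptively tracked noise floor and compare it with a threshold. The adaptation rule and constants are invented for the sketch; the patent's adaptive function over the noise characteristic may differ:

```python
def vad_decision(frame, noise_energy, alpha=0.95, factor=3.0):
    """Return (active, updated_noise_energy). The VAD parameter is the
    frame energy over the tracked noise floor; the noise estimate is
    updated only on frames judged inactive, so speech does not inflate
    the floor."""
    energy = sum(s * s for s in frame) / len(frame)
    snr_like = energy / max(noise_energy, 1e-12)
    active = snr_like > factor
    if not active:
        noise_energy = alpha * noise_energy + (1 - alpha) * energy
    return active, noise_energy
```

The caller threads the noise estimate through successive frames, so the threshold effectively adapts to the input signal's noise characteristic.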
  • Patent number: 11423902
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for handing off a user conversation between computer-implemented agents. One of the methods includes receiving, by a computer-implemented agent specific to a user device, a digital representation of speech encoding an utterance, determining, by the computer-implemented agent, that the utterance specifies a requirement to establish a communication with another computer-implemented agent, and establishing, by the computer-implemented agent, a communication between the other computer-implemented agent and the user device.
    Type: Grant
    Filed: July 27, 2020
    Date of Patent: August 23, 2022
    Assignee: GOOGLE LLC
    Inventors: Johnny Chen, Thomas L. Dean, Qiangfeng Peter Lau, Sudeep Gandhe, Gabriel Schine
  • Patent number: 11380351
    Abstract: A method for pulmonary condition monitoring includes selecting a phrase from an utterance of a user of an electronic device, wherein the phrase matches an entry of multiple phrases. At least one speech feature that is associated with one or more pulmonary conditions within the phrase is identified. A pulmonary condition is determined based on analysis of the at least one speech feature.
    Type: Grant
    Filed: January 14, 2019
    Date of Patent: July 5, 2022
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ebrahim Nematihosseinabadi, Md Mahbubur Rahman, Viswam Nathan, Korosh Vatanparvar, Jilong Kuang, Jun Gao
  • Patent number: 11348591
    Abstract: A speaker identification system and method to identify a speaker based on the speaker's voice is disclosed. In an exemplary embodiment, the speaker identification system comprises a Gaussian Mixture Model (GMM) for speaker accent and dialect identification for a given speech signal input by the speaker and an Artificial Neural Network (ANN) to identify the speaker based on the identified dialect, in which the output of the GMM is input to the ANN.
    Type: Grant
    Filed: September 23, 2021
    Date of Patent: May 31, 2022
    Assignee: King Abdulaziz University
    Inventors: Muhammad Moinuddin, Ubaid M. Al-Saggaf, Shahid Munir Shah, Rizwan Ahmed Khan, Zahraa Ubaid Al-Saggaf
  • Patent number: 11303489
    Abstract: A transmitting apparatus includes a first signal generating unit that generates, on the basis of data, a first signal transmitted by single-carrier block transmission; a second signal generating unit that generates, on the basis of an RS, a second signal transmitted by orthogonal frequency division multiplex transmission; a switching operator that selects and outputs the second signal in a first transmission period and selects and outputs the first signal in a second transmission period; an antenna that transmits the signal output from the switching operator; and a control-signal generating unit that controls the second signal generating unit such that, in the first transmission period, the RS is arranged in a frequency band allocated for transmission of the RS from the transmitting apparatus among frequency bands usable in OFDM.
    Type: Grant
    Filed: January 30, 2020
    Date of Patent: April 12, 2022
    Assignee: Mitsubishi Electric Corporation
    Inventors: Fumihiro Hasegawa, Akinori Taira
  • Patent number: 11289067
    Abstract: Methods and systems for generating voices based on characteristics of an avatar. One or more characteristics of an avatar are obtained and one or more parameters of a voice synthesizer for generating a voice corresponding to the one or more avatar characteristics are determined. The voice synthesizer is configured based on the one or more parameters and a voice is generated using the parameterized voice synthesizer.
    Type: Grant
    Filed: June 25, 2019
    Date of Patent: March 29, 2022
    Assignee: International Business Machines Corporation
    Inventors: Kristina Marie Brimijoin, Gregory Boland, Joseph Schwarz
  • Patent number: 11282534
    Abstract: Systems and methods for intelligent playback of media content may include an intelligent media playback system that, in response to determining the speech tempo in audio content by measuring syllable density of speech in the audio content, automatically adjusts a playback speed of the audio content as the audio content is being played based on the determined speech tempo. In some embodiments, the system may automatically and dynamically adjust the playback speed to result in a desired target speech tempo. In addition, the system may determine whether to automatically adjust playback speed of the audio content, as the media is being played, based on the detected speech tempo of the speech in the audio content and the determined type of content of media. Such automatic adjustments in playback speed result in more efficient playback of the audio content.
    Type: Grant
    Filed: August 3, 2018
    Date of Patent: March 22, 2022
    Assignee: Sling Media PVT Ltd
    Inventors: Yatish Jayant Naik Raikar, Varunkumar Tripathi, Karthik Mahabaleshwar Hegde
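Once a syllable-density tempo estimate is available, the playback-speed adjustment described above reduces to mapping the measured tempo onto a clamped rate factor. The target tempo and clamp range below are hypothetical values for illustration:

```python
def playback_speed(syllables, seconds, target_sps=4.5, lo=0.75, hi=2.0):
    """Map a measured speech tempo (syllables per second) to the playback
    rate that would bring it to the target tempo, clamped to a range that
    keeps the audio intelligible."""
    measured = syllables / seconds
    rate = target_sps / measured
    return min(hi, max(lo, rate))
```

Slow speech is sped up, fast speech is slowed down, and the clamp prevents extreme rates; re-evaluating this per segment gives the dynamic adjustment the abstract describes.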
  • Patent number: 11276412
    Abstract: A method and device allocates a bit-budget to a plurality of first parts of a CELP core module of (a) an encoder for encoding a sound signal or (b) a decoder for decoding the sound signal. In the method and device, bit-budget allocation tables assign, for each of a plurality of intermediate bit rates, respective bit-budgets to the first CELP core module parts. A CELP core module bit rate is determined and one of the intermediate bit rates is selected based on the determined CELP core module bit rate. The respective bit-budgets assigned by the bit-budget allocation tables for the selected intermediate bit rate are allocated to the first CELP core module parts.
    Type: Grant
    Filed: September 20, 2018
    Date of Patent: March 15, 2022
    Assignee: VOICEAGE CORPORATION
    Inventor: Vaclav Eksler
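The table-driven allocation above can be sketched as: select an intermediate bit rate based on the determined CELP core module bit rate, then read that row's per-part bit budgets. The selection rule (nearest rate not exceeding the core rate), the rates, the part names, and the budgets below are all made up for the sketch:

```python
def allocate_bits(core_bitrate, tables):
    """Select an intermediate bit rate from the allocation tables based on
    the CELP core module bit rate (here: highest table rate not exceeding
    it, else the lowest) and return its per-part bit budgets."""
    rates = sorted(tables)
    chosen = rates[0]
    for r in rates:
        if r <= core_bitrate:
            chosen = r
    return chosen, tables[chosen]
```

Example with two hypothetical intermediate rates and three core-module parts:

```python
tables = {
    7200: {"lpc": 30, "pitch": 24, "fcb": 60},
    9600: {"lpc": 34, "pitch": 28, "fcb": 88},
}
rate, budgets = allocate_bits(8000, tables)  # falls back to the 7200 row
```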
  • Patent number: 11270071
    Abstract: Systems, apparatuses, and methods are described herein for providing language-level content recommendations to users based on an analysis of closed captions of content viewed by the users and other data. Language-level analysis of content viewed by a user may be performed to generate metrics that are associated with the user. The metrics may be used to provide recommendations for content, which may include advertising, that is closely aligned with the user's interests.
    Type: Grant
    Filed: December 28, 2017
    Date of Patent: March 8, 2022
    Assignee: Comcast Cable Communications, LLC
    Inventor: Richard Walsh
  • Patent number: 11270721
    Abstract: Pre-processing systems, methods of pre-processing, and speech processing systems for improved Automated Speech Recognition are provided. Some pre-processing systems for improved speech recognition of a speech signal are provided, which systems comprise a pitch estimation circuit; and a pitch equalization processor. The pitch estimation circuit is configured to receive the speech signal to determine a pitch index of the speech signal, and the pitch equalization processor is configured to receive the speech signal and pitch information, to equalize a speech pitch of the speech signal using the pitch information, and to provide a pitch-equalized speech signal.
    Type: Grant
    Filed: May 21, 2019
    Date of Patent: March 8, 2022
    Assignee: PLANTRONICS, INC.
    Inventors: Youhong Lu, Arun Rajasekaran
  • Patent number: 11263876
    Abstract: Data is collected for Self-Service Terminals (SSTs) including tallies, events, and outcomes associated with servicing the SSTs. Statistical correlations are derived from the tallies and events with respect to the outcomes. Subsequent collected data is processed with the statistical correlations and a probability for a failure of a component or a part of the component associated with a particular SST is reported for servicing the component or part before the failure.
    Type: Grant
    Filed: September 28, 2017
    Date of Patent: March 1, 2022
    Assignee: NCR Corporation
    Inventors: Claudio Cifarelli, Gardiner Arthur, Iain M. N. Cowan, Massimo Mastropietro, Callum Ellis Morton
  • Patent number: 11250221
    Abstract: Methods, systems, and computer-readable storage media for contextual interpretation of a Japanese word are provided. A first set of characters representing Japanese words is received and input to a neural network. The neural network is trained to process characters based on bi-directional context interpretation. The first set of characters is processed by the neural network through a plurality of learning layers that process the characters both in their original order and in reverse order to determine the semantic meanings of the characters in the first set. An alphabet representation of at least one character of the first set representing a Japanese word is output. The alphabet representation corresponds to the semantic meaning of the at least one character within the first set of characters.
    Type: Grant
    Filed: March 14, 2019
    Date of Patent: February 15, 2022
    Assignee: SAP SE
    Inventor: Sean Saito
  • Patent number: 11250826
    Abstract: Digital signal processing and machine learning techniques can be employed in a vocal capture and performance social network to computationally generate vocal pitch tracks from a collection of vocal performances captured against a common temporal baseline such as a backing track or an original performance by a popularizing artist. In this way, crowd-sourced pitch tracks may be generated and distributed for use in subsequent karaoke-style vocal audio captures or other applications. Large numbers of performances of a song can be used to generate a pitch track. Computationally determined pitch trackings from individual audio signal encodings of the crowd-sourced vocal performance set are aggregated and processed as an observation sequence of a trained Hidden Markov Model (HMM) or other statistical model to produce an output pitch track.
    Type: Grant
    Filed: October 28, 2019
    Date of Patent: February 15, 2022
    Assignee: Smule, Inc.
    Inventors: Stefan Sullivan, John Shimmin, Dean Schaffer, Perry R. Cook
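The aggregation step in the Smule abstract above can be illustrated with a deliberately simplified stand-in: the patent aggregates the crowd-sourced pitch trackings as an observation sequence for a trained HMM or other statistical model, whereas the sketch below just takes a per-frame median with a majority voicing vote. The track values and the voting rule are assumptions for illustration only.

```python
import statistics

def aggregate_pitch_tracks(tracks):
    """Combine many noisy per-frame pitch trackings (Hz; 0.0 = unvoiced)
    into a single crowd-sourced pitch track. Per-frame median aggregation
    is a simplified stand-in for the patent's trained statistical model."""
    n_frames = len(tracks[0])
    output = []
    for frame in range(n_frames):
        voiced = [t[frame] for t in tracks if t[frame] > 0]
        # Call the frame voiced only if at least half the performances agree.
        if len(voiced) * 2 >= len(tracks):
            output.append(statistics.median(voiced))
        else:
            output.append(0.0)
    return output

# Three performances of the same passage, aligned to a common baseline.
tracks = [
    [220.0, 222.0, 0.0, 330.0],
    [219.0, 221.0, 0.0, 331.0],
    [221.0, 0.0,   0.0, 329.0],
]
print(aggregate_pitch_tracks(tracks))  # [220.0, 221.5, 0.0, 330.0]
```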
  • Patent number: 11241635
    Abstract: The present disclosure, according to at least one embodiment, relates to a method and system for providing an interactive service with a smart toy during a child's learning process. The method and system classify the child's emotional state more accurately based on one or more items of sensed data (an optical image, a thermal image, and voice data of the child), and adaptively provide a flexible and versatile interactive service according to the classified emotions.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: February 8, 2022
    Inventors: Heui Yul Noh, Myeong Ho Roh, Chang Woo Ban, Oh Soung Kwon, Seung Pil Lee, Seung Min Shin
  • Patent number: 11244694
    Abstract: A method is described that processes an audio signal. A discontinuity between a filtered past frame and a filtered current frame of the audio signal is removed using linear predictive filtering.
    Type: Grant
    Filed: January 23, 2017
    Date of Patent: February 8, 2022
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Emmanuel Ravelli, Manuel Jander, Grzegorz Pietrzyk, Martin Dietz, Marc Gayer
  • Patent number: 11239859
    Abstract: A method for partitioning input vectors for coding is presented. The method comprises obtaining an input vector. The input vector is segmented, in a non-recursive manner, into an integer number, NSEG, of input vector segments. A representation of the relative energy difference between the parts of the input vector on each side of each boundary between the input vector segments is determined in a recursive manner. The input vector segments and the representations of the relative energy differences are provided for individual coding. Partitioning units and computer programs for partitioning input vectors for coding, as well as positional encoders, are also presented.
    Type: Grant
    Filed: June 5, 2020
    Date of Patent: February 1, 2022
    Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)
    Inventors: Tomas Jansson Toftgård, Volodya Grancharov, Jonas Svedberg
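The two stages in the abstract above, non-recursive segmentation followed by recursive relative-energy representations at segment boundaries, can be sketched as follows. The boundary ordering (middle boundary first, then recurse into each half) and the log-ratio representation are assumptions; the patent does not fix them here.

```python
import math

def segment(vector, n_seg):
    """Non-recursive stage: split the input vector into n_seg near-equal
    contiguous segments."""
    length = len(vector)
    bounds = [round(i * length / n_seg) for i in range(n_seg + 1)]
    return [vector[bounds[i]:bounds[i + 1]] for i in range(n_seg)]

def energy(x):
    return sum(v * v for v in x)

def relative_energy_diffs(segments, lo, hi, out):
    """Recursive stage: for the middle boundary of segments[lo:hi], record a
    log energy ratio between the parts on each side, then recurse into each
    half. Produces one representation per boundary (n_seg - 1 in total)."""
    if hi - lo <= 1:
        return
    mid = (lo + hi) // 2
    left = sum(energy(s) for s in segments[lo:mid])
    right = sum(energy(s) for s in segments[mid:hi])
    out.append(math.log2((left + 1e-12) / (right + 1e-12)))
    relative_energy_diffs(segments, lo, mid, out)
    relative_energy_diffs(segments, mid, hi, out)

segs = segment(list(range(8)), 4)   # four 2-sample segments
diffs = []
relative_energy_diffs(segs, 0, 4, diffs)  # 3 boundary representations
```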
  • Patent number: 11227586
    Abstract: Systems and methods for improving the performance of statistical model-based single-channel speech enhancement systems using a deep neural network (DNN) are disclosed. Embodiments include a DNN trained to predict speech presence in the input signal; this information can be used to create frameworks for tracking noise and estimating the a priori signal-to-noise ratio. Example frameworks provide increased flexibility for various aspects of system design, such as gain estimation. Examples include training a DNN to detect speech in the presence of both noise and reverberation, enabling joint suppression of additive noise and reverberation. Example frameworks provide significant improvements in objective speech quality metrics relative to baseline systems.
    Type: Grant
    Filed: September 11, 2019
    Date of Patent: January 18, 2022
    Assignee: Massachusetts Institute of Technology
    Inventors: Bengt J. Borgstrom, Michael S. Brandstein, Robert B. Dunn
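The flow the abstract above describes, a speech-presence probability steering noise tracking and a priori SNR estimation, can be sketched per frequency bin. The speech-presence probability here is just a given number (in the patent it comes from the DNN), and the specific update rules, a probability-weighted noise smoother, a decision-directed a priori SNR estimate, and a Wiener gain, are standard textbook forms used as illustrative assumptions.

```python
def update_noise(noise_psd, observed_power, p_speech, alpha=0.8):
    """Track the noise power estimate: update strongly when speech is likely
    absent (p_speech near 0), freeze when speech is likely present."""
    smoothing = alpha + (1 - alpha) * p_speech   # near 1.0 when speech present
    return smoothing * noise_psd + (1 - smoothing) * observed_power

def a_priori_snr(prev_gain, prev_power, noise_psd, observed_power, beta=0.98):
    """Decision-directed a priori SNR: mix the previous frame's clean-speech
    estimate with the current instantaneous (a posteriori - 1) SNR."""
    post_snr = max(observed_power / noise_psd - 1.0, 0.0)
    return beta * (prev_gain ** 2) * prev_power / noise_psd + (1 - beta) * post_snr

def wiener_gain(xi):
    """Wiener suppression gain from the a priori SNR xi."""
    return xi / (1.0 + xi)
```

With `p_speech = 0` the noise tracker adapts at rate `1 - alpha`; with `p_speech = 1` it leaves the estimate untouched, which is the behavior the DNN's speech-presence prediction is meant to enable.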
  • Patent number: 11217237
    Abstract: At least one exemplary embodiment is directed to a method and device for voice operated control with learning. The method can include measuring a first sound received from a first microphone, measuring a second sound received from a second microphone, detecting a spoken voice based on an analysis of measurements taken at the first and second microphone, learning from the analysis when the user is speaking and a speaking level in noisy environments, training a decision unit from the learning to be robust to a detection of the spoken voice in the noisy environments, mixing the first sound and the second sound to produce a mixed signal, and controlling the production of the mixed signal based on the learning of one or more aspects of the spoken voice and ambient sounds in the noisy environments.
    Type: Grant
    Filed: November 13, 2018
    Date of Patent: January 4, 2022
    Assignee: Staton Techiya, LLC
    Inventors: John Usher, Steven Goldstein, Marc Boillot
  • Patent number: 11216853
    Abstract: A method and system for advertising dynamic content in an immersive digital medium user experience operate a plurality of computer processors and databases in an associated network for receiving, processing, and communicating instructions and data relating to advertising content. The method and system execute instructions and process data relating to advertising objects, the advertising objects comprising images of objects, signs, labels, and related indicia of object origin that indicate sources for purchasing one or more advertised objects. They receive advertising instructions and data from a plurality of software applications and respond to variations in those instructions and data, whereby operation of the computer processors and databases enables swapping of advertising messages and images according to the context of the immersive digital medium user experience.
    Type: Grant
    Filed: March 27, 2020
    Date of Patent: January 4, 2022
    Inventor: Quintan Ian Pribyl
  • Patent number: 11159589
    Abstract: Described are a system, method, and computer program product for task-based teleconference management. The method includes initiating a teleconference bridge and generating a teleconference session hosted by the bridge. The method also includes connecting teleconference participants of an organization to the bridge and receiving a participant identifier for each participant. The method further includes determining an association of an organization group with each participant based on a respective participant identifier. The method further includes generating display data configured to cause a computing device to display a control interface depicting: (i) the teleconference session having groups of participants, the groups selected from predetermined groups based on task data, and each participant visually associated with its group; and (ii) labels of each participant to identify the group associated therewith.
    Type: Grant
    Filed: August 28, 2019
    Date of Patent: October 26, 2021
    Assignee: Visa International Service Association
    Inventors: Yi Shen, Trinath Anaparthi, Sangram Pattanaik
  • Patent number: 11134330
    Abstract: Embodiments of the invention determine a speech estimate using a bone conduction sensor or accelerometer, without employing voice activity detection gating of speech estimation. Speech estimation is based either exclusively on the bone conduction signal, or is performed in combination with a microphone signal. The speech estimate is then used to condition an output signal of the microphone. There are multiple use cases for speech processing in audio devices.
    Type: Grant
    Filed: July 12, 2019
    Date of Patent: September 28, 2021
    Assignee: Cirrus Logic, Inc.
    Inventors: David Leigh Watts, Brenton Robert Steele, Thomas Ivan Harvey, Vitaliy Sapozhnykov
  • Patent number: 11127416
    Abstract: A method and an apparatus for voice activity detection provided in embodiments of the present disclosure divide a to-be-detected audio file into frames to obtain a first sequence of audio frames, extract an acoustic feature from each audio frame in the first sequence, and input the acoustic feature of each audio frame to a noise-added VAD model in chronological order to obtain a probability value for each audio frame; an electronic device then determines the start and the end of the voice signal according to these probability values. During VAD detection, the start and the end of a voice signal in the audio are recognized with the noise-added VAD model, allowing them to be located accurately.
    Type: Grant
    Filed: September 6, 2019
    Date of Patent: September 21, 2021
    Inventor: Chao Li
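The last step described above, turning per-frame probabilities into a start and an end, can be sketched with simple hysteresis thresholding. The probabilities here are hand-written stand-ins for the noise-added VAD model's output, and the two thresholds are assumptions; the patent does not specify this particular decision rule.

```python
def find_speech_endpoints(probs, on_threshold=0.6, off_threshold=0.4):
    """Return (start, end) frame indices of the first voice segment in a
    sequence of per-frame speech probabilities. Hysteresis (a higher onset
    threshold than offset threshold) keeps brief dips from splitting the
    segment. Returns (None, None) if no speech is found."""
    start = end = None
    in_speech = False
    for i, p in enumerate(probs):
        if not in_speech and p >= on_threshold:
            in_speech, start = True, i
        elif in_speech and p < off_threshold:
            end = i
            break
    if in_speech and end is None:
        end = len(probs)        # speech runs to the end of the audio
    return start, end

probs = [0.1, 0.2, 0.7, 0.9, 0.8, 0.5, 0.3, 0.1]
print(find_speech_endpoints(probs))  # (2, 6)
```

Note that the frame at probability 0.5 stays inside the segment: it is below the onset threshold but above the offset threshold, which is exactly what the hysteresis is for.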
  • Patent number: 11094328
    Abstract: Various embodiments herein each include at least one of systems, methods, and software for conference audio manipulation for inclusion and accessibility. One embodiment takes the form of a method that may be performed, for example, on a server or on a participant computing device. This method includes receiving a voice signal via a network and modifying an audible characteristic of the voice signal that is perceptible when the voice signal is audibly output. The method further includes outputting the voice signal including the modified audible characteristic.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: August 17, 2021
    Assignee: NCR Corporation
    Inventor: Phil Noel Day
  • Patent number: 11087231
    Abstract: This disclosure is directed to an apparatus for intelligent matching of disparate input data received from disparate input data systems in a complex computing network for establishing targeted communication to a computing device associated with the intelligently matched disparate input data.
    Type: Grant
    Filed: December 9, 2019
    Date of Patent: August 10, 2021
    Assignee: Research Now Group, LLC
    Inventors: Melanie D. Courtright, Vincent P. Derobertis, Michael D. Bigby, William C. Robinson, Greg Ellis, Heidi D. E. Wilton, John R. Rothwell, Jeremy S. Antoniuk
  • Patent number: 11062094
    Abstract: A method of analyzing sentiments includes receiving one or more strings of text, identifying sentiments related to a first topic from the one or more strings of text, and assigning a sentiment score to each of the sentiments related to the first topic, where the sentiment score corresponds to a degree of positivity or negativity of a sentiment of the sentiments. The method further includes calculating an average sentiment score for the first topic based on the sentiment score for each of the sentiments related to the first topic, determining a percentile for the first topic based on a frequency of sentiments related to the first topic, where the percentile for the first topic is determined with respect to a maximum frequency of sentiments related to one or more other topics, and computing an X-Score based on the average sentiment score and the percentile of the first topic.
    Type: Grant
    Filed: May 7, 2019
    Date of Patent: July 13, 2021
    Assignee: LANGUAGE LOGIC, LLC
    Inventors: Rick Kieser, Charles Baylis, Serge Luyens
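The scoring arithmetic described above can be followed end to end in a few lines. The sentiment scale (-1 to 1), the percentile definition, and in particular the way the X-Score combines the two quantities are assumptions for illustration; the abstract says only that the X-Score is computed from the average sentiment score and the topic's percentile.

```python
def average_sentiment(scores):
    """Average of the per-sentiment scores for one topic (-1 .. 1 assumed)."""
    return sum(scores) / len(scores)

def topic_percentile(freq, max_freq):
    """This topic's sentiment frequency relative to the most-discussed topic,
    expressed on a 0-100 scale."""
    return 100.0 * freq / max_freq

def x_score(avg_score, percentile):
    """One plausible combination (an assumption): scale the average sentiment
    by how prominent the topic is."""
    return avg_score * percentile

scores = [0.8, 0.6, 1.0, 0.4]        # sentiments identified for the topic
avg = average_sentiment(scores)       # 0.7
pct = topic_percentile(4, 10)         # 40.0: 4 mentions vs. a max of 10
print(round(x_score(avg, pct), 6))    # 28.0
```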
  • Patent number: 11049492
    Abstract: Described herein are real-time musical translation devices (RETM) and methods of use thereof. Exemplary uses of RETMs include optimizing the understanding and/or recall of an input message for a user and improving a cognitive process in a user.
    Type: Grant
    Filed: November 10, 2020
    Date of Patent: June 29, 2021
    Assignee: YAO THE BARD, LLC
    Inventors: Leonardus H. T. Van Der Ploeg, Halley Young