Interpolation Patents (Class 704/265)
  • Patent number: 12141689
    Abstract: Systems and methods for generating a representative value of a data set by first compressing a portion of values in the data set to determine a first common value and further compressing a subset of the portion of values to determine a second common value. The representative value is generated by taking the difference between the first common value and the second common value, wherein the representative value corresponds to a mathematical relationship between the first and second common values and each value within the subset of the portion of values. The representative value requires less storage than the first and second common values.
    Type: Grant
    Filed: March 18, 2019
    Date of Patent: November 12, 2024
    Assignee: NVIDIA Corporation
    Inventor: David Rigel Garcia Garcia
  • Patent number: 12106330
    Abstract: Method and system for generating an audio clip replicating the voice of a human speaker that may be dynamically inserted as an audio clip in digitally requested media files, such as podcasts, streams, and broadcasts. Using a sample of speech from a previously recorded audio file, a streaming audio source, or a broadcast, a text-to-speech synthesis engine mimicking or cloning the voice present in the audio input is used to generate a novel audio clip, which is inserted in the requested media file.
    Type: Grant
    Filed: November 11, 2021
    Date of Patent: October 1, 2024
    Inventors: Alberto Betella, Benjamin Richardson
  • Patent number: 12086564
    Abstract: A system and method for masking an identity of a speaker of natural language speech, such as speech clips to be labeled by humans in a system generating voice transcriptions for training an automatic speech recognition model. The natural language speech is morphed prior to being presented to the human for labeling. In one embodiment, morphing comprises pitch shifting the speech randomly either up or down, then frequency shifting the speech, then pitch shifting the speech in a direction opposite the first pitch shift. Labeling the morphed speech comprises at least one or more of transcribing the morphed speech, identifying a gender of the speaker, identifying an accent of the speaker, and identifying a noise type of the morphed speech.
    Type: Grant
    Filed: November 30, 2021
    Date of Patent: September 10, 2024
    Assignee: SoundHound AI IP, LLC.
    Inventor: Dylan H. Ross
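A minimal sketch of the three-step morphing pipeline that 12086564 describes (pitch shift in a random direction, frequency shift, then a pitch shift in the opposite direction), assuming librosa and SciPy; the shift ranges, the 150 Hz offset, the placeholder input file, and the single-sideband frequency-shift helper are illustrative choices, not values from the patent.

```python
import numpy as np
import librosa
from scipy.signal import hilbert

def frequency_shift(y, sr, shift_hz):
    """Shift every spectral component by a constant offset via single-sideband modulation."""
    analytic = hilbert(y)                                        # analytic signal (negative freqs removed)
    t = np.arange(len(y)) / sr
    return np.real(analytic * np.exp(2j * np.pi * shift_hz * t))

def morph_voice(y, sr, rng=None):
    """Mask speaker identity: random pitch shift, frequency shift, opposite pitch shift."""
    rng = rng or np.random.default_rng()
    steps = rng.uniform(2.0, 4.0) * rng.choice([-1.0, 1.0])      # semitones, random direction
    y = librosa.effects.pitch_shift(y, sr=sr, n_steps=steps)     # first pitch shift
    y = frequency_shift(y, sr, shift_hz=150.0)                   # fixed offset, illustrative
    return librosa.effects.pitch_shift(y, sr=sr, n_steps=-steps) # pitch shift in the opposite direction

if __name__ == "__main__":
    y, sr = librosa.load("clip_to_label.wav", sr=16000)          # placeholder input file
    masked = morph_voice(y, sr)
```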
  • Patent number: 12080310
    Abstract: An audio encoder for encoding an audio signal has: a first encoding processor for encoding a first audio signal portion in a frequency domain, having: a time frequency converter for converting the first audio signal portion into a frequency domain representation; an analyzer for analyzing the frequency domain representation to determine first spectral portions to be encoded with a first spectral resolution and second regions to be encoded with a second resolution; and a spectral encoder for encoding the first spectral portions with the first spectral resolution and encoding the second portions with the second resolution; a second encoding processor for encoding a second different audio signal portion in the time domain; a controller for analyzing and determining which portion of the audio signal is the first audio signal portion encoded in the frequency domain and which portion is the second audio signal portion encoded in the time domain; and an encoded signal former for forming an encoded audio signal having…
    Type: Grant
    Filed: June 1, 2021
    Date of Patent: September 3, 2024
    Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
    Inventors: Sascha Disch, Martin Dietz, Markus Multrus, Guillaume Fuchs, Emmanuel Ravelli, Matthias Neusinger, Markus Schnell, Benjamin Schubert, Bernhard Grill
  • Patent number: 11929084
    Abstract: An audio encoder for encoding an audio signal has: a first encoding processor for encoding a first audio signal portion in a frequency domain, having: a time frequency converter for converting the first audio signal portion into a frequency domain representation; an analyzer for analyzing the frequency domain representation to determine first spectral portions to be encoded with a first spectral resolution and second regions to be encoded with a second resolution; and a spectral encoder for encoding the first spectral portions with the first spectral resolution and encoding the second portions with the second resolution; a second encoding processor for encoding a second different audio signal portion in the time domain; a controller for analyzing and determining which portion of the audio signal is the first audio signal portion encoded in the frequency domain and which portion is the second audio signal portion encoded in the time domain; and an encoded signal former for forming an encoded audio signal having…
    Type: Grant
    Filed: January 23, 2023
    Date of Patent: March 12, 2024
    Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
    Inventors: Sascha Disch, Martin Dietz, Markus Multrus, Guillaume Fuchs, Emmanuel Ravelli, Matthias Neusinger, Markus Schnell, Benjamin Schubert, Bernhard Grill
  • Patent number: 11847726
    Abstract: A method for outputting a blend shape value includes: performing feature extraction on obtained target audio data to obtain a target audio feature vector; inputting the target audio feature vector and a target identifier into an audio-driven animation model; inputting the target audio feature vector into an audio encoding layer, determining an input feature vector of a next layer at a (2t−n)/2 time point based on an input feature vector of a previous layer between a t time point and a t−n time point, determining a feature vector having a causal relationship with the input feature vector of the previous layer as a valid feature vector, sequentially outputting target-audio encoding features, and inputting the target identifier into a one-hot encoding layer for binary vector encoding to obtain a target-identifier encoding feature; and outputting a blend shape value corresponding to the target audio data.
    Type: Grant
    Filed: July 22, 2022
    Date of Patent: December 19, 2023
    Assignee: Nanjing Silicon Intelligence Technology Co., Ltd.
    Inventors: Huapeng Sima, Cuicui Tang, Zheng Liao
  • Patent number: 11830481
    Abstract: Methods are performed by one or more processing devices for correcting prosody in audio data. A method includes operations for accessing subject audio data in an audio edit region of the audio data. The subject audio data in the audio edit region potentially lacks prosodic continuity with unedited audio data in an unedited audio portion of the audio data. The operations further include predicting, based on a context of the unedited audio data, phoneme durations including a respective phoneme duration of each phoneme in the unedited audio data. The operations further include predicting, based on the context of the unedited audio data, a pitch contour comprising at least one respective pitch value of each phoneme in the unedited audio data. Additionally, the operations include correcting prosody of the subject audio data in the audio edit region by applying the phoneme durations and the pitch contour to the subject audio data.
    Type: Grant
    Filed: November 30, 2021
    Date of Patent: November 28, 2023
    Assignee: Adobe Inc.
    Inventors: Maxwell Morrison, Zeyu Jin, Nicholas Bryan, Juan Pablo Caceres Chomali, Lucas Rencker
  • Patent number: 11714788
    Abstract: According to an embodiment, a method of building a database in which voice signals match texts comprises providing a captcha-purposed voice signal including a first voice signal matched with a first text and a second voice signal matched with no text, sending a request for a first input text and a second input text for the captcha-purposed voice signal, when the first input text and the second input text are received, comparing the first text with the first input text, and when the first text is identical to the first input text, matching the second voice signal with the second input text and storing the match. Embodiments of the present invention may be related to artificial intelligence (AI) modules, unmanned aerial vehicles (UAVs), robots, augmented reality (AR) devices, virtual reality (VR) devices, and 5G service-related devices.
    Type: Grant
    Filed: September 17, 2019
    Date of Patent: August 1, 2023
    Assignee: LG ELECTRONICS INC.
    Inventor: Dami Kim
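A toy sketch of the matching-and-storing step in 11714788, assuming simple in-memory structures; the dataclasses, the case-insensitive comparison, and the storage format are all illustrative, not details from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class CaptchaPrompt:
    first_clip: bytes      # voice signal already matched with a known first text
    first_text: str
    second_clip: bytes     # voice signal matched with no text yet

@dataclass
class VoiceTextDB:
    entries: list = field(default_factory=list)   # (voice signal, text) pairs

    def submit(self, prompt: CaptchaPrompt, first_input: str, second_input: str) -> bool:
        """Store the unknown half only if the known half of the captcha was transcribed correctly."""
        if first_input.strip().lower() != prompt.first_text.strip().lower():
            return False                                         # captcha failed; discard both inputs
        self.entries.append((prompt.second_clip, second_input))  # match and store the second signal
        return True
```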
  • Patent number: 11537798
    Abstract: Embodiments of the present disclosure relate to a method and apparatus for generating a dialogue model. The method may include: acquiring a corpus sample set, a corpus sample including input information and target response information; classifying corpus samples in the corpus sample set, setting discrete hidden variables for the corpus samples based on a classification result to generate a training sample set, a training sample including the input information, the target response information, and a discrete hidden variable; and training a preset neural network using the training sample set to obtain the dialogue model, the dialogue model being used to represent a corresponding relationship between inputted input information and outputted target response information.
    Type: Grant
    Filed: June 8, 2020
    Date of Patent: December 27, 2022
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Siqi Bao, Huang He, Junkun Chen, Fan Wang, Hua Wu, Jingzhou He
  • Patent number: 11538455
    Abstract: Computer-implemented methods for speech synthesis are provided. A speech synthesizer may be trained to generate synthesized audio data that corresponds to words uttered by a source speaker according to speech characteristics of a target speaker. The speech synthesizer may be trained by time-stamped phoneme sequences, pitch contour data and speaker identification data. The speech synthesizer may include a voice modeling neural network and a conditioning neural network.
    Type: Grant
    Filed: February 14, 2019
    Date of Patent: December 27, 2022
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Cong Zhou, Michael Getty Horgan, Vivek Kumar, Jaime H. Morales, Cristina Michel Vasco
  • Patent number: 11488575
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech synthesis. The methods, systems, and apparatus include actions of obtaining an audio representation of speech of a target speaker, obtaining input text for which speech is to be synthesized in a voice of the target speaker, generating a speaker vector by providing the audio representation to a speaker encoder engine that is trained to distinguish speakers from one another, generating an audio representation of the input text spoken in the voice of the target speaker by providing the input text and the speaker vector to a spectrogram generation engine that is trained using voices of reference speakers to generate audio representations, and providing the audio representation of the input text spoken in the voice of the target speaker for output.
    Type: Grant
    Filed: May 17, 2019
    Date of Patent: November 1, 2022
    Assignee: Google LLC
    Inventors: Ye Jia, Zhifeng Chen, Yonghui Wu, Jonathan Shen, Ruoming Pang, Ron J. Weiss, Ignacio Lopez Moreno, Fei Ren, Yu Zhang, Quan Wang, Patrick Nguyen
  • Patent number: 11457313
    Abstract: An enhancement method for learning is defined by a combination of the intelligent application of acoustic signals and filters to enhance learning. Data from a training database can be used to modify the enhancement to a pre-recorded audio presentation portion of the learning materials. The method preferably optimizes the learning materials based on the profile of the learner and includes audio or video enhancements to improve retention of the learning material.
    Type: Grant
    Filed: September 4, 2019
    Date of Patent: September 27, 2022
    Assignee: SOCIETY OF CABLE TELECOMMUNICATIONS ENGINEERS, INC.
    Inventors: Mark Dzuban, Christopher Bastian, Margaret Bernroth
  • Patent number: 11386914
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of audio data that comprises a respective audio sample at each of a plurality of time steps. One of the methods includes, for each of the time steps: providing a current sequence of audio data as input to a convolutional subnetwork, wherein the current sequence comprises the respective audio sample at each time step that precedes the time step in the output sequence, and wherein the convolutional subnetwork is configured to process the current sequence of audio data to generate an alternative representation for the time step; and providing the alternative representation for the time step as input to an output layer, wherein the output layer is configured to: process the alternative representation to generate an output that defines a score distribution over a plurality of possible audio samples for the time step.
    Type: Grant
    Filed: September 14, 2020
    Date of Patent: July 12, 2022
    Assignee: DeepMind Technologies Limited
    Inventors: Aaron Gerard Antonius van den Oord, Sander Etienne Lea Dieleman, Nal Emmerich Kalchbrenner, Karen Simonyan, Oriol Vinyals
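A compact PyTorch sketch in the spirit of 11386914: causal convolutions turn the samples generated so far into an alternative representation, and a 1x1 output layer produces a score distribution over 256 quantized sample values from which the next sample is drawn. The layer count, channel width, and sampling loop are illustrative, not the patented architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConvNet(nn.Module):
    def __init__(self, channels=64, layers=4, quant_levels=256):
        super().__init__()
        self.embed = nn.Embedding(quant_levels, channels)
        self.convs = nn.ModuleList(
            nn.Conv1d(channels, channels, kernel_size=2, dilation=2 ** i) for i in range(layers)
        )
        self.out = nn.Conv1d(channels, quant_levels, kernel_size=1)  # output layer -> score distribution

    def forward(self, x):                       # x: (batch, time) integer sample values
        h = self.embed(x).transpose(1, 2)       # (batch, channels, time)
        for conv in self.convs:
            pad = (conv.kernel_size[0] - 1) * conv.dilation[0]
            h = h + F.relu(conv(F.pad(h, (pad, 0))))   # left padding keeps the stack causal
        return self.out(h)                      # (batch, quant_levels, time) logits

@torch.no_grad()
def sample(model, steps=1600, quant_levels=256):
    seq = torch.full((1, 1), quant_levels // 2, dtype=torch.long)   # start from "silence"
    for _ in range(steps):
        logits = model(seq)[:, :, -1]                               # scores for the next time step
        nxt = torch.distributions.Categorical(logits=logits).sample()
        seq = torch.cat([seq, nxt.unsqueeze(1)], dim=1)             # feed the sample back in
    return seq

if __name__ == "__main__":
    audio = sample(CausalConvNet())             # untrained model -> noise, but shows the loop
```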
  • Patent number: 10714097
    Abstract: A frame error concealment (FEC) method is provided. The method includes: selecting an FEC mode based on states of a current frame and a previous frame of the current frame in a time domain signal generated after time-frequency inverse transform processing; and performing corresponding time domain error concealment processing on the current frame based on the selected FEC mode, wherein the current frame is an error frame or the current frame is a normal frame when the previous frame is an error frame.
    Type: Grant
    Filed: October 5, 2018
    Date of Patent: July 14, 2020
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ho-sang Sung, Nam-suk Lee
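A small sketch of the kind of mode selection 10714097 describes, keyed on the states of the current and previous frames of the time-domain signal; the mode names and the comments describing the concealment actions are placeholders, not the patent's modes.

```python
from enum import Enum

class FrameState(Enum):
    NORMAL = 0
    ERROR = 1

def select_fec_mode(current: FrameState, previous: FrameState) -> str:
    """Pick a time-domain concealment mode from the states of the current and previous frames."""
    if current is FrameState.ERROR:
        # the frame itself was lost: conceal it, e.g. by repeating pitch periods of the previous frame
        return "conceal_erased_frame"
    if previous is FrameState.ERROR:
        # current frame decoded normally but follows a concealed one: smooth the junction
        return "smooth_after_concealment"
    return "normal_decoding"

assert select_fec_mode(FrameState.NORMAL, FrameState.ERROR) == "smooth_after_concealment"
```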
  • Patent number: 10643248
    Abstract: A content server provides a client device with audio content including an audio advertisement, which is provided in response to receiving a request for digital audio content from a client device associated with a user. The content server obtains user information about the user and retrieves advertisement text received from an advertiser, which are used to generate a personalized text advertisement. The personalized text advertisement is generated according to an advertisement template specifying an ordered combination of text components. The personalized text advertisement includes the received advertisement text, user information text selected from the obtained user information, and template text. The client device is provided with an advertisement based on the personalized text advertisement and is configured to play an audio version of the personalized text advertisement. The audio advertisement is generated using a text-to-speech algorithm at the client device or at the content server.
    Type: Grant
    Filed: April 2, 2018
    Date of Patent: May 5, 2020
    Assignee: Pandora Media, LLC
    Inventors: Shriram Bharath, Jacek Adam Krawczyk, Christopher Irwin
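A minimal sketch of template-driven assembly along the lines of 10643248: an ordered combination of template text, advertiser text, and user-information text. The component names, template content, and example values are invented for illustration.

```python
AD_TEMPLATE = ["greeting", "user_city", "advertiser_text", "call_to_action"]   # ordered components

TEMPLATE_TEXT = {
    "greeting": "Hey, listeners in",
    "call_to_action": "Tap the banner to learn more.",
}

def build_personalized_ad(user_info: dict, advertiser_text: str) -> str:
    """Assemble template text, user-information text, and advertiser text in template order."""
    parts = []
    for component in AD_TEMPLATE:
        if component == "advertiser_text":
            parts.append(advertiser_text)                     # text received from the advertiser
        elif component in TEMPLATE_TEXT:
            parts.append(TEMPLATE_TEXT[component])            # fixed template text
        else:
            parts.append(str(user_info.get(component, "")))   # text selected from user information
    return " ".join(p for p in parts if p)

ad_text = build_personalized_ad({"user_city": "Portland:"}, "fresh-roasted coffee, delivered weekly.")
# -> "Hey, listeners in Portland: fresh-roasted coffee, delivered weekly. Tap the banner to learn more."
```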
  • Patent number: 10559109
    Abstract: A skin deformation system for use in computer animation is disclosed. The skin deformation system accesses the skeleton structure of a computer generated character, and accesses a user's identification of features of the skeleton structure that may affect a skin deformation. The system also accesses the user's identification of a weighting strategy. Using the identified weighting strategy and identified features of the skeleton structure, the skin deformation system determines the degree to which each feature identified by the user may influence the deformation of a skin of the computer generated character. The skin deformation system may incorporate secondary operations including bulge, slide, scale and twist into the deformation of a skin. Information relating to a deformed skin may be stored by the skin deformation system so that the information may be used to produce a visual image for a viewer.
    Type: Grant
    Filed: October 2, 2017
    Date of Patent: February 11, 2020
    Assignee: DreamWorks Animation L.L.C.
    Inventors: Paul Carmen DiLorenzo, Matthew Christopher Gong, Arthur D. Gregory
  • Patent number: 10141008
    Abstract: A voice signal may be adjusted to mask traits such as the gender of a speaker by separating source and filter components of a voice signal using cepstral analysis, adjusting the components based on pitch and formant parameters, and synthesizing a modified signal. Features are disclosed to support real-time voice masking in a computer network by limiting computational complexity and reducing delays in processing and transmission while maintaining signal quality.
    Type: Grant
    Filed: February 26, 2018
    Date of Patent: November 27, 2018
    Assignee: Interviewing.io, Inc.
    Inventors: Andrew Tatanka Marsh, Steven Young Yi
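A single-frame numpy sketch of the cepstral source/filter split that 10141008 (and its continuation below) builds on: low-quefrency cepstral coefficients approximate the vocal-tract filter, the remainder approximates the excitation, and each part can be adjusted before resynthesis. The windowing, lifter cutoff, and the simple envelope scaling used as the "adjustment" are illustrative; a real-time system would process overlapping frames and tune pitch and formant parameters.

```python
import numpy as np

def mask_frame(frame, lifter_cutoff=30, envelope_scale=1.1):
    """Split one windowed frame into envelope (filter) and excitation (source) via the real cepstrum."""
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)))
    log_mag = np.log(np.abs(spectrum) + 1e-9)
    cepstrum = np.fft.irfft(log_mag)

    lifter = np.zeros_like(cepstrum)
    lifter[:lifter_cutoff] = 1.0
    lifter[-lifter_cutoff + 1:] = 1.0                  # keep the symmetric low-quefrency part
    envelope_ceps = cepstrum * lifter                  # vocal-tract (filter) component
    excitation_ceps = cepstrum - envelope_ceps         # residual (source) component

    # illustrative adjustment: scale the envelope component, then recombine with the excitation
    new_log_mag = np.fft.rfft(envelope_scale * envelope_ceps + excitation_ceps).real
    new_mag = np.exp(new_log_mag)
    return np.fft.irfft(new_mag * np.exp(1j * np.angle(spectrum)), n=len(frame))

frame = np.random.randn(1024)                          # stand-in for one analysis frame of speech
masked = mask_frame(frame)
```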
  • Patent number: 9947341
    Abstract: A voice signal may be adjusted to mask traits such as the gender of a speaker by separating source and filter components of a voice signal using cepstral analysis, adjusting the components based on pitch and formant parameters, and synthesizing a modified signal. Features are disclosed to support real-time voice masking in a computer network by limiting computational complexity and reducing delays in processing and transmission while maintaining signal quality.
    Type: Grant
    Filed: January 18, 2017
    Date of Patent: April 17, 2018
    Assignee: Interviewing.io, Inc.
    Inventors: Andrew Tatanka Marsh, Steven Young Yi
  • Patent number: 9905219
    Abstract: According to one embodiment, a speech synthesis apparatus is provided with generation, normalization, interpolation and synthesis units. The generation unit generates a first parameter using a prosodic control dictionary of a target speaker and one or more second parameters using a prosodic control dictionary of one or more standard speakers based on language information for an input text. The normalization unit normalizes the one or more second parameters based on a normalization parameter. The interpolation unit interpolates the first parameter and the one or more normalized second parameters based on weight information to generate a third parameter, and the synthesis unit generates synthesized speech using the third parameter.
    Type: Grant
    Filed: August 16, 2013
    Date of Patent: February 27, 2018
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kentaro Tachibana, Takehiko Kagoshima, Masahiro Morita
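A small numpy sketch of the normalize-then-interpolate step in 9905219, assuming each prosodic parameter set is a plain vector (e.g. log-F0 targets) and that the normalization parameter is a mean/scale pair mapped onto the target speaker's statistics; the weights and example values are made up.

```python
import numpy as np

def normalize(params, src_mean, src_std, tgt_mean, tgt_std):
    """Map a standard speaker's parameters onto the target speaker's statistics."""
    return (params - src_mean) / src_std * tgt_std + tgt_mean

def interpolate_prosody(target_params, standard_params_list, norm_params, weights):
    """Weighted mix of the target speaker's parameters with normalized standard-speaker parameters."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    mixed = weights[0] * target_params
    for w, params, (mean, std) in zip(weights[1:], standard_params_list, norm_params):
        mixed = mixed + w * normalize(params, mean, std, target_params.mean(), target_params.std())
    return mixed

target = np.array([5.0, 5.2, 5.1, 4.9])        # e.g. log-F0 targets from the target speaker's dictionary
standard = [np.array([4.0, 4.4, 4.2, 3.9])]    # one standard speaker's parameters for the same text
third = interpolate_prosody(target, standard, [(standard[0].mean(), standard[0].std())], [0.7, 0.3])
```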
  • Patent number: 9811881
    Abstract: A method of enhancing an image includes increasing the sampling rate of a first image to a target sampling rate to form an interpolated image. The method also includes processing a second image through a high pass filter to form a high pass features image, wherein the second image is at the target sampling rate. The method also includes extracting detail from the high pass features image relevant to the first image, merging the detail from the high pass features image with the interpolated image to form a prediction image at the target sampling rate, and outputting the prediction image.
    Type: Grant
    Filed: December 9, 2015
    Date of Patent: November 7, 2017
    Assignee: Goodrich Corporation
    Inventors: Suhail Shabbir Saquib, Christopher Gittins
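A numpy/SciPy sketch of the flow in 9811881: upsample the first image to the target sampling rate, high-pass filter the second image (already at the target rate), and merge the extracted detail into the interpolated image to form the prediction image. The Gaussian high-pass and the additive merge with a gain are illustrative stand-ins for the patent's filtering and detail-extraction steps.

```python
import numpy as np
from scipy import ndimage

def enhance(first_img, second_img, target_scale=2.0, sigma=1.5, gain=1.0):
    """Form a prediction image at the target sampling rate from a low-rate image plus high-rate detail."""
    interpolated = ndimage.zoom(first_img, target_scale, order=3)        # spline upsampling
    high_pass = second_img - ndimage.gaussian_filter(second_img, sigma)  # detail of the high-rate image

    # crop to a common shape in case the zoom result is off by a pixel
    h = min(interpolated.shape[0], high_pass.shape[0])
    w = min(interpolated.shape[1], high_pass.shape[1])
    return interpolated[:h, :w] + gain * high_pass[:h, :w]               # merge detail into the interpolation

low = np.random.rand(64, 64)      # stand-in for the first (lower-rate) image
high = np.random.rand(128, 128)   # stand-in for the second image at the target sampling rate
prediction = enhance(low, high)
```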
  • Patent number: 9786083
    Abstract: A skin deformation system for use in computer animation is disclosed. The skin deformation system accesses the skeleton structure of a computer generated character, and accesses a user's identification of features of the skeleton structure that may affect a skin deformation. The system also accesses the user's identification of a weighting strategy. Using the identified weighting strategy and identified features of the skeleton structure, the skin deformation system determines the degree to which each feature identified by the user may influence the deformation of a skin of the computer generated character. The skin deformation system may incorporate secondary operations including bulge, slide, scale and twist into the deformation of a skin. Information relating to a deformed skin may be stored by the skin deformation system so that the information may be used to produce a visual image for a viewer.
    Type: Grant
    Filed: October 7, 2011
    Date of Patent: October 10, 2017
    Assignee: DreamWorks Animation L.L.C.
    Inventors: Paul Carmen Dilorenzo, Matthew Christopher Gong, Arthur D. Gregory
  • Patent number: 9686594
    Abstract: A system, method, and apparatus to allow an operator of a broadcast communication system, such as a cable television or satellite television service to provide some examples, to diagnose performance of this communication system remotely. The operator of a first communication device, such as a cable modem termination system (CMTS) to provide an example, may remotely diagnosis performance problems, or potential performance problems, occurring at a second communication device, such as a cable modem (CM) to provide an example, or a group of second communication devices. For example, the operator of the first communication device may view a spectrum analysis of communication signals being routed to, processed by, and/or provided by the second communication device, or group of second communication devices, to diagnose the performance problems, or the potential performance problems, in real time.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: June 20, 2017
    Assignee: Avago Technologies General IP (Singapore) Pte. Ltd.
    Inventors: Ramon Alejandro Gomez, Leonard Dauphinee, Donald G. McMullin, Harold Raymond Whitehead
  • Patent number: 9502029
    Abstract: Described herein are systems and methods for context-aware speech processing. A speech context is determined based on context data associated with a user uttering speech. The speech context and the speech uttered in that speech context may be used to build acoustic models for that speech context. An acoustic model for use in speech processing may be selected based on the determined speech context. A language model for use in speech processing may also be selected based on the determined speech context. Using the acoustic and language models, the speech may be processed to recognize the speech from the user.
    Type: Grant
    Filed: June 25, 2012
    Date of Patent: November 22, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Matthew P. Bell, Yuzo Watanabe, Stephen M. Polansky
  • Patent number: 9495978
    Abstract: A method of processing a sound signal is disclosed. The method of processing a sound signal includes receiving a sound signal from the outside of a device, converting the sound signal into a first frequency domain signal, determining whether or not the sound signal is a voice signal using the first frequency domain signal acquired through the conversion, converting the first frequency domain signal into a second frequency domain signal based on the determination, and recognizing the sound signal using the second frequency domain signal acquired through the conversion.
    Type: Grant
    Filed: December 4, 2015
    Date of Patent: November 15, 2016
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Seok-hwan Jo, Do-hyung Kim, Jae-hyun Kim, Shi-hwa Lee
  • Patent number: 9373331
    Abstract: An error concealment method and apparatus for an audio signal and a decoding method and apparatus for an audio signal using the error concealment method and apparatus. The error concealment method includes selecting one of an error concealment in a frequency domain and an error concealment in a time domain as an error concealment scheme for a current frame based on a predetermined criterion when an error occurs in the current frame, selecting one of a repetition scheme and an interpolation scheme in the frequency domain as the error concealment scheme for the current frame based on a predetermined criterion when the error concealment in the frequency domain is selected, and concealing the error of the current frame using the selected scheme.
    Type: Grant
    Filed: July 2, 2013
    Date of Patent: June 21, 2016
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Eun-mi Oh, Ki-hyun Choo, Ho-sang Sung, Chang-yong Son, Jung-hoe Kim, Kang eun Lee
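A schematic sketch of the two-level choice in 9373331: first frequency- vs time-domain concealment, then repetition vs interpolation within the frequency domain. The spectral-correlation measure and thresholds standing in for the "predetermined criteria" are invented for illustration.

```python
import numpy as np

def choose_concealment(prev_spectra, stationarity_threshold=0.8, repetition_threshold=0.95):
    """Choose frequency- vs time-domain concealment, then repetition vs interpolation, for a lost frame."""
    # crude stationarity measure: cosine similarity of the two most recent good magnitude spectra
    a, b = prev_spectra[-2], prev_spectra[-1]
    corr = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    if corr < stationarity_threshold:
        return "time_domain", None                       # e.g. pitch-based waveform substitution
    if corr >= repetition_threshold:
        return "frequency_repetition", b.copy()          # reuse the previous good spectrum
    return "frequency_interpolation", 0.5 * (a + b)      # interpolate across the erased frame

scheme, spectrum = choose_concealment([np.random.rand(128), np.random.rand(128)])
```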
  • Patent number: 9336789
    Abstract: A method for determining an interpolation factor set by an electronic device is described. The method includes determining a value based on a current frame property and a previous frame property. The method also includes determining whether the value is outside of a range. The method further includes determining an interpolation factor set based on the value and a prediction mode indicator if the value is outside of the range. The method additionally includes synthesizing a speech signal.
    Type: Grant
    Filed: August 30, 2013
    Date of Patent: May 10, 2016
    Assignee: QUALCOMM Incorporated
    Inventors: Vivek Rajendran, Subasingha Shaminda Subasingha, Venkatesh Krishnan
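A toy sketch of the decision in 9336789; the specific value (here, the difference between the current and previous frame properties), the valid range, and the factor sets keyed by the prediction mode indicator are all made up for illustration.

```python
def choose_interpolation_factors(current_prop, previous_prop, prediction_mode,
                                 valid_range=(-0.5, 0.5)):
    """Pick an interpolation factor set; a special set is used only when the value leaves the range."""
    value = current_prop - previous_prop                    # value based on the two frame properties
    lo, hi = valid_range
    if lo <= value <= hi:
        return (0.25, 0.5, 0.75, 1.0)                       # default factor set
    # out-of-range: the factor set depends on the prediction mode indicator
    if prediction_mode == "predictive":
        return (0.0, 0.0, 0.5, 1.0)                         # lean on the current frame
    return (1.0, 1.0, 1.0, 1.0)                             # non-predictive: hold the current frame

print(choose_interpolation_factors(2.1, 0.4, "predictive"))
```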
  • Patent number: 9230537
    Abstract: A voice signal is synthesized using a plurality of phonetic piece data each indicating a phonetic piece containing at least two phoneme sections corresponding to different phonemes. In the apparatus, a phonetic piece adjustor forms a target section from first and second phonetic pieces so as to connect the first and second phonetic pieces to each other such that the target section includes a rear phoneme section of the first piece and a front phoneme section of the second piece, and expands the target section by a target time length to form an adjustment section such that a central part is expanded at an expansion rate higher than that of front and rear parts of the target section, to thereby create synthesized phonetic piece data having the target time length. A voice synthesizer creates a voice signal from the synthesized phonetic piece data.
    Type: Grant
    Filed: May 31, 2012
    Date of Patent: January 5, 2016
    Assignee: Yamaha Corporation
    Inventor: Keijiro Saino
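A numpy sketch of the nonuniform stretch in 9230537: a target section is expanded to a target length with its central part expanded at a higher rate than its front and rear parts. The raised-sine rate profile and the nearest-frame resampling are illustrative simplifications.

```python
import numpy as np

def expand_target_section(frames, target_len):
    """Stretch a sequence of frames to target_len, expanding the centre more than the edges."""
    n = len(frames)
    # local expansion rate: highest in the middle of the section, lower at both ends
    rate = 1.0 + np.sin(np.linspace(0.0, np.pi, n))            # 1.0 at the edges, 2.0 in the centre
    positions = np.concatenate([[0.0], np.cumsum(rate)])        # warped time axis
    positions *= (target_len - 1) / positions[-1]               # scale so the section fills target_len

    # map each output slot to the nearest source frame (a real system would interpolate frames)
    out_idx = np.searchsorted(positions, np.arange(target_len), side="right") - 1
    out_idx = np.clip(out_idx, 0, n - 1)
    return [frames[i] for i in out_idx]

stretched = expand_target_section(list(range(10)), target_len=16)   # toy "frames" are just ints
```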
  • Patent number: 9105272
    Abstract: Methods, apparatus and computer program products implement embodiments of the present invention that include receiving a time domain voice signal, and extracting a single pitch cycle from the received signal. The extracted single pitch cycle is transformed to a frequency domain, and the misclassified roots of the frequency domain are identified and corrected. Using the corrected roots, an indication of a maximum phase of the frequency domain is generated.
    Type: Grant
    Filed: June 4, 2012
    Date of Patent: August 11, 2015
    Assignees: The Lithuanian University of Health Sciences, INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Aharon Satt, Zvi Kons, Ron Hoory, Virgilijus Ulozas
  • Patent number: 9058807
    Abstract: According to one embodiment, a first storage unit stores n band noise signals obtained by applying n band-pass filters to a noise signal. A second storage unit stores n band pulse signals. A parameter input unit inputs a fundamental frequency, n band noise intensities, and a spectrum parameter. An extraction unit extracts, for each pitch mark, the n band noise signals while shifting them. An amplitude control unit changes amplitudes of the extracted band noise signals and band pulse signals in accordance with the band noise intensities. A generation unit generates a mixed sound source signal by adding the n band noise signals and the n band pulse signals; the mixed sound source signal is generated based on the pitch marks. A vocal tract filter unit generates a speech waveform by applying a vocal tract filter using the spectrum parameter to the generated mixed sound source signal.
    Type: Grant
    Filed: March 18, 2011
    Date of Patent: June 16, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
  • Patent number: 9020812
    Abstract: Disclosed is an audio signal processing method comprising the steps of: receiving an audio signal containing current frame data; generating a first temporary output signal for the current frame when an error occurs in the current frame data, by carrying out frame error concealment with respect to the current frame data using a random codebook; generating a parameter by carrying out one or more of short-term prediction, long-term prediction and a fixed codebook search based on the first temporary output signal; and updating a memory with the parameter for the next frame; wherein the parameter comprises one or more of pitch gain, pitch delay, fixed codebook gain and a fixed codebook.
    Type: Grant
    Filed: November 24, 2010
    Date of Patent: April 28, 2015
    Assignees: LG Electronics Inc., Industry-Academic Cooperation Foundation, Yonsei University
    Inventors: Hye Jeong Jeon, Dae Hwan Kim, Hong Goo Kang, Min Ki Lee, Byung Suk Lee, Gyu Hyeok Jeong
  • Publication number: 20150112688
    Abstract: A system and method may be configured to reconstruct an audio signal from transformed audio information. The audio signal may be resynthesized based on individual harmonics and corresponding pitches determined from the transformed audio information. Noise may be subtracted from the transformed audio information by interpolating across peak points and across trough points of harmonic pitch paths through the transformed audio information, and subtracting values associated with the trough point interpolations from values associated with the peak point interpolations. Noise between harmonics of the sound may be suppressed in the transformed audio information by centering functions at individual harmonics in the transformed audio information, the functions serving to suppress noise between the harmonics.
    Type: Application
    Filed: December 22, 2014
    Publication date: April 23, 2015
    Applicant: THE INTELLISIS CORPORATION
    Inventors: David C. BRADLEY, Daniel S. GOLDIN, Robert N. HILTON, Nicholas K. FISHER, Rodney GATEAU
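A one-dimensional numpy sketch of the noise-subtraction idea in 20150112688: interpolate across the peak points and across the trough points of a harmonic amplitude track, then subtract the trough interpolation from the peak interpolation. The toy track and SciPy peak picking are illustrative; the application works on harmonic pitch paths through the transformed audio information.

```python
import numpy as np
from scipy.signal import find_peaks

def peak_trough_denoise(track):
    """Estimate signal-above-noise along one harmonic amplitude track."""
    x = np.arange(len(track))
    peaks, _ = find_peaks(track)
    troughs, _ = find_peaks(-track)
    if len(peaks) < 2 or len(troughs) < 2:
        return track - track.min()                        # not enough structure; crude fallback
    peak_env = np.interp(x, peaks, track[peaks])          # interpolation across peak points
    trough_env = np.interp(x, troughs, track[troughs])    # interpolation across trough points (noise floor)
    return np.clip(peak_env - trough_env, 0.0, None)      # subtract trough values from peak values

t = np.linspace(0, 1, 400)
track = np.abs(np.sin(2 * np.pi * 20 * t)) + 0.3 * np.random.rand(400)   # toy harmonic amplitude track
clean = peak_trough_denoise(track)
```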
  • Patent number: 8996378
    Abstract: In a voice synthesis apparatus, a phoneme piece interpolator acquires first phoneme piece data corresponding to a first value of sound characteristic, and second phoneme piece data corresponding to a second value of the sound characteristic. The first and second phoneme piece data indicate a spectrum of each frame of a phoneme piece. The phoneme piece interpolator interpolates between each frame of the first phoneme piece data and each frame of the second phoneme piece data so as to create phoneme piece data of the phoneme piece corresponding to a target value of the sound characteristic which is different from either of the first and second values of the sound characteristic. A voice synthesizer generates a voice signal having the target value of the sound characteristic based on the created phoneme piece data.
    Type: Grant
    Filed: May 24, 2012
    Date of Patent: March 31, 2015
    Assignee: Yamaha Corporation
    Inventors: Jordi Bonada, Merlijn Blaauw, Makoto Tachibana
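A numpy sketch of the frame-by-frame interpolation in 8996378, assuming each phoneme piece is a sequence of magnitude-spectrum frames and that the sound characteristic is a scalar such as pitch; the linear weighting toward the target value and the frame alignment are illustrative simplifications.

```python
import numpy as np

def interpolate_pieces(piece_a, piece_b, value_a, value_b, target_value):
    """Blend corresponding frames of two phoneme pieces toward a target characteristic value."""
    # the weight of piece_b grows as the target moves from value_a toward value_b
    w = np.clip((target_value - value_a) / (value_b - value_a), 0.0, 1.0)
    n = min(len(piece_a), len(piece_b))                    # align on the shorter piece
    return [(1.0 - w) * a + w * b for a, b in zip(piece_a[:n], piece_b[:n])]

piece_lo = [np.random.rand(257) for _ in range(40)]        # spectra recorded at a low pitch
piece_hi = [np.random.rand(257) for _ in range(38)]        # same phoneme piece at a high pitch
blended = interpolate_pieces(piece_lo, piece_hi, value_a=120.0, value_b=220.0, target_value=180.0)
```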
  • Patent number: 8990094
    Abstract: An electronic device for coding a transient frame is described. The electronic device includes a processor and executable instructions stored in memory that is in electronic communication with the processor. The electronic device obtains a current transient frame. The electronic device also obtains a residual signal based on the current transient frame. Additionally, the electronic device determines a set of peak locations based on the residual signal. The electronic device further determines whether to use a first coding mode or a second coding mode for coding the current transient frame based on at least the set of peak locations. The electronic device also synthesizes an excitation based on the first coding mode if the first coding mode is determined. The electronic device also synthesizes an excitation based on the second coding mode if the second coding mode is determined.
    Type: Grant
    Filed: September 8, 2011
    Date of Patent: March 24, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Venkatesh Krishnan, Ananthapadmanabhan Arasanipalai Kandhadai
  • Patent number: 8868423
    Abstract: Systems and methods for controlling access to resources using spoken Completely Automatic Public Turing Tests To Tell Humans And Computers Apart (CAPTCHA) tests are disclosed. In these systems and methods, entities seeking access to resources are required to produce an input utterance that contains at least some audio. That utterance is compared with voice reference data for human and machine entities, and a determination is made as to whether the entity requesting access is a human or a machine. Access is then permitted or refused based on that determination.
    Type: Grant
    Filed: July 11, 2013
    Date of Patent: October 21, 2014
    Assignee: John Nicholas and Kristin Gross Trust
    Inventor: John Nicholas Gross
  • Patent number: 8862461
    Abstract: In one embodiment, a method executed by at least one processor includes receiving text submitted by a user. The method also includes determining a text score for the received text by comparing a first set of phrases included in the received text to a second set of phrases. The second set of phrases includes phrases from stored text. The stored text includes stored text known to be genuine and stored text known to be fraudulent. The method also includes determining that the received text is fraudulent based on the text score.
    Type: Grant
    Filed: November 30, 2011
    Date of Patent: October 14, 2014
    Assignee: Match.com, LP
    Inventors: Aaron J. de Zeeuw, Clark T. Rothrock, Jason L. Alexander
  • Patent number: 8825476
    Abstract: Provided are a method and apparatus for encoding and decoding a high frequency signal by using a low frequency signal. The high frequency signal can be encoded by extracting a coefficient through linear prediction of the high frequency signal, encoding the coefficient, generating a signal by using the extracted coefficient and a low frequency signal, and encoding the high frequency signal by calculating a ratio between the high frequency signal and an energy value of the generated signal. Also, the high frequency signal can be decoded by decoding a coefficient, which is extracted by linear prediction of a high frequency signal, and a low frequency signal, generating a signal by using the decoded coefficient and the decoded low frequency signal, and adjusting the generated signal by decoding a ratio between the generated signal and an energy value of the high frequency signal.
    Type: Grant
    Filed: April 8, 2013
    Date of Patent: September 2, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ki-hyun Choo, Lei Miao, Eun-mi Oh
  • Patent number: 8812316
    Abstract: A speech control system that can recognize a spoken command and associated words (such as “call mom at home”) and can cause a selected application (such as a telephone dialer) to execute the command to cause a data processing system, such as a smartphone, to perform an operation based on the command (such as look up mom's phone number at home and dial it to establish a telephone call). The speech control system can use a set of interpreters to repair recognized text from a speech recognition system, and results from the set can be merged into a final repaired transcription which is provided to the selected application.
    Type: Grant
    Filed: June 5, 2014
    Date of Patent: August 19, 2014
    Assignee: Apple Inc.
    Inventor: Lik Harry Chen
  • Patent number: 8805695
    Abstract: A bandwidth expansion method and apparatus are disclosed, where the method includes: estimating a bandwidth of at least one decoded frame of a whole-band signal, so as to obtain an estimated bandwidth, where the estimated bandwidth corresponds to a whole-band signal that a decoded lower-band signal needs to be extended into; performing first predictive decoding on a part of the lower-band signal in a band above an effective bandwidth of the lower-band signal and below the estimated bandwidth, so as to obtain the part of the lower-band signal above the effective bandwidth of the lower-band signal and below the estimated bandwidth; and performing second predictive decoding on a part of the lower-band signal in a band above the estimated bandwidth, so as to obtain the part of the lower-band signal above the estimated bandwidth.
    Type: Grant
    Filed: July 22, 2013
    Date of Patent: August 12, 2014
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Zexin Liu, Lei Miao
  • Patent number: 8762156
    Abstract: A speech control system that can recognize a spoken command and associated words (such as “call mom at home”) and can cause a selected application (such as a telephone dialer) to execute the command to cause a data processing system, such as a smartphone, to perform an operation based on the command (such as look up mom's phone number at home and dial it to establish a telephone call). The speech control system can use a set of interpreters to repair recognized text from a speech recognition system, and results from the set can be merged into a final repaired transcription which is provided to the selected application.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: June 24, 2014
    Assignee: Apple Inc.
    Inventor: Lik Harry Chen
  • Patent number: 8731913
    Abstract: A method for overlap-adding signals useful for performing frame loss concealment (FLC) in an audio decoder as well as in other applications. The method uses a dynamic mix of windows to overlap two signals whose normalized cross-correlation may vary from zero to one. If the overlapping signals are decomposed into a correlated component and an uncorrelated component, they are overlap-added separately using the appropriate window, and then added together. If the overlapping signals are not decomposed, a weighted mix of windows is used. The mix is determined by a measure estimating the amount of cross-correlation between overlapping signals, or the relative amount of correlated to uncorrelated signals.
    Type: Grant
    Filed: April 13, 2007
    Date of Patent: May 20, 2014
    Assignee: Broadcom Corporation
    Inventors: Robert W. Zopf, Juin-Hwey Chen
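A numpy sketch of the window-mixing idea in 8731913 for the case where the overlapping signals are not decomposed: estimate their normalized cross-correlation and blend an amplitude-complementary window pair (appropriate for correlated signals) with a power-complementary pair (appropriate for uncorrelated signals). The raised-cosine fades and the linear mix are illustrative choices.

```python
import numpy as np

def dynamic_overlap_add(tail, head):
    """Overlap-add two equal-length segments with a window mix set by their cross-correlation."""
    n = len(tail)
    rho = np.dot(tail, head) / (np.linalg.norm(tail) * np.linalg.norm(head) + 1e-12)
    rho = float(np.clip(rho, 0.0, 1.0))                    # estimated amount of correlation

    fade = 0.5 - 0.5 * np.cos(np.pi * np.arange(n) / (n - 1))   # raised cosine, 0 -> 1
    amp_out, amp_in = 1.0 - fade, fade                     # amplitude-complementary pair (sums to 1)
    pow_out, pow_in = np.sqrt(1.0 - fade), np.sqrt(fade)   # power-complementary pair (squares sum to 1)

    w_out = rho * amp_out + (1.0 - rho) * pow_out          # weighted mix of the two window types
    w_in = rho * amp_in + (1.0 - rho) * pow_in
    return tail * w_out + head * w_in

merged = dynamic_overlap_add(np.random.randn(160), np.random.randn(160))
```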
  • Patent number: 8706493
    Abstract: In one embodiment of a controllable prosody re-estimation system, a TTS/STS engine consists of a prosody prediction/estimation module, a prosody re-estimation module and a speech synthesis module. The prosody prediction/estimation module generates predicted or estimated prosody information. And then the prosody re-estimation module re-estimates the predicted or estimated prosody information and produces new prosody information, according to a set of controllable parameters provided by a controllable prosody parameter interface. The new prosody information is provided to the speech synthesis module to produce a synthesized speech.
    Type: Grant
    Filed: July 11, 2011
    Date of Patent: April 22, 2014
    Assignee: Industrial Technology Research Institute
    Inventors: Cheng-Yuan Lin, Chien-Hung Huang, Chih-Chung Kuo
  • Patent number: 8655663
    Abstract: An audio signal interpolation device is presented, including an input unit for receiving an input audio signal, a phase splitting unit for splitting the input audio signal, a high range interpolation unit for interpolating a high range component into the signal, a phase combining unit for combining an in-phase component signal with a differential phase component, a high-pass filter for high-pass filtering the audio signal output by the phase combining unit, a delay unit for producing a delayed audio signal, and an addition processing unit for adding the delayed audio signal to the audio signal output from the high-pass filter.
    Type: Grant
    Filed: September 29, 2008
    Date of Patent: February 18, 2014
    Assignee: D&M Holdings, Inc.
    Inventors: Masaki Matsuoka, Shigeki Namiki
  • Patent number: 8626809
    Abstract: A method and an apparatus for digital up-down conversion using an Infinite Impulse Response (IIR) filter are provided. The method for digital up-down conversion for frequency conversion in a mobile communication system using plural frequency converters includes IIR-filtering, by a magnitude response IIR filter having the same magnitude response as in Finite Impulse Response (FIR) filtering, an input signal and a stable filter coefficient calculated according to a Levinson polynomial; and receiving, by the magnitude response IIR filter, the IIR filtered signal, and performing IIR filtering by a phase compensation IIR filter having a filter coefficient compensating for a non-linear phase to a linear phase.
    Type: Grant
    Filed: February 24, 2010
    Date of Patent: January 7, 2014
    Assignees: Samsung Electronics Co., Ltd, Soongsil University
    Inventors: Jun-Seok Yang, Won-Cheol Lee, Hyung-Min Jang
  • Patent number: 8620646
    Abstract: A system and method may be configured to analyze audio information derived from an audio signal. The system and method may track sound pitch across the audio signal. The tracking of pitch across the audio signal may take into account change in pitch by determining at individual time sample windows in the signal duration an estimated pitch and a representation of harmonic envelope at the estimated pitch. The estimated pitch and the representation of harmonic envelope may then be implemented to determine an estimated pitch for another time sample window in the signal duration with an enhanced accuracy and/or precision.
    Type: Grant
    Filed: August 8, 2011
    Date of Patent: December 31, 2013
    Assignee: The Intellisis Corporation
    Inventors: David C. Bradley, Rodney Gateau, Daniel S. Goldin, Robert N. Hilton, Nicholas K. Fisher
  • Patent number: 8589166
    Abstract: Systems and methods are described for performing packet loss concealment (PLC) to mitigate the effect of one or more lost frames within a series of frames that represent a speech signal. In accordance with the exemplary systems and methods, PLC is performed by searching a codebook of speech-related parameter profiles to identify content that is being spoken and by selecting a profile associated with the identified content for use in predicting or estimating speech-related parameter information associated with one or more lost frames of a speech signal. The predicted/estimated speech-related parameter information is then used to synthesize one or more frames to replace the lost frame(s) of the speech signal.
    Type: Grant
    Filed: September 21, 2010
    Date of Patent: November 19, 2013
    Assignee: Broadcom Corporation
    Inventor: Robert W. Zopf
  • Publication number: 20130282378
    Abstract: The invention provides a system, method, and business model for an information system and service having business self-promotion, promotion and promotion tracking, loyalty or frequent-participant rewards and redemption, audio coupons, ratings, and other features. Consumers call into the service for a business or organization using an ordinary telephone, PC, PDA, or other information appliance and make requests in plain speech for information on goods and/or services, and the service provides responses to the requests in plain speech in real time.
    Type: Application
    Filed: August 1, 2005
    Publication date: October 24, 2013
    Inventors: Ahmet Alpdemir, Arthur James
  • Publication number: 20130246068
    Abstract: Disclosed are a method and apparatus for decoding an audio signal using an adaptive codebook update. The method for decoding an audio signal includes: receiving an N+1-th frame that is a normal frame transmitted after an N-th frame that is a loss frame; determining whether or not an adaptive codebook of a final subframe of the N-th frame is updated by using the N-th frame and the N+1-th frame; updating the adaptive codebook of the final subframe of the N-th frame by using a pitch index of the N+1-th frame; and synthesizing an audio signal by using the N+1-th frame.
    Type: Application
    Filed: September 28, 2011
    Publication date: September 19, 2013
    Applicant: Electronics and Telecommunications Research Institute
    Inventor: Mi-Suk Lee
  • Patent number: 8494854
    Abstract: An audible based electronic challenge system is used to control access to a computing resource by using a test to identify an origin of a voice. The test is based on analyzing a spoken utterance using optimized challenge items selected for their discrimination capability to determine if it was articulated by an unauthorized human or a text to speech (TTS) system.
    Type: Grant
    Filed: June 15, 2009
    Date of Patent: July 23, 2013
    Assignee: John Nicholas and Kristin Gross
    Inventor: John Nicholas Gross
  • Patent number: 8489399
    Abstract: An audible based electronic challenge system is used to control access to a computing resource by using a test to identify an origin of a voice. The test is based on analyzing a spoken utterance to determine if it was articulated by an unauthorized human or a text to speech (TTS) system.
    Type: Grant
    Filed: June 15, 2009
    Date of Patent: July 16, 2013
    Assignee: John Nicholas and Kristin Gross Trust
    Inventor: John Nicholas Gross
  • Patent number: 8473301
    Abstract: A method for decoding an audio signal includes: obtaining a lower-band signal component of an audio signal corresponding to a received code stream when the audio signal switches from a first bandwidth to a second bandwidth which is narrower than the first bandwidth; extending the lower-band signal component to obtain higher-band information; performing a time-varying fadeout process on the higher-band information to obtain a processed higher-band signal component; and synthesizing the processed higher-band signal component and the obtained lower-band signal component. With the methods provided in the embodiments of the invention, when an audio signal switches from broadband to narrowband, a series of processes such as bandwidth detection, artificial band extension, a time-varying fadeout process, and bandwidth synthesis may be performed so that the switch makes a smooth transition from a broadband signal to a narrowband signal and a comfortable listening experience may be achieved.
    Type: Grant
    Filed: May 1, 2010
    Date of Patent: June 25, 2013
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Zhe Chen, Fuliang Yin, Xiaoyu Zhang, Jinliang Dai, Libin Zhang