Interpolation Patents (Class 704/265)
-
Patent number: 12141689
Abstract: Systems and methods for generating a representative value of a data set by first compressing a portion of values in the data set to determine a first common value and further compressing a subset of the portion of values to determine a second common value. The representative value is generated by taking the difference between the first common value and the second common value, wherein the representative value corresponds to a mathematical relationship between the first and second common values and each value within the subset of the portion of values. The representative value requires less storage than the first and second common values.
Type: Grant
Filed: March 18, 2019
Date of Patent: November 12, 2024
Assignee: NVIDIA Corporation
Inventor: David Rigel Garcia Garcia
-
Patent number: 12106330
Abstract: Method and system for generation of audio clip replicating the voice of a human speaker that may be dynamically inserted as an audio clip in digitally requested media files, such as podcasts, streams and broadcasts. Using a sample of speech from a previously-recorded audio file, a streaming audio source or a broadcast, a text-to-speech synthesis engine mimicking or cloning the voice present in the audio input is used to generate novel audio clip which is inserted in the requested media file.
Type: Grant
Filed: November 11, 2021
Date of Patent: October 1, 2024
Inventors: Alberto Betella, Benjamin Richardson
-
Patent number: 12086564
Abstract: A system and method for masking an identity of a speaker of natural language speech, such as speech clips to be labeled by humans in a system generating voice transcriptions for training an automatic speech recognition model. The natural language speech is morphed prior to being presented to the human for labeling. In one embodiment, morphing comprises pitch shifting the speech randomly either up or down, then frequency shifting the speech, then pitch shifting the speech in a direction opposite the first pitch shift. Labeling the morphed speech comprises at least one or more of transcribing the morphed speech, identifying a gender of the speaker, identifying an accent of the speaker, and identifying a noise type of the morphed speech.
Type: Grant
Filed: November 30, 2021
Date of Patent: September 10, 2024
Assignee: SoundHound AI IP, LLC.
Inventor: Dylan H. Ross
-
Patent number: 12080310
Abstract: An audio encoder for encoding an audio signal has: a first encoding processor for encoding a first audio signal portion in a frequency domain, having: a time frequency converter for converting the first audio signal portion into a frequency domain representation; an analyzer for analyzing the frequency domain representation to determine first spectral portions to be encoded with a first spectral resolution and second regions to be encoded with a second resolution; and a spectral encoder for encoding the first spectral portions with the first spectral resolution and encoding the second portions with the second resolution; a second encoding processor for encoding a second different audio signal portion in the time domain; a controller for analyzing and determining, which portion of the audio signal is the first audio signal portion encoded in the frequency domain and which portion is the second audio signal portion encoded in the time domain; and an encoded signal former for forming an encoded audio signal havi…
Type: Grant
Filed: June 1, 2021
Date of Patent: September 3, 2024
Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
Inventors: Sascha Disch, Martin Dietz, Markus Multrus, Guillaume Fuchs, Emmanuel Ravelli, Matthias Neusinger, Markus Schnell, Benjamin Schubert, Bernhard Grill
-
Patent number: 11929084
Abstract: An audio encoder for encoding an audio signal has: a first encoding processor for encoding a first audio signal portion in a frequency domain, having: a time frequency converter for converting the first audio signal portion into a frequency domain representation; an analyzer for analyzing the frequency domain representation to determine first spectral portions to be encoded with a first spectral resolution and second regions to be encoded with a second resolution; and a spectral encoder for encoding the first spectral portions with the first spectral resolution and encoding the second portions with the second resolution; a second encoding processor for encoding a second different audio signal portion in the time domain; a controller for analyzing and determining, which portion of the audio signal is the first audio signal portion encoded in the frequency domain and which portion is the second audio signal portion encoded in the time domain; and an encoded signal former for forming an encoded audio signal havi…
Type: Grant
Filed: January 23, 2023
Date of Patent: March 12, 2024
Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
Inventors: Sascha Disch, Martin Dietz, Markus Multrus, Guillaume Fuchs, Emmanuel Ravelli, Matthias Neusinger, Markus Schnell, Benjamin Schubert, Bernhard Grill
-
Patent number: 11847726
Abstract: A method for outputting a blend shape value includes: performing feature extraction on obtained target audio data to obtain a target audio feature vector; inputting the target audio feature vector and a target identifier into an audio-driven animation model; inputting the target audio feature vector into an audio encoding layer, determining an input feature vector of a next layer at a (2t−n)/2 time point based on an input feature vector of a previous layer between a t time point and a t−n time point, determining a feature vector having a causal relationship with the input feature vector of the previous layer as a valid feature vector, outputting sequentially target-audio encoding features, and inputting the target identifier into a one-hot encoding layer for binary vector encoding to obtain a target-identifier encoding feature; and outputting a blend shape value corresponding to the target audio data.
Type: Grant
Filed: July 22, 2022
Date of Patent: December 19, 2023
Assignee: Nanjing Silicon Intelligence Technology Co., Ltd.
Inventors: Huapeng Sima, Cuicui Tang, Zheng Liao
-
Patent number: 11830481
Abstract: Methods are performed by one or more processing devices for correcting prosody in audio data. A method includes operations for accessing subject audio data in an audio edit region of the audio data. The subject audio data in the audio edit region potentially lacks prosodic continuity with unedited audio data in an unedited audio portion of the audio data. The operations further include predicting, based on a context of the unedited audio data, phoneme durations including a respective phoneme duration of each phoneme in the unedited audio data. The operations further include predicting, based on the context of the unedited audio data, a pitch contour comprising at least one respective pitch value of each phoneme in the unedited audio data. Additionally, the operations include correcting prosody of the subject audio data in the audio edit region by applying the phoneme durations and the pitch contour to the subject audio data.
Type: Grant
Filed: November 30, 2021
Date of Patent: November 28, 2023
Assignee: Adobe Inc.
Inventors: Maxwell Morrison, Zeyu Jin, Nicholas Bryan, Juan Pablo Caceres Chomali, Lucas Rencker
-
Patent number: 11714788
Abstract: According to an embodiment, a method of building a database in which voice signals match texts comprises providing a captcha-purposed voice signal including a first voice signal matched with a first text and a second voice signal matched with no text, sending a request for a first input text and a second input text for the captcha-purposed voice signal, when the first input text and the second input text are received, comparing the first text with the first input text, and when the first text is identical to the first input text, matching the second voice signal with the second input text and storing the match. Embodiments of the present invention may be related to artificial intelligence (AI) modules, unmanned aerial vehicles (UAVs), robots, augmented reality (AR) devices, virtual reality (VR) devices, and 5G service-related devices.
Type: Grant
Filed: September 17, 2019
Date of Patent: August 1, 2023
Assignee: LG ELECTRONICS INC.
Inventor: Dami Kim
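The matching rule in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name, the clip identifier, and the dictionary used as a "database" are all hypothetical, and the comparison is a plain exact-string check because the abstract only says the texts must be identical.

```python
def process_captcha_response(known_text, unlabeled_clip_id,
                             first_input, second_input, database):
    """Store the user's text for the unlabeled clip only when the text
    typed for the known clip is identical to the known text.

    `database` is a plain dict standing in for real storage (assumption).
    Returns True when the match was stored, False otherwise.
    """
    if first_input == known_text:
        database[unlabeled_clip_id] = second_input
        return True
    return False


# Hypothetical example data.
db = {}
accepted = process_captcha_response(
    "open the door", "clip-0042", "open the door", "turn left", db)
rejected = process_captcha_response(
    "open the door", "clip-0043", "open a door", "go home", db)
```

Only the first response is stored, since the second user mistyped the clip whose text was already known.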
-
Patent number: 11537798
Abstract: Embodiments of the present disclosure relate to a method and apparatus for generating a dialogue model. The method may include: acquiring a corpus sample set, a corpus sample including input information and target response information; classifying corpus samples in the corpus sample set, setting discrete hidden variables for the corpus samples based on a classification result to generate a training sample set, a training sample including the input information, the target response information, and a discrete hidden variable; and training a preset neural network using the training sample set to obtain the dialogue model, the dialogue model being used to represent a corresponding relationship between inputted input information and outputted target response information.
Type: Grant
Filed: June 8, 2020
Date of Patent: December 27, 2022
Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
Inventors: Siqi Bao, Huang He, Junkun Chen, Fan Wang, Hua Wu, Jingzhou He
-
Patent number: 11538455
Abstract: Computer-implemented methods for speech synthesis are provided. A speech synthesizer may be trained to generate synthesized audio data that corresponds to words uttered by a source speaker according to speech characteristics of a target speaker. The speech synthesizer may be trained by time-stamped phoneme sequences, pitch contour data and speaker identification data. The speech synthesizer may include a voice modeling neural network and a conditioning neural network.
Type: Grant
Filed: February 14, 2019
Date of Patent: December 27, 2022
Assignee: Dolby Laboratories Licensing Corporation
Inventors: Cong Zhou, Michael Getty Horgan, Vivek Kumar, Jaime H. Morales, Cristina Michel Vasco
-
Patent number: 11488575
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech synthesis. The methods, systems, and apparatus include actions of obtaining an audio representation of speech of a target speaker, obtaining input text for which speech is to be synthesized in a voice of the target speaker, generating a speaker vector by providing the audio representation to a speaker encoder engine that is trained to distinguish speakers from one another, generating an audio representation of the input text spoken in the voice of the target speaker by providing the input text and the speaker vector to a spectrogram generation engine that is trained using voices of reference speakers to generate audio representations, and providing the audio representation of the input text spoken in the voice of the target speaker for output.
Type: Grant
Filed: May 17, 2019
Date of Patent: November 1, 2022
Assignee: Google LLC
Inventors: Ye Jia, Zhifeng Chen, Yonghui Wu, Jonathan Shen, Ruoming Pang, Ron J. Weiss, Ignacio Lopez Moreno, Fei Ren, Yu Zhang, Quan Wang, Patrick Nguyen
-
Patent number: 11457313
Abstract: An enhancement method for learning is defined by a combination of intelligent application acoustic signals and filters to enhance learning. Data from a training database can be used to modify the enhancement to a pre-recorded audio presentation portion of the learning materials. The method preferably optimizes the learning materials based on the profile of the learner and includes audio or video enhancements to improve retention of the learning material.
Type: Grant
Filed: September 4, 2019
Date of Patent: September 27, 2022
Assignee: SOCIETY OF CABLE TELECOMMUNICATIONS ENGINEERS, INC.
Inventors: Mark Dzuban, Christopher Bastian, Margaret Bernroth
-
Patent number: 11386914
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of audio data that comprises a respective audio sample at each of a plurality of time steps. One of the methods includes, for each of the time steps: providing a current sequence of audio data as input to a convolutional subnetwork, wherein the current sequence comprises the respective audio sample at each time step that precedes the time step in the output sequence, and wherein the convolutional subnetwork is configured to process the current sequence of audio data to generate an alternative representation for the time step; and providing the alternative representation for the time step as input to an output layer, wherein the output layer is configured to: process the alternative representation to generate an output that defines a score distribution over a plurality of possible audio samples for the time step.
Type: Grant
Filed: September 14, 2020
Date of Patent: July 12, 2022
Assignee: DeepMind Technologies Limited
Inventors: Aaron Gerard Antonius van den Oord, Sander Etienne Lea Dieleman, Nal Emmerich Kalchbrenner, Karen Simonyan, Oriol Vinyals
-
Patent number: 10714097
Abstract: A frame error concealment (FEC) method is provided. The method includes: selecting an FEC mode based on states of a current frame and a previous frame of the current frame in a time domain signal generated after time-frequency inverse transform processing; and performing corresponding time domain error concealment processing on the current frame based on the selected FEC mode, wherein the current frame is an error frame or the current frame is a normal frame when the previous frame is an error frame.
Type: Grant
Filed: October 5, 2018
Date of Patent: July 14, 2020
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Ho-sang Sung, Nam-suk Lee
-
Patent number: 10643248
Abstract: A content server provides a client device with audio content including an audio advertisement, which is provided in response to receiving a request for digital audio content from a client device associated with a user. The content server obtains user information about the user and retrieves advertisement text received from an advertiser, which are used to generate a personalized text advertisement. The personalized text advertisement is generated according to an advertisement template specifying an ordered combination of text components. The personalized text advertisement includes the received advertisement text, user information text selected from the obtained user information, and template text. The client device is provided with an advertisement based on the personalized text advertisement and is configured to play an audio version of the personalized text advertisement. The audio advertisement is generated using a text-to-speech algorithm at the client device or at the content server.
Type: Grant
Filed: April 2, 2018
Date of Patent: May 5, 2020
Assignee: Pandora Media, LLC
Inventors: Shriram Bharath, Jacek Adam Krawczyk, Christopher Irwin
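As a rough illustration of "an advertisement template specifying an ordered combination of text components", the sketch below assembles a personalized text from three kinds of components. The component kinds (`"template"`, `"user"`, `"ad"`), the function name, and the sample data are assumptions made for this example; the text-to-speech step is omitted.

```python
def build_personalized_ad(template, ad_text, user_info):
    """Concatenate text components in the order given by the template.

    `template` is a list of (kind, value) pairs (hypothetical format):
      ("template", text) -> literal template text
      ("user", key)      -> text looked up in `user_info`
      ("ad", None)       -> the advertiser-supplied text
    """
    parts = []
    for kind, value in template:
        if kind == "template":
            parts.append(value)
        elif kind == "user":
            parts.append(user_info[value])
        elif kind == "ad":
            parts.append(ad_text)
        else:
            raise ValueError(f"unknown component kind: {kind!r}")
    return "".join(parts)


# Hypothetical example data.
template = [
    ("template", "Hey "),
    ("user", "first_name"),
    ("template", ", here is an offer near "),
    ("user", "city"),
    ("template", ": "),
    ("ad", None),
]
user_info = {"first_name": "Ana", "city": "Lisbon"}
ad = build_personalized_ad(
    template, "Two-for-one coffee at Example Cafe.", user_info)
```

The resulting string would then be passed to whatever text-to-speech engine the client device or content server uses.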
-
Patent number: 10559109
Abstract: A skin deformation system for use in computer animation is disclosed. The skin deformation system accesses the skeleton structure of a computer generated character, and accesses a user's identification of features of the skeleton structure that may affect a skin deformation. The system also accesses the user's identification of a weighting strategy. Using the identified weighting strategy and identified features of the skeleton structure, the skin deformation system determines the degree to which each feature identified by the user may influence the deformation of a skin of the computer generated character. The skin deformation system may incorporate secondary operations including bulge, slide, scale and twist into the deformation of a skin. Information relating to a deformed skin may be stored by the skin deformation system so that the information may be used to produce a visual image for a viewer.
Type: Grant
Filed: October 2, 2017
Date of Patent: February 11, 2020
Assignee: DreamWorks Animation L.L.C.
Inventors: Paul Carmen DiLorenzo, Matthew Christopher Gong, Arthur D. Gregory
-
Patent number: 10141008
Abstract: A voice signal may be adjusted to mask traits such as the gender of a speaker by separating source and filter components of a voice signal using cepstral analysis, adjusting the components based on pitch and formant parameters, and synthesizing a modified signal. Features are disclosed to support real-time voice masking in a computer network by limiting computational complexity and reducing delays in processing and transmission while maintaining signal quality.
Type: Grant
Filed: February 26, 2018
Date of Patent: November 27, 2018
Assignee: Interviewing.io, Inc.
Inventors: Andrew Tatanka Marsh, Steven Young Yi
-
Patent number: 9947341
Abstract: A voice signal may be adjusted to mask traits such as the gender of a speaker by separating source and filter components of a voice signal using cepstral analysis, adjusting the components based on pitch and formant parameters, and synthesizing a modified signal. Features are disclosed to support real-time voice masking in a computer network by limiting computational complexity and reducing delays in processing and transmission while maintaining signal quality.
Type: Grant
Filed: January 18, 2017
Date of Patent: April 17, 2018
Assignee: Interviewing.io, Inc.
Inventors: Andrew Tatanka Marsh, Steven Young Yi
-
Patent number: 9905219
Abstract: According to one embodiment, a speech synthesis apparatus is provided with generation, normalization, interpolation and synthesis units. The generation unit generates a first parameter using a prosodic control dictionary of a target speaker and one or more second parameters using a prosodic control dictionary of one or more standard speakers based on language information for an input text. The normalization unit normalizes the one or more second parameters based on a normalization parameter. The interpolation unit interpolates the first parameter and the one or more normalized second parameters based on weight information to generate a third parameter and the synthesis unit generates synthesized speech using the third parameter.
Type: Grant
Filed: August 16, 2013
Date of Patent: February 27, 2018
Assignee: Kabushiki Kaisha Toshiba
Inventors: Kentaro Tachibana, Takehiko Kagoshima, Masahiro Morita
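The normalize-then-interpolate flow described here can be sketched in a few lines. Everything concrete below is an assumption for illustration: the parameters are taken to be per-frame pitch values in Hz, the normalization parameter is taken to be an additive offset that aligns a standard speaker's mean pitch with the target speaker's, and the weights are taken to sum to 1. The abstract does not specify any of these.

```python
def normalize(params, offset):
    """Shift a standard speaker's parameters by an additive offset
    (assumed form of the 'normalization parameter')."""
    return [p + offset for p in params]


def interpolate(first, normalized_seconds, weights):
    """Weighted sum of the target contour and the normalized standard
    contours, frame by frame. weights[0] applies to `first`."""
    contours = [first] + normalized_seconds
    if len(weights) != len(contours):
        raise ValueError("need exactly one weight per contour")
    return [
        sum(w * c[i] for w, c in zip(weights, contours))
        for i in range(len(first))
    ]


# Hypothetical pitch contours (Hz) for the same input text.
target = [200.0, 210.0, 220.0]      # first parameter (target speaker)
standard = [100.0, 120.0, 110.0]    # second parameter (standard speaker)

# Assumed normalization: align the mean pitch of the two speakers.
offset = sum(target) / len(target) - sum(standard) / len(standard)
third = interpolate(target, [normalize(standard, offset)], [0.5, 0.5])
```

With equal weights the third parameter sits halfway between the target contour and the mean-aligned standard contour.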
-
Patent number: 9811881
Abstract: A method of enhancing an image includes increasing sampling rate of a first image to a target sampling rate to form an interpolated image. The method also includes processing a second image through a high pass filter to form a high pass features image, wherein the second image is at the target sampling rate. The method also includes extracting detail from the high pass features image relevant to the first image, merging the detail from the high pass features image with the interpolated image to form a prediction image at the target sampling rate, and outputting the prediction image.
Type: Grant
Filed: December 9, 2015
Date of Patent: November 7, 2017
Assignee: Goodrich Corporation
Inventors: Suhail Shabbir Saquib, Christopher Gittins
-
Patent number: 9786083
Abstract: A skin deformation system for use in computer animation is disclosed. The skin deformation system accesses the skeleton structure of a computer generated character, and accesses a user's identification of features of the skeleton structure that may affect a skin deformation. The system also accesses the user's identification of a weighting strategy. Using the identified weighting strategy and identified features of the skeleton structure, the skin deformation system determines the degree to which each feature identified by the user may influence the deformation of a skin of the computer generated character. The skin deformation system may incorporate secondary operations including bulge, slide, scale and twist into the deformation of a skin. Information relating to a deformed skin may be stored by the skin deformation system so that the information may be used to produce a visual image for a viewer.
Type: Grant
Filed: October 7, 2011
Date of Patent: October 10, 2017
Assignee: DreamWorks Animation L.L.C.
Inventors: Paul Carmen Dilorenzo, Matthew Christopher Gong, Arthur D. Gregory
-
Patent number: 9686594
Abstract: A system, method, and apparatus to allow an operator of a broadcast communication system, such as a cable television or satellite television service to provide some examples, to diagnose performance of this communication system remotely. The operator of a first communication device, such as a cable modem termination system (CMTS) to provide an example, may remotely diagnose performance problems, or potential performance problems, occurring at a second communication device, such as a cable modem (CM) to provide an example, or a group of second communication devices. For example, the operator of the first communication device may view a spectrum analysis of communication signals being routed to, processed by, and/or provided by the second communication device, or group of second communication devices, to diagnose the performance problems, or the potential performance problems, in real time.
Type: Grant
Filed: March 30, 2012
Date of Patent: June 20, 2017
Assignee: Avago Technologies General IP (Singapore) Pte. Ltd.
Inventors: Ramon Alejandro Gomez, Leonard Dauphinee, Donald G. McMullin, Harold Raymond Whitehead
-
Patent number: 9502029
Abstract: Described herein are systems and methods for context-aware speech processing. A speech context is determined based on context data associated with a user uttering speech. The speech context and the speech uttered in that speech context may be used to build acoustic models for that speech context. An acoustic model for use in speech processing may be selected based on the determined speech context. A language model for use in speech processing may also be selected based on the determined speech context. Using the acoustic and language models, the speech may be processed to recognize the speech from the user.
Type: Grant
Filed: June 25, 2012
Date of Patent: November 22, 2016
Assignee: Amazon Technologies, Inc.
Inventors: Matthew P. Bell, Yuzo Watanabe, Stephen M. Polansky
-
Patent number: 9495978
Abstract: A method of processing a sound signal is disclosed. The method of processing a sound signal includes receiving a sound signal from the outside of a device, converting the sound signal into a first frequency domain signal, determining whether or not the sound signal is a voice signal using the first frequency domain signal acquired through the conversion, converting the first frequency domain signal into a second frequency domain signal based on the determination, and recognizing the sound signal using the second frequency domain signal acquired through the conversion.
Type: Grant
Filed: December 4, 2015
Date of Patent: November 15, 2016
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Seok-hwan Jo, Do-hyung Kim, Jae-hyun Kim, Shi-hwa Lee
-
Patent number: 9373331
Abstract: An error concealment method and apparatus for an audio signal and a decoding method and apparatus for an audio signal using the error concealment method and apparatus. The error concealment method includes selecting one of an error concealment in a frequency domain and an error concealment in a time domain as an error concealment scheme for a current frame based on a predetermined criteria when an error occurs in the current frame, selecting one of a repetition scheme and an interpolation scheme in the frequency domain as the error concealment scheme for the current frame based on a predetermined criteria when the error concealment in the frequency domain is selected, and concealing the error of the current frame using the selected scheme.
Type: Grant
Filed: July 2, 2013
Date of Patent: June 21, 2016
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Eun-mi Oh, Ki-hyun Choo, Ho-sang Sung, Chang-yong Son, Jung-hoe Kim, Kang eun Lee
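The choice between repetition and interpolation in the frequency domain can be illustrated with a small sketch. The selection criterion used here (interpolate only when a good following frame is available) is an assumption made for the example; the abstract refers only to "a predetermined criteria". Frames are represented as plain lists of spectral magnitudes, and the function name is hypothetical.

```python
def conceal_frame(prev_spectrum, next_spectrum=None):
    """Estimate the spectrum of a lost frame.

    Repetition: copy the previous good frame.
    Interpolation: average the previous and next good frames, bin by bin.
    Assumed criterion: interpolate only when the next frame is available.
    """
    if next_spectrum is None:
        return list(prev_spectrum)                      # repetition scheme
    if len(prev_spectrum) != len(next_spectrum):
        raise ValueError("frames must have the same number of bins")
    return [(p + n) / 2.0                               # interpolation scheme
            for p, n in zip(prev_spectrum, next_spectrum)]


# Hypothetical magnitude spectra.
repeated = conceal_frame([1.0, 2.0, 3.0])
interpolated = conceal_frame([1.0, 2.0], [3.0, 6.0])
```

A real decoder would also weigh signal characteristics such as stationarity when selecting a scheme; this sketch only shows the two schemes side by side.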
-
Patent number: 9336789
Abstract: A method for determining an interpolation factor set by an electronic device is described. The method includes determining a value based on a current frame property and a previous frame property. The method also includes determining whether the value is outside of a range. The method further includes determining an interpolation factor set based on the value and a prediction mode indicator if the value is outside of the range. The method additionally includes synthesizing a speech signal.
Type: Grant
Filed: August 30, 2013
Date of Patent: May 10, 2016
Assignee: QUALCOMM Incorporated
Inventors: Vivek Rajendran, Subasingha Shaminda Subasingha, Venkatesh Krishnan
-
Patent number: 9230537
Abstract: A voice signal is synthesized using a plurality of phonetic piece data each indicating a phonetic piece containing at least two phoneme sections corresponding to different phonemes. In the apparatus, a phonetic piece adjustor forms a target section from first and second phonetic pieces so as to connect the first and second phonetic pieces to each other such that the target section includes a rear phoneme section of the first piece and a front phoneme section of the second piece, and expands the target section by a target time length to form an adjustment section such that a central part is expanded at an expansion rate higher than that of front and rear parts of the target section, to thereby create synthesized phonetic piece data having the target time length. A voice synthesizer creates a voice signal from the synthesized phonetic piece data.
Type: Grant
Filed: May 31, 2012
Date of Patent: January 5, 2016
Assignee: Yamaha Corporation
Inventor: Keijiro Saino
-
Patent number: 9105272
Abstract: Methods, apparatus and computer program products implement embodiments of the present invention that include receiving a time domain voice signal, and extracting a single pitch cycle from the received signal. The extracted single pitch cycle is transformed to a frequency domain, and the misclassified roots of the frequency domain are identified and corrected. Using the corrected roots, an indication of a maximum phase of the frequency domain is generated.
Type: Grant
Filed: June 4, 2012
Date of Patent: August 11, 2015
Assignees: The Lithuanian University of Health Sciences, INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Aharon Satt, Zvi Kons, Ron Hoory, Virgilijus Ulozas
-
Patent number: 9058807
Abstract: According to one embodiment, a first storage unit stores n band noise signals obtained by applying n band-pass filters to a noise signal. A second storage unit stores n band pulse signals. A parameter input unit inputs a fundamental frequency, n band noise intensities, and a spectrum parameter. An extraction unit extracts for each pitch mark the n band noise signals while shifting. An amplitude control unit changes amplitudes of the extracted band noise signals and band pulse signals in accordance with the band noise intensities. A generation unit generates a mixed sound source signal by adding the n band noise signals and the n band pulse signals. A generation unit generates the mixed sound source signal generated based on the pitch mark. A vocal tract filter unit generates a speech waveform by applying a vocal tract filter using the spectrum parameter to the generated mixed sound source signal.
Type: Grant
Filed: March 18, 2011
Date of Patent: June 16, 2015
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
-
Patent number: 9020812
Abstract: Disclosed is an audio signal processing method comprising the steps of: receiving an audio signal containing current frame data; generating a first temporary output signal for the current frame when an error occurs in the current frame data, by carrying out frame error concealment with respect to the current frame data a random codebook; generating a parameter by carrying out one or more of short-term prediction, long-term prediction and a fixed codebook search based on the first temporary output signal; and memory updating the parameter for the next frame; wherein the parameter comprises one or more of pitch gain, pitch delay, fixed codebook gain and a fixed codebook.
Type: Grant
Filed: November 24, 2010
Date of Patent: April 28, 2015
Assignees: LG Electronics Inc., Industry-Academic Cooperation Foundation, Yonsei University
Inventors: Hye Jeong Jeon, Dae Hwan Kim, Hong Goo Kang, Min Ki Lee, Byung Suk Lee, Gyu Hyeok Jeong
-
Publication number: 20150112688
Abstract: A system and method may be configured to reconstruct an audio signal from transformed audio information. The audio signal may be resynthesized based on individual harmonics and corresponding pitches determined from the transformed audio information. Noise may be subtracted from the transformed audio information by interpolating across peak points and across trough points of harmonic pitch paths through the transformed audio information, and subtracting values associated with the trough point interpolations from values associated with the peak point interpolations. Noise between harmonics of the sound may be suppressed in the transformed audio information by centering functions at individual harmonics in the transformed audio information, the functions serving to suppress noise between the harmonics.
Type: Application
Filed: December 22, 2014
Publication date: April 23, 2015
Applicant: THE INTELLISIS CORPORATION
Inventors: David C. BRADLEY, Daniel S. GOLDIN, Robert N. HILTON, Nicholas K. FISHER, Rodney GATEAU
-
Patent number: 8996378
Abstract: In a voice synthesis apparatus, a phoneme piece interpolator acquires first phoneme piece data corresponding to a first value of sound characteristic, and second phoneme piece data corresponding to a second value of the sound characteristic. The first and second phoneme piece data indicate a spectrum of each frame of a phoneme piece. The phoneme piece interpolator interpolates between each frame of the first phoneme piece data and each frame of the second phoneme piece data so as to create phoneme piece data of the phoneme piece corresponding to a target value of the sound characteristic which is different from either of the first and second values of the sound characteristic. A voice synthesizer generates a voice signal having the target value of the sound characteristic based on the created phoneme piece data.
Type: Grant
Filed: May 24, 2012
Date of Patent: March 31, 2015
Assignee: Yamaha Corporation
Inventors: Jordi Bonada, Merlijn Blaauw, Makoto Tachibana
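Frame-by-frame interpolation toward a target value of a sound characteristic might look like the sketch below. It assumes plain linear interpolation of spectral magnitudes, an equal number of frames and bins in both pieces, and pitch in Hz as the sound characteristic; the abstract does not state the interpolation formula, so this is one simple possibility rather than the patented method.

```python
def interpolate_piece(frames_a, value_a, frames_b, value_b, target):
    """Linearly interpolate two phoneme pieces, frame by frame.

    frames_a / frames_b: lists of frames, each a list of spectral
    magnitudes, recorded at characteristic values value_a / value_b.
    Returns frames estimated for the characteristic value `target`.
    """
    if value_a == value_b:
        raise ValueError("the two characteristic values must differ")
    if len(frames_a) != len(frames_b):
        raise ValueError("pieces must have the same number of frames")
    alpha = (target - value_a) / (value_b - value_a)
    return [
        [(1.0 - alpha) * a + alpha * b for a, b in zip(fa, fb)]
        for fa, fb in zip(frames_a, frames_b)
    ]


# Hypothetical two-frame, two-bin spectra recorded at 220 Hz and 440 Hz.
piece_low = [[1.0, 0.5], [0.75, 0.25]]
piece_high = [[0.5, 0.25], [0.25, 0.125]]
piece_mid = interpolate_piece(piece_low, 220.0, piece_high, 440.0, 330.0)
```

Here the target of 330 Hz lies halfway between the two recorded values, so each output bin is the average of the corresponding input bins.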
-
Patent number: 8990094
Abstract: An electronic device for coding a transient frame is described. The electronic device includes a processor and executable instructions stored in memory that is in electronic communication with the processor. The electronic device obtains a current transient frame. The electronic device also obtains a residual signal based on the current transient frame. Additionally, the electronic device determines a set of peak locations based on the residual signal. The electronic device further determines whether to use a first coding mode or a second coding mode for coding the current transient frame based on at least the set of peak locations. The electronic device also synthesizes an excitation based on the first coding mode if the first coding mode is determined. The electronic device also synthesizes an excitation based on the second coding mode if the second coding mode is determined.
Type: Grant
Filed: September 8, 2011
Date of Patent: March 24, 2015
Assignee: QUALCOMM Incorporated
Inventors: Venkatesh Krishnan, Ananthapadmanabhan Arasanipalai Kandhadai
-
Patent number: 8868423
Abstract: Systems and methods for controlling access to resources using spoken Completely Automatic Public Turing Tests To Tell Humans And Computers Apart (CAPTCHA) tests are disclosed. In these systems and methods, entities seeking access to resources are required to produce an input utterance that contains at least some audio. That utterance is compared with voice reference data for human and machine entities, and a determination is made as to whether the entity requesting access is a human or a machine. Access is then permitted or refused based on that determination.
Type: Grant
Filed: July 11, 2013
Date of Patent: October 21, 2014
Assignee: John Nicholas and Kristin Gross Trust
Inventor: John Nicholas Gross
-
Patent number: 8862461Abstract: In one embodiment, a method executed by at least one processor includes receiving text submitted by a user. The method also includes determining a text score for the received text by comparing a first set of phrases included in the received text to a second set of phrases. The second set of phrases includes phrases from stored text. The stored text includes stored text known to be genuine and stored text known to be fraudulent. The method also includes determining that the received text is fraudulent based on the text score.Type: GrantFiled: November 30, 2011Date of Patent: October 14, 2014Assignee: Match.com, LPInventors: Aaron J. de Zeeuw, Clark T. Rothrock, Jason L. Alexander
-
Patent number: 8825476Abstract: Provided are a method and apparatus for encoding and decoding a high frequency signal by using a low frequency signal. The high frequency signal can be encoded by extracting a coefficient by linear predicting a high frequency signal, and encoding the coefficient, generating a signal by using the extracted coefficient and a low frequency signal, and encoding the high frequency signal by calculating a ratio between the high frequency signal and an energy value of the generated signal. Also, the high frequency signal can be decoded by decoding a coefficient, which is extracted by linear predicting a high frequency signal, and a low frequency signal, and generating a signal by using the decoded coefficient and the decoded low frequency signal, and adjusting the generated signal by decoding a ratio between the generated signal and an energy value of the high frequency signal.Type: GrantFiled: April 8, 2013Date of Patent: September 2, 2014Assignee: Samsung Electronics Co., Ltd.Inventors: Ki-hyun Choo, Lei Miao, Eun-mi Oh
-
Patent number: 8812316Abstract: A speech control system that can recognize a spoken command and associated words (such as “call mom at home”) and can cause a selected application (such as a telephone dialer) to execute the command to cause a data processing system, such as a smartphone, to perform an operation based on the command (such as look up mom's phone number at home and dial it to establish a telephone call). The speech control system can use a set of interpreters to repair recognized text from a speech recognition system, and results from the set can be merged into a final repaired transcription which is provided to the selected application.Type: GrantFiled: June 5, 2014Date of Patent: August 19, 2014Assignee: Apple Inc.Inventor: Lik Harry Chen
-
Patent number: 8805695Abstract: A bandwidth expansion method and apparatus are disclosed, where the method includes: estimating a bandwidth of at least one decoded frame of a whole-band signal, so as to obtain an estimated bandwidth, where the estimated bandwidth corresponds to a whole-band signal that a decoded lower-band signal needs to be extended into; performing first predictive decoding on a part of the lower-band signal in a band above an effective bandwidth of the lower-band signal and below the estimated bandwidth, so as to obtain the part of the lower-band signal above the effective bandwidth of the lower-band signal and below the estimated bandwidth; and performing second predictive decoding on a part of the lower-band signal in a band above the estimated bandwidth, so as to obtain the part of the lower-band signal above the estimated bandwidth.Type: GrantFiled: July 22, 2013Date of Patent: August 12, 2014Assignee: Huawei Technologies Co., Ltd.Inventors: Zexin Liu, Lei Miao
-
Patent number: 8762156Abstract: A speech control system that can recognize a spoken command and associated words (such as “call mom at home”) and can cause a selected application (such as a telephone dialer) to execute the command to cause a data processing system, such as a smartphone, to perform an operation based on the command (such as look up mom's phone number at home and dial it to establish a telephone call). The speech control system can use a set of interpreters to repair recognized text from a speech recognition system, and results from the set can be merged into a final repaired transcription which is provided to the selected application.Type: GrantFiled: September 28, 2011Date of Patent: June 24, 2014Assignee: Apple Inc.Inventor: Lik Harry Chen
-
Patent number: 8731913Abstract: A method for overlap-adding signals useful for performing frame loss concealment (FLC) in an audio decoder as well as in other applications. The method uses a dynamic mix of windows to overlap two signals whose normalized cross-correlation may vary from zero to one. If the overlapping signals are decomposed into a correlated component and an uncorrelated component, they are overlap-added separately using the appropriate window, and then added together. If the overlapping signals are not decomposed, a weighted mix of windows is used. The mix is determined by a measure estimating the amount of cross-correlation between overlapping signals, or the relative amount of correlated to uncorrelated signals.Type: GrantFiled: April 13, 2007Date of Patent: May 20, 2014Assignee: Broadcom CorporationInventors: Robert W. Zopf, Juin-Hwey Chen
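The weighted mix of windows mentioned in this abstract (the case where the signals are not decomposed) can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the abstract does not name the windows, so the choice of a linear (amplitude-complementary) pair for correlated signals and a sine/cosine (power-complementary) pair for uncorrelated signals is an assumption, as are the function names.

```python
import math


def normalized_cross_correlation(x, y):
    """Normalized cross-correlation at lag zero, clamped to [0, 1]."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if den == 0.0:
        return 0.0
    return max(0.0, min(1.0, num / den))


def mixed_overlap_add(fade_out_sig, fade_in_sig):
    """Overlap-add two equal-length segments with a correlation-weighted window mix."""
    n = len(fade_out_sig)
    c = normalized_cross_correlation(fade_out_sig, fade_in_sig)
    out = []
    for i in range(n):
        t = (i + 0.5) / n
        # Assumed window for correlated signals: the two gains sum to 1.
        lin_in, lin_out = t, 1.0 - t
        # Assumed window for uncorrelated signals: the squared gains sum to 1.
        pow_in = math.sin(0.5 * math.pi * t)
        pow_out = math.cos(0.5 * math.pi * t)
        # Mix the two window pairs according to the measured correlation.
        w_in = c * lin_in + (1.0 - c) * pow_in
        w_out = c * lin_out + (1.0 - c) * pow_out
        out.append(w_out * fade_out_sig[i] + w_in * fade_in_sig[i])
    return out


same = mixed_overlap_add([1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0])
```

For two identical segments the correlation is 1, only the linear windows are used, and the output keeps the input amplitude. For uncorrelated segments the sine/cosine pair keeps the output power roughly constant instead.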
-
Patent number: 8706493Abstract: In one embodiment of a controllable prosody re-estimation system, a TTS/STS engine consists of a prosody prediction/estimation module, a prosody re-estimation module and a speech synthesis module. The prosody prediction/estimation module generates predicted or estimated prosody information. And then the prosody re-estimation module re-estimates the predicted or estimated prosody information and produces new prosody information, according to a set of controllable parameters provided by a controllable prosody parameter interface. The new prosody information is provided to the speech synthesis module to produce a synthesized speech.Type: GrantFiled: July 11, 2011Date of Patent: April 22, 2014Assignee: Industrial Technology Research InstituteInventors: Cheng-Yuan Lin, Chien-Hung Huang, Chih-Chung Kuo
-
Patent number: 8655663Abstract: An audio signal interpolation device is presented, including an input unit for receiving an input audio signal, a phase splitting unit for splitting the input audio signal, a high range interpolation unit for interpolating a high range component into the signal, a phase combining unit for combining an in-phase component signal with a differential phase component, a high-pass filter for high-pass filtering the audio signal output by the phase combining unit, a delay unit for producing a delayed audio signal, and an addition processing unit for adding the delayed audio signal to the audio signal output from the high-pass filter.Type: GrantFiled: September 29, 2008Date of Patent: February 18, 2014Assignee: D&M Holdings, Inc.Inventors: Masaki Matsuoka, Shigeki Namiki
-
Patent number: 8626809Abstract: A method and an apparatus for digital up-down conversion using an Infinite Impulse Response (IIR) filter are provided. The method for digital up-down conversion for frequency conversion in a mobile communication system using plural frequency converts, includes IIR-filtering, by a magnitude response IIR filter having the same magnitude response as in Finite Impulse Response (FIR) filtering, an input signal and a stable filter coefficient calculated according to a Levinson polynomial; and receiving, by the magnitude response IIR filter, the IIR filtered signal, and performing IIR filtering by a phase compensation IIR filter having a filter coefficient compensating for a non-linear phase to a linear phase.Type: GrantFiled: February 24, 2010Date of Patent: January 7, 2014Assignees: Samsung Electronics Co., Ltd, Soongsil UniversityInventors: Jun-Seok Yang, Won-Cheol Lee, Hyung-Min Jang
-
Patent number: 8620646Abstract: A system and method may be configured to analyze audio information derived from an audio signal. The system and method may track sound pitch across the audio signal. The tracking of pitch across the audio signal may take into account change in pitch by determining at individual time sample windows in the signal duration an estimated pitch and a representation of harmonic envelope at the estimated pitch. The estimated pitch and the representation of harmonic envelope may then be implemented to determine an estimated pitch for another time sample window in the signal duration with an enhanced accuracy and/or precision.Type: GrantFiled: August 8, 2011Date of Patent: December 31, 2013Assignee: The Intellisis CorporationInventors: David C. Bradley, Rodney Gateau, Daniel S. Goldin, Robert N. Hilton, Nicholas K. Fisher
-
Patent number: 8589166Abstract: Systems and methods are described for performing packet loss concealment (PLC) to mitigate the effect of one or more lost frames within a series of frames that represent a speech signal. In accordance with the exemplary systems and methods, PLC is performed by searching a codebook of speech-related parameter profiles to identify content that is being spoken and by selecting a profile associated with the identified content for use in predicting or estimating speech-related parameter information associated with one or more lost frames of a speech signal. The predicted/estimated speech-related parameter information is then used to synthesize one or more frames to replace the lost frame(s) of the speech signal.Type: GrantFiled: September 21, 2010Date of Patent: November 19, 2013Assignee: Broadcom CorporationInventor: Robert W. Zopf
-
Publication number: 20130282378Abstract: The invention provides a system, method, and business model for an information system and service having business self-promotion, promotion and promotion tracking, loyalty or frequent participant rewards and redemption, audio coupon, ratings, and other features. A business or organization in which consumers call into a service using ordinary telephone, PC, PDA, or other information appliance, and make requests in plain speech for information on goods and/or services, and the service provides responses to the request in plain speech in real-time.Type: ApplicationFiled: August 1, 2005Publication date: October 24, 2013Inventors: Ahmet Alpdemir, Arthur James
-
Publication number: 20130246068Abstract: Disclosed are a method and apparatus for decoding an audio signal using an adaptive codebook update. The method for decoding an audio signal includes: receiving N+1-th normal frame data, that is, a normal frame transmitted after an N-th frame that was lost; determining, by using the N-th frame and the N+1-th frame, whether to update an adaptive codebook of a final subframe of the N-th frame; updating the adaptive codebook of the final subframe of the N-th frame by using the pitch index of the N+1-th frame; and synthesizing an audio signal by using the N+1-th frame.Type: ApplicationFiled: September 28, 2011Publication date: September 19, 2013Applicant: Electronics and Telecommunications Research InstituteInventor: Mi-Suk Lee
-
Patent number: 8494854Abstract: An audible based electronic challenge system is used to control access to a computing resource by using a test to identify an origin of a voice. The test is based on analyzing a spoken utterance using optimized challenge items selected for their discrimination capability to determine if it was articulated by an unauthorized human or a text to speech (TTS) system.Type: GrantFiled: June 15, 2009Date of Patent: July 23, 2013Assignee: John Nicholas and Kristin GrossInventor: John Nicholas Gross
-
Patent number: 8489399Abstract: An audible based electronic challenge system is used to control access to a computing resource by using a test to identify an origin of a voice. The test is based on analyzing a spoken utterance to determine if it was articulated by an unauthorized human or a text to speech (TTS) system.Type: GrantFiled: June 15, 2009Date of Patent: July 16, 2013Assignee: John Nicholas and Kristin Gross TrustInventor: John Nicholas Gross
-
Patent number: 8473301Abstract: A method for decoding an audio signal includes: obtaining a lower-band signal component of an audio signal corresponding to a received code stream when the audio signal switches from a first bandwidth to a second bandwidth which is narrower than the first bandwidth; extending the lower-band signal component to obtain higher-band information; performing a time-varying fadeout process on the higher-band information to obtain a processed higher-band signal component; and synthesizing the processed higher-band signal component and the obtained lower-band signal component. With the methods provided in the embodiments of the invention, when an audio signal has a switch from broadband to narrowband, a series of processes such as bandwidth detection, artificial band extension, time-varying fadeout process, and bandwidth synthesis, may be performed to make the switch to have a smooth transition from a broadband signal to a narrowband signal so that a comfortable listening experience may be achieved.Type: GrantFiled: May 1, 2010Date of Patent: June 25, 2013Assignee: Huawei Technologies Co., Ltd.Inventors: Zhe Chen, Fuliang Yin, Xiaoyu Zhang, Jinliang Dai, Libin Zhang
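The time-varying fadeout and synthesis steps in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not specify the shape of the fadeout, so the per-frame linear gain ramp, the additive synthesis, and the names `fadeout_gains`, `apply_fadeout`, and `synthesize` are all assumptions.

```python
def fadeout_gains(num_frames):
    """Per-frame gains that fall linearly to 0 over num_frames frames (assumed shape)."""
    return [1.0 - (k + 1) / num_frames for k in range(num_frames)]


def apply_fadeout(high_band_frames):
    """Scale each extended higher-band frame by its time-varying gain."""
    gains = fadeout_gains(len(high_band_frames))
    return [
        [g * sample for sample in frame]
        for g, frame in zip(gains, high_band_frames)
    ]


def synthesize(low_band_frames, high_band_frames):
    """Combine lower-band and faded higher-band frames sample by sample."""
    return [
        [lo + hi for lo, hi in zip(low_frame, high_frame)]
        for low_frame, high_frame in zip(low_band_frames, high_band_frames)
    ]


# Four frames after the switch from the wider to the narrower bandwidth.
low = [[1.0, 1.0]] * 4
high = [[0.4, 0.4]] * 4
output = synthesize(low, apply_fadeout(high))
```

The higher-band contribution shrinks frame by frame until only the lower-band component remains, which is the smooth transition the abstract describes.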