Voiced Or Unvoiced Patents (Class 704/208)
  • Patent number: 8924220
    Abstract: In a multiband compressor 100, a level calculation unit 121 calculates the input signal level for each band, a gain calculation unit 122 calculates a gain value from the calculated signal level, and a gain limitation unit 130 limits a gain value by comparison with a gain value of the other band in a compressor for each band. This configuration provides a multiband compressor that achieves a high-level balance between sound quality and the effect of enhancing the sound level.
    Type: Grant
    Filed: September 7, 2010
    Date of Patent: December 30, 2014
    Assignee: Lenovo Innovations Limited (Hong Kong)
    Inventor: Satoshi Hosokawa
  • Patent number: 8918324
    Abstract: A method for coding and decoding an audio signal or speech signal and an apparatus adopting the method are provided.
    Type: Grant
    Filed: January 27, 2010
    Date of Patent: December 23, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ki Hyun Choo, Jung-Hoe Kim, Eun Mi Oh, Ho Sang Sung
  • Patent number: 8909519
    Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
    Type: Grant
    Filed: March 10, 2014
    Date of Patent: December 9, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Bing Chen, James H. James
  • Patent number: 8909518
    Abstract: A warping factor estimation system comprises a label information generation unit that outputs voice/non-voice label information, a warp model storage unit in which a probability model representing voice and non-voice occurrence probabilities is stored, and a warp estimation unit that calculates a warping factor in the frequency axis direction using the probability model representing voice and non-voice occurrence probabilities, voice and non-voice labels, and a cepstrum.
    Type: Grant
    Filed: September 22, 2008
    Date of Patent: December 9, 2014
    Assignee: NEC Corporation
    Inventor: Tadashi Emori
  • Patent number: 8903721
    Abstract: A mute setting is automatically set based on a speech detection result for acoustic signals received by a device. A device detects the speech based on a variety of cues from acoustic signals received using one or more microphones. If speech is detected within one or more frames, a mute setting may be automatically turned off. If speech is not detected, a mute setting may be automatically turned on. A mute setting may remain on as long as speech is not detected within the received acoustic signals. A varying delay may be implemented to help avoid false detections. The delay may be utilized during a mute-on state, and gradually removed during a transition from a mute-on state to a mute-off state.
    Type: Grant
    Filed: October 20, 2010
    Date of Patent: December 2, 2014
    Assignee: Audience, Inc.
    Inventor: Matthew Cowan
  • Patent number: 8898058
    Abstract: Systems, methods, apparatus, and machine-readable media for voice activity detection in a single-channel or multichannel audio signal are disclosed.
    Type: Grant
    Filed: October 24, 2011
    Date of Patent: November 25, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Jongwon Shin, Erik Visser, Ian Ernan Liu
  • Publication number: 20140343934
    Abstract: A method, apparatus, and speech synthesis system are disclosed for classifying unvoiced and voiced sound. The method includes: setting an unvoiced and voiced sound classification question set; using speech training data and the unvoiced and voiced sound classification question set for training a sound classification model of a binary decision tree structure, where the binary decision tree structure includes non-leaf nodes and leaf nodes, the non-leaf nodes represent questions in the unvoiced and voiced sound classification question set, and the leaf nodes represent unvoiced and voiced sound classification results; and receiving speech test data, and using the trained sound classification model to decide whether the speech test data is unvoiced sound or voiced sound.
    Type: Application
    Filed: February 21, 2014
    Publication date: November 20, 2014
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventor: Zongyao TANG
  • Patent number: 8886528
    Abstract: A highlight section including an exciting scene is appropriately extracted with a smaller amount of processing. A reflection coefficient calculating unit (12) calculates a parameter (reflection coefficient) representing a slope of spectrum distribution of the input audio signal for each frame. A reflection coefficient comparison unit (13) calculates an amount of change in the reflection coefficients between adjacent frames, and compares the calculation result with a predetermined threshold. An audio signal classifying unit (14) classifies the input audio signal into a background noise section and a speech section based on the comparison result. A background noise level calculating unit (15) calculates a level of a background noise in the background noise section based on signal energy in the background noise section. An event detecting unit (16) detects an event occurring point from a sharp increase in the background noise level.
    Type: Grant
    Filed: June 2, 2010
    Date of Patent: November 11, 2014
    Assignee: Panasonic Corporation
    Inventor: Naoya Tanaka
  • Patent number: 8879762
    Abstract: A method and apparatus to evaluate a quality of an audio signal, in which the number of effective channels is determined for each of a reference signal of a current frame and a test signal indicative of the reference signal that has passed through an audio codec, and an audio quality evaluation score of the current frame is calculated by evaluating an audio quality of the current frame based on the determined number of effective channels for each of the reference signal and the test signal by means of a predetermined evaluator.
    Type: Grant
    Filed: January 28, 2010
    Date of Patent: November 4, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: In-Yong Choi
  • Patent number: 8868432
    Abstract: A method for decoding an audio signal having a bandwidth that extends beyond a bandwidth of a CELP excitation signal in an audio decoder including a CELP-based decoder element. The method includes obtaining a second excitation signal having an audio bandwidth extending beyond the audio bandwidth of the CELP excitation signal, obtaining a set of signals by filtering the second excitation signal with a set of bandpass filters, scaling the set of signals using a set of energy-based parameters, and obtaining a composite output signal by combining the scaled set of signals with a signal based on the audio signal decoded by the CELP-based decoder element.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: October 21, 2014
    Assignee: Motorola Mobility LLC
    Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
  • Patent number: 8861746
    Abstract: A sound processing apparatus includes a target sound emphasizing unit configured to acquire a sound frequency component by emphasizing target sound in input sound in which the target sound and noise are included, a target sound suppressing unit configured to acquire a noise frequency component by suppressing the target sound in the input sound, a gain computing unit configured to compute a gain value to be multiplied by the sound frequency component using a gain function whose gain value and slope are both less than predetermined values when an energy ratio of the sound frequency component to the noise frequency component is less than or equal to a predetermined value, and a gain multiplier unit configured to multiply the sound frequency component by the gain value computed by the gain computing unit.
    Type: Grant
    Filed: March 7, 2011
    Date of Patent: October 14, 2014
    Assignee: Sony Corporation
    Inventors: Toshiyuki Sekiya, Keiichi Osako, Mototsugu Abe
  • Publication number: 20140297272
    Abstract: The present invention generally relates to intelligent voice communication systems. Specifically, this invention relates to systems and methods for providing intelligent interactive voice communication services to users of a telephony means. Preferred embodiments of the invention are directed to providing interactive voice communication services in the form of intelligent and interactive automated prank calling services.
    Type: Application
    Filed: April 2, 2013
    Publication date: October 2, 2014
    Inventor: Fahim Saleh
  • Patent number: 8831942
    Abstract: A method is provided for identifying a gender of a speaker. The method steps include obtaining speech data of the speaker, extracting vowel-like speech frames from the speech data, analyzing the vowel-like speech frames to generate a feature vector having pitch values corresponding to the vowel-like frames, analyzing the pitch values to generate a most frequent pitch value, determining, in response to the most frequent pitch value being between a first pre-determined threshold and a second pre-determined threshold, an output of a male Gaussian Mixture Model (GMM) and an output of a female GMM using the pitch values as inputs to the male GMM and the female GMM, and identifying the gender of the speaker by comparing the output of the male GMM and the output of the female GMM based on a pre-determined criterion.
    Type: Grant
    Filed: March 19, 2010
    Date of Patent: September 9, 2014
    Assignee: Narus, Inc.
    Inventor: Antonio Nucci
  • Patent number: 8825477
    Abstract: In one configuration, erasure of a significant frame of a sustained voiced segment is detected. An adaptive codebook gain value for the erased frame is calculated based on the preceding frame. If the calculated value is less than (alternatively, not greater than) a threshold value, a higher adaptive codebook gain value is used for the erased frame. The higher value may be derived from the calculated value or selected from among one or more predefined values.
    Type: Grant
    Filed: December 13, 2010
    Date of Patent: September 2, 2014
    Assignee: Qualcomm Incorporated
    Inventors: Venkatesh Krishnan, Ananthapadmanabhan Arasanipatai Kandhadai
  • Patent number: 8825478
    Abstract: Audio content is converted to text using speech recognition software. The text is then associated with a distinct voice or a generic placeholder label if no distinction can be made. From the text and voice information, a word cloud is generated based on key words and key speakers. A visualization of the cloud displays as it is being created. Words grow in size in relation to their dominance. When it is determined that the predominant words or speakers have changed, the word cloud is complete. That word cloud continues to be displayed statically and a new word cloud display begins based upon a new set of predominant words or a new predominant speaker or set of speakers. This process may continue until the meeting is concluded. At the end of the meeting, the completed visualization may be saved to a storage device, sent to selected individuals, removed, or any combination of the preceding.
    Type: Grant
    Filed: January 10, 2011
    Date of Patent: September 2, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Susan Marie Cox, Janani Janakiraman, Fang Lu, Loulwa F Salem
  • Patent number: 8805685
    Abstract: Disclosed herein are systems, methods, and tangible computer-readable media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrates little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
    Type: Grant
    Filed: August 5, 2013
    Date of Patent: August 12, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst J. Schroeter
  • Patent number: 8805694
    Abstract: A method and an apparatus for encoding and decoding audio signals using adaptive sinusoidal coding are provided. The audio signal encoding method includes the steps of dividing a synthesized audio signal into a plurality of sub-bands, calculating the energy of each sub-band, selecting a predetermined number of sub-bands having a relatively large amount of energy from the sub-bands, and performing sinusoidal coding with regard to the selected sub-bands. Application of sinusoidal coding based on consideration of the amount of energy of each sub-band of the synthesized signal improves the quality of the synthesized signal more efficiently.
    Type: Grant
    Filed: February 16, 2010
    Date of Patent: August 12, 2014
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Mi-Suk Lee, Hyun-Joo Bae, Byung-Sun Lee
  • Publication number: 20140222421
    Abstract: A speech-synthesizing device includes a hierarchical prosodic module, a prosody-analyzing device, and a prosody-synthesizing unit. The hierarchical prosodic module generates at least a first hierarchical prosodic model. The prosody-analyzing device receives a low-level linguistic feature, a high-level linguistic feature and a first prosodic feature, and generates at least a prosodic tag based on the low-level linguistic feature, the high-level linguistic feature, the first prosodic feature and the first hierarchical prosodic model. The prosody-synthesizing unit synthesizes a second prosodic feature based on the hierarchical prosodic module, the low-level linguistic feature and the prosodic tag.
    Type: Application
    Filed: January 30, 2014
    Publication date: August 7, 2014
    Applicant: National Chiao Tung University
    Inventors: Sin-Horng Chen, Yih-Ru Wang, Chen-Yu Chiang, Chiao-Hua Hsieh
  • Patent number: 8798991
    Abstract: A non-speech section detecting device generating a plurality of frames having a given time length on the basis of sound data obtained by sampling sound, and detecting a non-speech section having a frame not containing voice data based on speech uttered by a person, the device including: a calculating part calculating a bias of a spectrum obtained by converting sound data of each frame into components on a frequency axis; a judging part judging whether the bias is greater than or equal to a given threshold or alternatively smaller than or equal to a given threshold; a counting part counting the number of consecutive frames judged as having a bias greater than or equal to the threshold or alternatively smaller than or equal to the threshold; and a count judging part judging whether the obtained number of consecutive frames is greater than or equal to a given value.
    Type: Grant
    Filed: November 13, 2012
    Date of Patent: August 5, 2014
    Assignee: Fujitsu Limited
    Inventors: Nobuyuki Washio, Shoji Hayakawa
  • Patent number: 8793124
    Abstract: A scheme to judge emphasized speech portions, wherein the judgment is executed by statistical processing in terms of a set of speech parameters including a fundamental frequency, power and a temporal variation of a dynamic measure and/or their derivatives. The emphasized speech portions are used as clues to summarize audio content or video content with speech.
    Type: Grant
    Filed: April 5, 2006
    Date of Patent: July 29, 2014
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Kota Hidaka, Shinya Nakajima, Osamu Mizuno, Hidetaka Kuwano, Haruhiko Kojima
  • Patent number: 8792777
    Abstract: The present invention is directed to system(s), method(s), and apparatus for accurate fast forward rate when performing trick play with variable distance between frames. In one embodiment, there is presented a circuit for providing a fast forward video sequence. The circuit comprises a system time clock for providing a time reference, said time reference incremented at a predetermined fast forward rate; a comparator for comparing the time reference with timing information associated with a picture; and a controller for determining whether to display the picture based at least in part on the comparison between the timing information and the time reference.
    Type: Grant
    Filed: January 10, 2007
    Date of Patent: July 29, 2014
    Assignee: Broadcom Corporation
    Inventor: Tim Ross
  • Patent number: 8775168
    Abstract: A Yule-Walker based, low-complexity voice activity detector (VAD) is disclosed. An input signal is typically noisy speech (i.e., corrupted with, for example, babble noise). In one embodiment, a first initialization stage of the VAD computes an occurrence of a silent period within the input signal and the AR parameters. The VAD could accordingly compute a tentative adaptive threshold and output hypothesis H1 (which means speech is present) during this stage. During the second initialization stage, the VAD generally builds a database of associated values and computes the adaptive threshold accordingly. The second initialization stage could also output tentative VAD decisions based on the tentative threshold computed in the first initialization stage. Finally, the VAD periodically retrains or updates AR parameters, threshold values and/or the database and outputs VAD decisions accordingly.
    Type: Grant
    Filed: August 3, 2007
    Date of Patent: July 8, 2014
    Assignee: STMicroelectronics Asia Pacific PTE, Ltd.
    Inventors: Karthik Muralidhar, Anoop Kumar Krishna
  • Publication number: 20140180683
    Abstract: Systems and methods for adjusting pitch of an audio signal include detecting input notes in the audio signal, mapping the input notes to corresponding output notes, each output note having an associated upper note boundary and lower note boundary, and modifying at least one of the upper note boundary and the lower note boundary of at least one output note in response to previously received input notes. Pitch of the input notes may be shifted to match an associated pitch of corresponding output notes. Delay of the pitch shifting process may be dynamically adjusted based on detected stability of the input notes.
    Type: Application
    Filed: December 21, 2012
    Publication date: June 26, 2014
    Applicant: HARMAN INTERNATIONAL INDUSTRIES, INC.
    Inventors: Peter R. Lupini, Glen A. Rutledge, Norm Campbell
  • Patent number: 8762139
    Abstract: A noise suppression device includes: a power spectrum calculator converting an input signal of time domain into power spectra of frequency domain; a voice/noise determination unit determining whether the power spectra indicate voice or noise; a noise spectrum estimation unit estimating noise spectra of the power spectra; a period component estimation unit analyzing a harmonic structure constituting the power spectra and estimating periodical information about the power spectra; a weighting coefficient calculator calculating a weighting coefficient for weighting the power spectra; a suppression coefficient calculator calculating a suppression coefficient for suppressing noise included in the power spectra; a spectrum suppression unit suppressing amplitude of the power spectra in accordance with the suppression coefficient; and an inverse Fourier transformer converting the power spectra output by the spectrum suppression unit into a signal of time domain to generate a noise-suppressed signal.
    Type: Grant
    Filed: September 21, 2010
    Date of Patent: June 24, 2014
    Assignee: Mitsubishi Electric Corporation
    Inventors: Satoru Furuta, Hirohisa Tasaki
  • Patent number: 8762158
    Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.
    Type: Grant
    Filed: August 5, 2011
    Date of Patent: June 24, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyun-wook Kim, Han-gil Moon, Sang-hoon Lee
  • Patent number: 8751221
    Abstract: A communication apparatus for adjusting a received voice signal in accordance with an ambient noise, the communication apparatus including: a microphone for receiving an ambient noise and input voice and outputting a voice input signal corresponding to a level of the input voice and the ambient noise; a receiver for receiving the voice signal; a processor for extracting a voice component originated by a sender and an ambient noise component originated by the ambient noise, determining the ratio between the voice component and the ambient noise component, and adjusting the amplitude of the received voice signal in accordance with the ratio; and a speaker for outputting a reception voice corresponding to the adjusted reception voice signal.
    Type: Grant
    Filed: March 23, 2009
    Date of Patent: June 10, 2014
    Assignee: Fujitsu Limited
    Inventors: Kaori Endo, Yasuji Ota, Takeshi Otani, Taro Togawa
  • Patent number: 8744842
    Abstract: A robust method and apparatus to detect voice activity based on the power level of an audio frame. The method may include performing primary active/non-active voice period determination of an input audio frame according to a power level of the audio frame, extracting a noise power prediction value and a signal power prediction value by referring to power levels of current and previous audio frames according to a primary active/non-active voice period determination value, and performing secondary active/non-active voice period determination for the input audio frame by comparing the extracted signal power prediction value with the extracted noise power prediction value.
    Type: Grant
    Filed: May 28, 2008
    Date of Patent: June 3, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Jae-youn Cho
  • Patent number: 8738367
    Abstract: A speech signal processing device is equipped with a power acquisition unit, a probability distribution acquisition unit, and a correspondence degree determination unit. The power acquisition unit accepts an inputted speech signal and, based on the accepted speech signal, acquires power representing the intensity of a speech sound represented by the speech signal. The probability distribution acquisition unit acquires a probability distribution using the intensity of the power acquired by the power acquisition unit as a random variable. The correspondence degree determination unit determines whether a correspondence degree representing a degree that power acquired by the power acquisition unit in a case that a predetermined reference speech signal is inputted into the power acquisition unit corresponds with predetermined reference power is higher than a predetermined reference correspondence degree, based on the probability distribution acquired by the probability distribution acquisition unit.
    Type: Grant
    Filed: February 18, 2010
    Date of Patent: May 27, 2014
    Assignee: NEC Corporation
    Inventor: Tadashi Emori
  • Publication number: 20140142932
    Abstract: Embodiments of the present invention provide a method for producing an audio file and a terminal device. The method includes recording a user's voice to obtain audio information; generating a score curve according to the audio information and displaying the score curve; receiving a polishing instruction that is sent by the user by operating the score curve; adjusting the audio information according to the polishing instruction; and generating an audio file. The technical solutions provided in the present invention enable the user to create his or her own song on the terminal device, thereby improving functions of the terminal device and meeting an application requirement of the user.
    Type: Application
    Filed: October 14, 2013
    Publication date: May 22, 2014
    Applicant: Huawei Technologies Co., Ltd.
    Inventor: Rui Li
  • Patent number: 8731912
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for audible alert tones are disclosed. The methods, systems, and apparatus include actions of determining whether audio input data received after ceasing output of a first instance of an audible alert tone includes voice activity and determining whether to delay a successive instance of the audible alert tone based on determining whether the audio input data includes voice activity.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: May 20, 2014
    Assignee: Google Inc.
    Inventors: Simon Tickner, Peter J Hodgson, Richard Z. Cohen
  • Patent number: 8725498
    Abstract: A computer-implemented method for digital speech processing, including (1) receiving, at a server computer, digital speech data from a computing device, the digital speech data comprising data points sampled at respective time points; (2) computing, by the server computer, a tonal feature of the digital speech data, the tonal feature comprising information encoding fundamental frequencies at the respective time points; (3) computing, by the server computer, a logarithm of the tonal feature at the respective time points; and (4) processing, by the server computer, the logarithm of the tonal feature based on a characterization of the digital speech data at the respective time points.
    Type: Grant
    Filed: July 24, 2012
    Date of Patent: May 13, 2014
    Assignee: Google Inc.
    Inventors: Yun-hsuan Sung, Meihong Wang, Xin Lei
  • Patent number: 8725506
    Abstract: A speech processing engine is provided that, in some embodiments, employs Kalman filtering with a particular speaker's glottal information to clean up an audio speech signal for more efficient automatic speech recognition.
    Type: Grant
    Filed: June 30, 2010
    Date of Patent: May 13, 2014
    Assignee: Intel Corporation
    Inventors: Willem M. Beltman, Matias Zanartu, Arijit Raychowdhury, Anand P. Rangarajan, Michael E. Deisher
  • Patent number: 8719019
    Abstract: Speaker identification techniques are described. In one or more implementations, sample data is received at a computing device of one or more user utterances captured using a microphone. The sample data is processed by the computing device to identify a speaker of the one or more user utterances. The processing involves use of a feature set that includes features obtained using a filterbank having filters that space linearly at higher frequencies and logarithmically at lower frequencies, respectively, features that model the speaker's vocal tract transfer function, and features that indicate a vibration rate of vocal folds of the speaker of the sample data.
    Type: Grant
    Filed: April 25, 2011
    Date of Patent: May 6, 2014
    Assignee: Microsoft Corporation
    Inventors: Hoang T. Do, Ivan J. Tashev, Alejandro Acero, Jason S. Flaks, Robert N. Heitkamp, Molly R. Suver
  • Patent number: 8712765
    Abstract: A parameter decoding apparatus includes a prediction residue decoder that finds a quantized prediction residue based on encoded information included in a current frame subject to decoding and a moving-average predictor that produces a predicted parameter by multiplying a predictive coefficient with a past quantized prediction residue. An adder decodes a parameter by adding the quantized prediction residue and the predicted parameter, wherein the prediction residue decoder, when the current frame is erased, finds a current-frame quantized prediction residue from a weighted linear sum of a parameter decoded in the past and a future-frame quantized prediction residue.
    Type: Grant
    Filed: May 17, 2013
    Date of Patent: April 29, 2014
    Assignee: Panasonic Corporation
    Inventor: Hiroyuki Ehara
  • Patent number: 8712768
    Abstract: A method, device, system, and computer program product expand narrowband speech signals to wideband speech signals. The method includes determining signal type information from a signal, obtaining characteristics for forming an upper band signal using the determined signal type information, determining signal noise information, using the determined signal noise information to modify the obtained characteristics for forming the upper band signal, and forming the upper band signal using the modified characteristics.
    Type: Grant
    Filed: May 25, 2004
    Date of Patent: April 29, 2014
    Assignee: Nokia Corporation
    Inventors: Laura Laaksonen, Päivi Valve
  • Publication number: 20140114653
    Abstract: An apparatus comprising an analysis window definer configured to define at least one analysis window for a first audio signal, wherein the at least one analysis window definer is configured to be dependent on the first audio signal and a pitch estimator configured to determine a first pitch estimate for the first audio signal, wherein the pitch estimator is dependent on the first audio signal sample values within the analysis window.
    Type: Application
    Filed: May 6, 2011
    Publication date: April 24, 2014
    Applicant: Nokia Corporation
    Inventors: Lasse Juhani Laaksonen, Anssi Sakari Rämö, Adriana Vasilache, Mikko Tapio Tammi
  • Patent number: 8706488
    Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.
    Type: Grant
    Filed: February 27, 2013
    Date of Patent: April 22, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Michael D. Edgington, Laurence Gillick, Jordan R. Cohen
  • Patent number: 8700390
    Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
    Type: Grant
    Filed: October 7, 2013
    Date of Patent: April 15, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Bing Chen, James H. James
  • Patent number: 8694326
    Abstract: A communication terminal includes a decoder which decodes an input bitstream received from another communication terminal, to generate an output audio signal and outputs the generated output audio signal to a speaker; and an echo canceller which obtains an input audio signal representing sound captured by a microphone placed in a space to which the speaker outputs the sound, and removes, for respective subbands, an echo component included in the obtained input audio signal and corresponding to the output audio signal, to generate an audio signal for transmission. An encoder codes the audio signal for transmission to generate an output bitstream and transmits the generated output bitstream to another communication terminal; and a control unit controls, for the respective subbands, echo cancellation processing according to a reproduction band of at least one of the output audio signal and the audio signal for transmission.
    Type: Grant
    Filed: August 21, 2012
    Date of Patent: April 8, 2014
    Assignee: Panasonic Corporation
    Inventors: Shuji Miyasaka, Kosuke Nishio, Ichiro Kawashima
  • Patent number: 8682662
    Abstract: In accordance with an example embodiment of the invention, there is provided an apparatus for detecting voice activity in an audio signal. The apparatus comprises a first voice activity detector for making a first voice activity detection decision based at least in part on the voice activity of a first audio signal received from a first microphone. The apparatus also comprises a second voice activity detector for making a second voice activity detection decision based at least in part on an estimate of a direction of the first audio signal and an estimate of a direction of a second audio signal received from a second microphone. The apparatus further comprises a classifier for making a third voice activity detection decision based at least in part on the first and second voice activity detection decisions.
    Type: Grant
    Filed: August 13, 2012
    Date of Patent: March 25, 2014
    Assignee: Nokia Corporation
    Inventors: Riitta Elina Niemistö, Päivi Marianna Valve
  • Publication number: 20140081629
    Abstract: The quality of encoded signals can be improved by reclassifying AUDIO signals carrying non-speech data as VOICED signals when periodicity parameters of the signal satisfy one or more criteria. In some embodiments, only low or medium bit rate signals are considered for reclassification. The periodicity parameters can include any characteristic or set of characteristics indicative of periodicity. For example, the periodicity parameter may include pitch differences between subframes in the audio signal, a normalized pitch correlation for one or more subframes, an average normalized pitch correlation for the audio signal, or combinations thereof. Audio signals that are reclassified as VOICED signals may be encoded in the time domain, while audio signals that remain classified as AUDIO signals may be encoded in the frequency domain.
    Type: Application
    Filed: September 13, 2013
    Publication date: March 20, 2014
    Inventor: Yang Gao
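    Illustrative sketch (not taken from the application): the periodicity test described in this abstract could look roughly like the Python below. The function name, the per-subframe inputs, and all three thresholds (`max_pitch_diff`, `min_avg_corr`, `max_bit_rate_bps`) are hypothetical values chosen for the example; the abstract does not specify them.

```python
from statistics import mean


def should_reclassify_as_voiced(pitch_lags, norm_pitch_corrs, bit_rate_bps,
                                max_pitch_diff=3, min_avg_corr=0.7,
                                max_bit_rate_bps=24000):
    """Decide whether an AUDIO-classified frame is periodic enough for VOICED coding.

    pitch_lags:       one pitch lag (in samples) per subframe, non-empty
    norm_pitch_corrs: one normalized pitch correlation (0..1) per subframe, non-empty
    All thresholds are assumptions made for this sketch.
    """
    # Only low or medium bit rates are considered for reclassification.
    if bit_rate_bps > max_bit_rate_bps:
        return False
    # Pitch differences between consecutive subframes should be small.
    diffs = [abs(b - a) for a, b in zip(pitch_lags, pitch_lags[1:])]
    stable_pitch = all(d <= max_pitch_diff for d in diffs)
    # The average normalized pitch correlation should be high.
    strong_corr = mean(norm_pitch_corrs) >= min_avg_corr
    return stable_pitch and strong_corr


periodic = should_reclassify_as_voiced([80, 81, 80, 82], [0.9, 0.85, 0.8, 0.9], 12000)
erratic = should_reclassify_as_voiced([80, 120, 60, 90], [0.9, 0.85, 0.8, 0.9], 12000)
high_rate = should_reclassify_as_voiced([80, 81, 80, 82], [0.9, 0.85, 0.8, 0.9], 48000)
```

    A frame that passes would then be routed to a time-domain coder, and one that fails would stay with the frequency-domain coder, as the abstract describes.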
  • Patent number: 8676572
    Abstract: A computer-implemented system and method for enhancing audio to individuals participating in a conversation is provided. Audio data for individuals participating in one or more conversations is analyzed. Possible conversational configurations of the individuals are generated based on the audio data, and each possible conversational configuration includes one or more subconfigurations of at least two of the individuals. A probability weight is assigned to each of the subconfigurations and includes a likelihood that the individuals of that subconfiguration are participating in one of the conversations. A probability of each possible conversational configuration is determined by combining the probability weights for the subconfigurations of that possible conversational configuration. The possible conversational configuration with the highest probability is selected as a most probable configuration. The individuals participating in the conversations are determined based on the most probable configuration.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: March 18, 2014
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Paul M. Aoki, Margaret H. Szymanski, James D. Thornton, Daniel H. Wilson, Allison G. Woodruff
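    Illustrative sketch (not taken from the patent): the selection step in this abstract can be pictured as scoring each candidate configuration from the weights of its subconfigurations and keeping the best one. Combining the weights by multiplication, the function name, and the example weights are all assumptions; the abstract only says the weights are "combined".

```python
from math import prod


def most_probable_configuration(configurations, weights):
    """Pick the conversational configuration with the highest combined probability.

    configurations: list of configurations; each is a list of subconfigurations,
                    and each subconfiguration is a group of at least two individuals
    weights:        dict mapping frozenset(group) -> probability weight
    The product rule used to combine weights is an assumption of this sketch.
    """
    def config_probability(config):
        return prod(weights[frozenset(group)] for group in config)

    return max(configurations, key=config_probability)


# Four participants, two ways of pairing them into conversations (made-up weights).
weights = {
    frozenset({"A", "B"}): 0.9,
    frozenset({"C", "D"}): 0.8,
    frozenset({"A", "C"}): 0.3,
    frozenset({"B", "D"}): 0.4,
}
configs = [
    [("A", "B"), ("C", "D")],   # combined weight 0.9 * 0.8
    [("A", "C"), ("B", "D")],   # combined weight 0.3 * 0.4
]
best = most_probable_configuration(configs, weights)
```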
  • Patent number: 8670980
    Abstract: A tone determination device, which determines the tonality of an input signal, is capable of reducing calculation complexity. Therein a frequency conversion unit (101) converts the frequency of an input signal; a downsampling unit (102) carries out shortening processing which shortens the vector series length of the frequency-converted signal; a constancy determination unit (107) determines the constancy of the input signal; depending on the constancy of the input signal, a vector selection unit (104) selects either the vector series of the post-frequency conversion signal or the vector series after the shortening of the vector series length; a correlation analysis unit (105) uses the vector series selected by the vector selection unit (104) to obtain correlations; and a tone determination unit (106) uses the correlations to determine the tonality of the input signal.
    Type: Grant
    Filed: October 26, 2010
    Date of Patent: March 11, 2014
    Assignee: Panasonic Corporation
    Inventor: Kaoru Satoh
  • Patent number: 8665311
    Abstract: Improved methods and apparatus for sharing and collaborating around a video source by maintaining and providing to users a list of a plurality of contacts containing both video source device contacts and interactive message source contacts. This allows for collaboration among users by permitting them to communicate with each other around an automatically-shared video source, to interact with automatically shared video sources, and to make decisions based on these shared video sources.
    Type: Grant
    Filed: February 17, 2011
    Date of Patent: March 4, 2014
    Assignee: vBrick Systems, Inc.
    Inventors: Erik Herz, Douglas Uhl
  • Patent number: 8654761
    Abstract: In one embodiment, a method can include: (i) establishing an internet protocol (IP) connection; (ii) forming a buffered version of a plurality of voice frame slices from received audio packets; and (iii) when an erasure is detected, performing a packet loss concealment (PLC) to provide a synthesized speech signal for the erasure, where the PLC can include: (a) identifying first and second pitches from the buffered version of the plurality of voice frame slices; and (b) forming the synthesized speech signal by using the first and second pitches, and more if needed, followed by an overlay-add (OLA).
    Type: Grant
    Filed: December 17, 2012
    Date of Patent: February 18, 2014
    Assignee: Cisco Technology, Inc.
    Inventors: Duanpei Wu, Luke K. Surazski
  • Patent number: 8655320
    Abstract: A voice messaging system includes a transceiver, an indicator, a microphone, and a speaker. The transceiver is operable to receive a message from the Internet, and the indicator is operable to announce that the message has been received. The microphone is operable to receive a verbal request to play the message, and the speaker is operable to play the recorded message in response to receiving the verbal request.
    Type: Grant
    Filed: April 14, 2009
    Date of Patent: February 18, 2014
    Assignee: CA, Inc.
    Inventors: Christopher J. Stakutis, Thomas M. Boyle, Steven L. Greenspan
  • Publication number: 20140046658
    Abstract: An audio classifier for frame based audio signal classification includes a feature extractor configured to determine, for each of a predetermined number of consecutive frames, feature measures representing at least the following features: auto correlation, frame signal energy, inter-frame signal energy variation. A feature measure comparator is configured to compare each determined feature measure to at least one corresponding predetermined feature interval. A frame classifier is configured to calculate, for each feature interval, a fraction measure representing the total number of corresponding feature measures that fall within the feature interval, and to classify the latest of the consecutive frames as speech if each fraction measure lies within a corresponding fraction interval, and as non-speech otherwise.
    Type: Application
    Filed: April 28, 2011
    Publication date: February 13, 2014
    Applicant: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL)
    Inventors: Volodya Grancharov, Sebastian Näslund
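    Illustrative sketch (not taken from the application): the fraction-based decision rule in this abstract could be written as below. The feature names, the interval bounds, and the use of a single interval per feature are assumptions made for the example.

```python
def classify_latest_frame(frames, feature_intervals, fraction_intervals):
    """Classify the latest of several consecutive frames as speech or non-speech.

    frames:             list of dicts, one per consecutive frame, feature name -> measure
    feature_intervals:  feature name -> (low, high) bounds for the feature measure
    fraction_intervals: feature name -> (low, high) bounds for the fraction measure
    """
    for name, (lo, hi) in feature_intervals.items():
        # Fraction of frames whose feature measure falls inside the feature interval.
        inside = sum(1 for frame in frames if lo <= frame[name] <= hi)
        fraction = inside / len(frames)
        f_lo, f_hi = fraction_intervals[name]
        # Every fraction must lie in its fraction interval for a speech decision.
        if not (f_lo <= fraction <= f_hi):
            return "non-speech"
    return "speech"


# Hypothetical measures for four consecutive frames.
frames = [
    {"autocorr": 0.8, "energy": 0.6, "energy_delta": 0.10},
    {"autocorr": 0.7, "energy": 0.5, "energy_delta": 0.20},
    {"autocorr": 0.9, "energy": 0.7, "energy_delta": 0.10},
    {"autocorr": 0.2, "energy": 0.6, "energy_delta": 0.30},
]
feature_intervals = {"autocorr": (0.5, 1.0), "energy": (0.4, 1.0), "energy_delta": (0.0, 0.25)}
fraction_intervals = {"autocorr": (0.7, 1.0), "energy": (0.7, 1.0), "energy_delta": (0.7, 1.0)}
label = classify_latest_frame(frames, feature_intervals, fraction_intervals)
```

    Here three of four frames fall inside the autocorrelation and energy-variation intervals and all four inside the energy interval, so every fraction is at least 0.75 and the latest frame is labeled as speech.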
  • Patent number: 8649283
    Abstract: A method executed by a packet analysis apparatus for analyzing packets including voice packets and non-voice packets includes: capturing packets in a specific session; storing the captured packets in a storage; screening the stored packets to count up a receipt count of voice packets; determining whether packet loss has occurred in the specific session; and determining whether loss packets are voice packets in accordance with received packets adjacent to the loss packets to count up a loss count of voice packets when the packet loss has occurred.
    Type: Grant
    Filed: June 14, 2010
    Date of Patent: February 11, 2014
    Assignee: Fujitsu Limited
    Inventors: Sumiyo Okada, Noriyuki Fukuyama, Masanobu Morinaga, Hideaki Miyazaki
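    Illustrative sketch (not taken from the patent): one way to count received and lost voice packets from a captured session. Detecting loss from gaps in sequence numbers, and treating a lost packet as voice only when the received packets on both sides of the gap are voice, are assumptions of this example; the abstract says only that the decision uses the received packets adjacent to the loss.

```python
def count_voice_packets(received):
    """Return (received_voice_count, lost_voice_count) for one session.

    received: list of (sequence_number, is_voice) tuples sorted by sequence number
    """
    # Screen the stored packets and count the voice packets that arrived.
    received_voice = sum(1 for _, is_voice in received if is_voice)

    lost_voice = 0
    for (prev_seq, prev_voice), (next_seq, next_voice) in zip(received, received[1:]):
        gap = next_seq - prev_seq - 1          # packets missing between neighbours
        # Assumed rule: missing packets count as voice if both neighbours are voice.
        if gap > 0 and prev_voice and next_voice:
            lost_voice += gap
    return received_voice, lost_voice


# Packets 3 and 4 are missing between voice packets; packet 7 between non-voice ones.
captured = [(1, True), (2, True), (5, True), (6, False), (8, False), (9, True)]
counts = count_voice_packets(captured)
```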
  • Publication number: 20140006017
    Abstract: Arrangements are described that may be used to reduce the intelligibility of speech using masker signals which are obfuscated yet correlated versions of the speech. Other applications of pitch analysis and demodulation are also described. A system may be used to drive an array of loudspeakers to produce a sound field that includes a source component, whose energy is concentrated along a first direction relative to the array, and a masking component that is based on an estimated intensity of the source component in a second direction that is different from the first direction.
    Type: Application
    Filed: February 28, 2013
    Publication date: January 2, 2014
    Applicant: QUALCOMM Incorporated
    Inventor: Dipanjan Sen
  • Publication number: 20140006018
    Abstract: In a voice processing apparatus, a processor is configured to adjust a fundamental frequency of a first voice signal corresponding to a voice having target voice characteristics to a fundamental frequency of a second voice signal corresponding to a voice having initial voice characteristics different from the target voice characteristics. The processor is further configured to sequentially generate a processed spectrum based on a spectrum of the first voice signal and a spectrum of the second voice signal by: dividing the spectrum of the first voice signal into a plurality of harmonic band components after the fundamental frequency of the first voice signal has been adjusted; allocating each harmonic band component of the first voice signal to each harmonic frequency associated with the fundamental frequency of the second voice signal; and adjusting an envelope and phase of each harmonic band component according to the spectrum of the second voice signal.
    Type: Application
    Filed: June 20, 2013
    Publication date: January 2, 2014
    Applicant: Yamaha Corporation
    Inventors: Jordi BONADA, Merlijn BLAAUW, Yuji HISAMINATO