Voiced Or Unvoiced Patents (Class 704/208)
-
Patent number: 8924220
Abstract: In a multiband compressor 100, a level calculation unit 121 calculates a signal level inputted for each of bands, a gain calculation unit 122 calculates a gain value from the calculated signal level, and a gain limitation unit 130 limits a gain value by comparison with a gain value of the other band in a compressor for each band. With this configuration, provided is a multiband compressor capable of achieving a balance between the quality of sound and the effect of enhancing the sound level at a high level.
Type: Grant
Filed: September 7, 2010
Date of Patent: December 30, 2014
Assignee: Lenovo Innovations Limited (Hong Kong)
Inventor: Satoshi Hosokawa
-
Patent number: 8918324
Abstract: A method for coding and decoding an audio signal or speech signal and an apparatus adopting the method are provided.
Type: Grant
Filed: January 27, 2010
Date of Patent: December 23, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventors: Ki Hyun Choo, Jung-Hoe Kim, Eun Mi Oh, Ho Sang Sung
-
Patent number: 8909519
Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
Type: Grant
Filed: March 10, 2014
Date of Patent: December 9, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Bing Chen, James H. James
-
Patent number: 8909518
Abstract: A warping factor estimation system comprises label information generation unit that outputs voice/non-voice label information, warp model storage unit in which a probability model representing voice and non-voice occurrence probabilities is stored, and warp estimation unit that calculates a warping factor in the frequency axis direction using the probability model representing voice and non-voice occurrence probabilities, voice and non-voice labels, and a cepstrum.
Type: Grant
Filed: September 22, 2008
Date of Patent: December 9, 2014
Assignee: NEC Corporation
Inventor: Tadashi Emori
-
Patent number: 8903721
Abstract: A mute setting is automatically set based on a speech detection result for acoustic signals received by a device. A device detects the speech based on a variety of cues from acoustic signals received using one or more microphones. If speech is detected within one or more frames, a mute setting may be automatically turned off. If speech is not detected, a mute setting may be automatically turned on. A mute setting may remain on as long as speech is not detected within the received acoustic signals. A varying delay may be implemented to help avoid false detections. The delay may be utilized during a mute-on state, and gradually removed during a transition from a mute-on state to a mute-off state.
Type: Grant
Filed: October 20, 2010
Date of Patent: December 2, 2014
Assignee: Audience, Inc.
Inventor: Matthew Cowan
-
Patent number: 8898058
Abstract: Systems, methods, apparatus, and machine-readable media for voice activity detection in a single-channel or multichannel audio signal are disclosed.
Type: Grant
Filed: October 24, 2011
Date of Patent: November 25, 2014
Assignee: QUALCOMM Incorporated
Inventors: Jongwon Shin, Erik Visser, Ian Ernan Liu
-
Publication number: 20140343934
Abstract: A method, apparatus, and speech synthesis system are disclosed for classifying unvoiced and voiced sound. The method includes: setting an unvoiced and voiced sound classification question set; using speech training data and the unvoiced and voiced sound classification question set for training a sound classification model of a binary decision tree structure, where the binary decision tree structure includes non-leaf nodes and leaf nodes, the non-leaf nodes represent questions in the unvoiced and voiced sound classification question set, and the leaf nodes represent unvoiced and voiced sound classification results; and receiving speech test data, and using the trained sound classification model to decide whether the speech test data is unvoiced sound or voiced sound.
Type: Application
Filed: February 21, 2014
Publication date: November 20, 2014
Applicant: Tencent Technology (Shenzhen) Company Limited
Inventor: Zongyao TANG
-
Patent number: 8886528
Abstract: A highlight section including an exciting scene is appropriately extracted with smaller amount of processing. A reflection coefficient calculating unit (12) calculates a parameter (reflection coefficient) representing a slope of spectrum distribution of the input audio signal for each frame. A reflection coefficient comparison unit (13) calculates an amount of change in the reflection coefficients between adjacent frames, and compares the calculation result with a predetermined threshold. An audio signal classifying unit (14) classifies the input audio signal into a background noise section and a speech section based on the comparison result. A background noise level calculating unit (15) calculates a level of a background noise in the background noise section based on signal energy in the background noise section. An event detecting unit (16) detects an event occurring point from a sharp increase in the background noise level.
Type: Grant
Filed: June 2, 2010
Date of Patent: November 11, 2014
Assignee: Panasonic Corporation
Inventor: Naoya Tanaka
-
Patent number: 8879762
Abstract: A method and apparatus to evaluate a quality of an audio signal, in which the number of effective channels is determined for each of a reference signal of a current frame and a test signal indicative of the reference signal that has passed through an audio codec, and an audio quality evaluation score of the current frame is calculated by evaluating an audio quality of the current frame based on the determined number of effective channels for each of the reference signal and the test signal by means of a predetermined evaluator.
Type: Grant
Filed: January 28, 2010
Date of Patent: November 4, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventor: In-Yong Choi
-
Patent number: 8868432
Abstract: A method for decoding an audio signal having a bandwidth that extends beyond a bandwidth of a CELP excitation signal in an audio decoder including a CELP-based decoder element. The method includes obtaining a second excitation signal having an audio bandwidth extending beyond the audio bandwidth of the CELP excitation signal, obtaining a set of signals by filtering the second excitation signal with a set of bandpass filters, scaling the set of signals using a set of energy-based parameters, and obtaining a composite output signal by combining the scaled set of signals with a signal based on the audio signal decoded by the CELP-based decoder element.
Type: Grant
Filed: September 28, 2011
Date of Patent: October 21, 2014
Assignee: Motorola Mobility LLC
Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
-
Patent number: 8861746
Abstract: A sound processing apparatus includes a target sound emphasizing unit configured to acquire a sound frequency component by emphasizing target sound in input sound in which the target sound and noise are included, a target sound suppressing unit configured to acquire a noise frequency component by suppressing the target sound in the input sound, a gain computing unit configured to compute a gain value to be multiplied by the sound frequency component using a gain function that provides a gain value and has a slope that are less than predetermined values when an energy ratio of the sound frequency component to the noise frequency component is less than or equal to a predetermined value, and a gain multiplier unit configured to multiply the sound frequency component by the gain value computed by the gain computing unit.
Type: Grant
Filed: March 7, 2011
Date of Patent: October 14, 2014
Assignee: Sony Corporation
Inventors: Toshiyuki Sekiya, Keiichi Osako, Mototsugu Abe
-
Publication number: 20140297272
Abstract: The present invention generally relates to intelligent voice communication systems. Specifically, this invention relates to systems and methods for providing intelligent interactive voice communication services to users of a telephony means. Preferred embodiments of the invention are directed to providing interactive voice communication services in the form of intelligent and interactive automated prank calling services.
Type: Application
Filed: April 2, 2013
Publication date: October 2, 2014
Inventor: Fahim Saleh
-
Patent number: 8831942
Abstract: A method is provided for identifying a gender of a speaker. The method steps include obtaining speech data of the speaker, extracting vowel-like speech frames from the speech data, analyzing the vowel-like speech frames to generate a feature vector having pitch values corresponding to the vowel-like frames, analyzing the pitch values to generate a most frequent pitch value, determining, in response to the most frequent pitch value being between a first pre-determined threshold and a second pre-determined threshold, an output of a male Gaussian Mixture Model (GMM) and an output of a female GMM using the pitch values as inputs to the male GMM and the female GMM, and identifying the gender of the speaker by comparing the output of the male GMM and the output of the female GMM based on a pre-determined criterion.
Type: Grant
Filed: March 19, 2010
Date of Patent: September 9, 2014
Assignee: Narus, Inc.
Inventor: Antonio Nucci
-
Patent number: 8825477
Abstract: In one configuration, erasure of a significant frame of a sustained voiced segment is detected. An adaptive codebook gain value for the erased frame is calculated based on the preceding frame. If the calculated value is less than (alternatively, not greater than) a threshold value, a higher adaptive codebook gain value is used for the erased frame. The higher value may be derived from the calculated value or selected from among one or more predefined values.
Type: Grant
Filed: December 13, 2010
Date of Patent: September 2, 2014
Assignee: Qualcomm Incorporated
Inventors: Venkatesh Krishnan, Ananthapadmanabhan Arasanipatai Kandhadai
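The gain rule in the abstract above can be illustrated with a minimal sketch. This is not the patented implementation: the function name `concealment_gain`, the attenuation factor, the threshold, and the predefined fallback gain are all hypothetical values chosen for illustration, and the "calculated from the preceding frame" step is assumed to be a simple attenuation of the previous gain.

```python
def concealment_gain(prev_gain, attenuation=0.9, threshold=0.5, fallback_gain=0.8):
    """Pick an adaptive codebook gain for an erased frame (illustrative only).

    prev_gain     -- gain of the preceding, correctly received frame
    attenuation   -- assumed decay applied when extrapolating from that frame
    threshold     -- assumed level below which the calculated gain is too low
    fallback_gain -- assumed predefined higher gain (must be >= threshold)
    """
    # Step 1: calculate a gain for the erased frame from the preceding frame.
    calculated = prev_gain * attenuation
    # Step 2: if it falls below the threshold, use the higher predefined value.
    if calculated < threshold:
        return fallback_gain
    return calculated


if __name__ == "__main__":
    for g in (1.0, 0.7, 0.4):
        print(g, "->", concealment_gain(g))
```

Selecting a predefined value is only one of the two options the abstract mentions; deriving the higher value from the calculated one (for example by scaling it up) would be an equally valid reading.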
-
Patent number: 8825478
Abstract: Audio content is converted to text using speech recognition software. The text is then associated with a distinct voice or a generic placeholder label if no distinction can be made. From the text and voice information, a word cloud is generated based on key words and key speakers. A visualization of the cloud displays as it is being created. Words grow in size in relation to their dominance. When it is determined that the predominant words or speakers have changed, the word cloud is complete. That word cloud continues to be displayed statically and a new word cloud display begins based upon a new set of predominant words or a new predominant speaker or set of speakers. This process may continue until the meeting is concluded. At the end of the meeting, the completed visualization may be saved to a storage device, sent to selected individuals, removed, or any combination of the preceding.
Type: Grant
Filed: January 10, 2011
Date of Patent: September 2, 2014
Assignee: Nuance Communications, Inc.
Inventors: Susan Marie Cox, Janani Janakiraman, Fang Lu, Loulwa F Salem
-
Patent number: 8805685
Abstract: Disclosed herein are systems, methods, and tangible computer readable-media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
Type: Grant
Filed: August 5, 2013
Date of Patent: August 12, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Horst J. Schroeter
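The pairwise-comparison idea in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the patented method: each speech sample is assumed to be already reduced to a fixed-length feature vector, "variance" is approximated by the mean pairwise Euclidean distance, and the name `verify_by_variation` and the threshold parameter are hypothetical.

```python
import math
from itertools import combinations


def verify_by_variation(feature_vectors, min_mean_distance):
    """Return True if repeated utterances differ enough to look like live speech.

    feature_vectors   -- one equal-length numeric vector per speech sample (assumed)
    min_mean_distance -- assumed pre-determined threshold on mean pairwise distance
    """
    pairs = list(combinations(feature_vectors, 2))
    if not pairs:
        raise ValueError("need at least two speech samples to compare")
    # Compare every sample with every other sample.
    mean_distance = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
    # Identical or nearly identical samples suggest synthetic or replayed speech.
    return mean_distance >= min_mean_distance


if __name__ == "__main__":
    replayed = [(1.0, 2.0), (1.0, 2.0), (1.0, 2.0)]
    live = [(1.0, 2.0), (1.5, 2.0), (1.0, 2.6)]
    print(verify_by_variation(replayed, 0.1))
    print(verify_by_variation(live, 0.1))
```

Raising `min_mean_distance` corresponds loosely to the embodiment in which the threshold is adjusted to the required authentication certainty.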
-
Patent number: 8805694
Abstract: A method and an apparatus for encoding and decoding audio signals using adaptive sinusoidal coding are provided. The audio signal encoding method includes the steps of dividing a synthesized audio signal into a plurality of sub-bands, calculating the energy of each sub-band, selecting a predetermined number of sub-bands having a relatively large amount of energy from the sub-bands, and performing sinusoidal coding with regard to the selected sub-bands. Application of sinusoidal coding based on consideration of the amount of energy of each sub-band of the synthesized signal improves the quality of the synthesized signal more efficiently.
Type: Grant
Filed: February 16, 2010
Date of Patent: August 12, 2014
Assignee: Electronics and Telecommunications Research Institute
Inventors: Mi-Suk Lee, Hyun-Joo Bae, Byung-Sun Lee
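The sub-band selection step described above (divide, measure energy, keep the strongest bands) can be sketched in a few lines. The sketch assumes the signal is already available as a list of spectral coefficients split into equal-width bands; the function name `select_subbands` and its parameters are hypothetical, and the sinusoidal coding itself is not shown.

```python
def select_subbands(coefficients, num_bands, num_selected):
    """Return (indices of the highest-energy sub-bands, per-band energies).

    coefficients -- spectral coefficients of one frame (assumed representation)
    num_bands    -- number of equal-width sub-bands to divide the frame into
    num_selected -- predetermined number of sub-bands to keep
    """
    band_size = len(coefficients) // num_bands
    # Energy of a band is the sum of squared coefficients inside it.
    energies = [
        sum(c * c for c in coefficients[b * band_size:(b + 1) * band_size])
        for b in range(num_bands)
    ]
    # Rank bands from highest to lowest energy and keep the top ones.
    ranked = sorted(range(num_bands), key=lambda b: energies[b], reverse=True)
    return sorted(ranked[:num_selected]), energies


if __name__ == "__main__":
    frame = [1, 1, 3, 3, 0, 0, 2, 2]
    selected, energies = select_subbands(frame, num_bands=4, num_selected=2)
    print(selected, energies)
```

Only the bands returned in `selected` would then be passed on to the sinusoidal coder.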
-
Publication number: 20140222421
Abstract: A speech-synthesizing device includes a hierarchical prosodic module, a prosody-analyzing device, and a prosody-synthesizing unit. The hierarchical prosodic module generates at least a first hierarchical prosodic model. The prosody-analyzing device receives a low-level linguistic feature, a high-level linguistic feature and a first prosodic feature, and generates at least a prosodic tag based on the low-level linguistic feature, the high-level linguistic feature, the first prosodic feature and the first hierarchical prosodic model. The prosody-synthesizing unit synthesizes a second prosodic feature based on the hierarchical prosodic module, the low-level linguistic feature and the prosodic tag.
Type: Application
Filed: January 30, 2014
Publication date: August 7, 2014
Applicant: National Chiao Tung University
Inventors: Sin-Horng Chen, Yih-Ru Wang, Chen-Yu Chiang, Chiao-Hua Hsieh
-
Patent number: 8798991
Abstract: A non-speech section detecting device generating a plurality of frames having a given time length on the basis of sound data obtained by sampling sound, and detecting a non-speech section having a frame not containing voice data based on speech uttered by a person, the device including: a calculating part calculating a bias of a spectrum obtained by converting sound data of each frame into components on a frequency axis; a judging part judging whether the bias is greater than or equal to a given threshold or alternatively smaller than or equal to a given threshold; a counting part counting the number of consecutive frames judged as having a bias greater than or equal to the threshold or alternatively smaller than or equal to the threshold; a count judging part judging whether the obtained number of consecutive frames is greater than or equal to a given value.
Type: Grant
Filed: November 13, 2012
Date of Patent: August 5, 2014
Assignee: Fujitsu Limited
Inventors: Nobuyuki Washio, Shoji Hayakawa
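The judging, counting, and count-judging parts described above amount to a run-length test on per-frame values. The sketch below illustrates that logic only; it assumes the spectral bias of each frame has already been computed, uses the "greater than or equal to" variant of the comparison, and the name `find_nonspeech_sections` and its parameters are hypothetical.

```python
def find_nonspeech_sections(biases, threshold, min_frames):
    """Return (start, end) frame ranges, end exclusive, judged as non-speech.

    biases     -- one precomputed spectral-bias value per frame (assumed input)
    threshold  -- given threshold the bias is compared against
    min_frames -- given minimum number of consecutive frames for a section
    """
    sections = []
    run_start = None  # index where the current run of qualifying frames began
    for i, bias in enumerate(biases):
        if bias >= threshold:
            if run_start is None:
                run_start = i
        else:
            # The run ended; keep it only if it was long enough.
            if run_start is not None and i - run_start >= min_frames:
                sections.append((run_start, i))
            run_start = None
    # Handle a run that extends to the last frame.
    if run_start is not None and len(biases) - run_start >= min_frames:
        sections.append((run_start, len(biases)))
    return sections


if __name__ == "__main__":
    frame_biases = [0.1, 0.9, 0.8, 0.95, 0.2, 0.9, 0.1]
    print(find_nonspeech_sections(frame_biases, threshold=0.7, min_frames=3))
```

Requiring a minimum run length keeps a single outlier frame from being reported as a non-speech section.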
-
Patent number: 8793124
Abstract: A scheme to judge emphasized speech portions, wherein the judgment is executed by a statistical processing in terms of a set of speech parameters including a fundamental frequency, power and a temporal variation of a dynamic measure and/or their derivatives. The emphasized speech portions are used for clues to summarize an audio content or a video content with a speech.
Type: Grant
Filed: April 5, 2006
Date of Patent: July 29, 2014
Assignee: Nippon Telegraph and Telephone Corporation
Inventors: Kota Hidaka, Shinya Nakajima, Osamu Mizuno, Hidetaka Kuwano, Haruhiko Kojima
-
Patent number: 8792777
Abstract: The present invention is directed to system(s), method(s), and apparatus for accurate fast forward rate when performing trick play with variable distance between frames. In one embodiment, there is presented a circuit for providing a fast forward video sequence. The circuit comprises a system time clock for providing a time reference, said time reference incremented at a predetermined fast forward rate; a comparator for comparing the time reference with timing information associated with a picture; and a controller for determining whether to display the picture based at least in part on the comparison between the timing information and the time reference.
Type: Grant
Filed: January 10, 2007
Date of Patent: July 29, 2014
Assignee: Broadcom Corporation
Inventor: Tim Ross
-
Patent number: 8775168
Abstract: A Yule-Walker based, low-complexity voice activity detector (VAD) is disclosed. An input signal is typically noisy speech (i.e., corrupted with, for example, babble noise). In one embodiment, a first initialization stage of the VAD computes an occurrence of a silent period within the input signal and the AR parameters. The VAD could accordingly compute a tentative adaptive threshold and output hypothesis H1 (which means speech is present) during this stage. During the second initialization stage, the VAD generally builds a database of associated values and computes the adaptive threshold accordingly. The second initialization stage could also output tentative VAD decisions based on the tentative threshold computed in the first initialization stage. Finally, the VAD periodically retrains or updates AR parameters, threshold values and/or the database and outputs VAD decisions accordingly.
Type: Grant
Filed: August 3, 2007
Date of Patent: July 8, 2014
Assignee: STMicroelectronics Asia Pacific PTE, Ltd.
Inventors: Karthik Muralidhar, Anoop Kumar Krishna
-
Publication number: 20140180683
Abstract: Systems and methods for adjusting pitch of an audio signal include detecting input notes in the audio signal, mapping the input notes to corresponding output notes, each output note having an associated upper note boundary and lower note boundary, and modifying at least one of the upper note boundary and the lower note boundary of at least one output note in response to previously received input notes. Pitch of the input notes may be shifted to match an associated pitch of corresponding output notes. Delay of the pitch shifting process may be dynamically adjusted based on detected stability of the input notes.
Type: Application
Filed: December 21, 2012
Publication date: June 26, 2014
Applicant: HARMAN INTERNATIONAL INDUSTRIES, INC.
Inventors: Peter R. Lupini, Glen A. Rutledge, Norm Campbell
-
Patent number: 8762139
Abstract: A noise suppression device includes: a power spectrum calculator converting an input signal of time domain into power spectra of frequency domain; a voice/noise determination unit determining whether the power spectra indicate voice or noise; a noise spectrum estimation unit estimating noise spectra of the power spectra; a period component estimation unit analyzing a harmonic structure constituting the power spectra and estimating periodical information about the power spectra; a weighting coefficient calculator calculating a weighting coefficient for weighting the power spectra; a suppression coefficient calculator calculating a suppression coefficient for suppressing noise included in the power spectra; a spectrum suppression unit suppressing amplitude of the power spectra in accordance with the suppression coefficient; and an inverse Fourier transformer converting the power spectra output by the spectrum suppression unit into a signal of time domain to generate a noise-suppressed signal.
Type: Grant
Filed: September 21, 2010
Date of Patent: June 24, 2014
Assignee: Mitsubishi Electric Corporation
Inventors: Satoru Furuta, Hirohisa Tasaki
-
Patent number: 8762158
Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.
Type: Grant
Filed: August 5, 2011
Date of Patent: June 24, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hyun-wook Kim, Han-gil Moon, Sang-hoon Lee
-
Patent number: 8751221
Abstract: A communication apparatus for adjusting a received voice signal in accordance with an ambient noise, the communication apparatus includes: a microphone for receiving an ambient noise and input voice and outputting a voice input signal corresponding to a level of the input voice and the ambient noise; a receiver for receiving the voice signal; a processor for extracting a voice component originated by a sender and an ambient noise component originated by the ambient noise, determining the ratio between the voice component and the ambient noise component, and adjusting the amplitude of the received voice signal in accordance with the ratio; and a speaker for outputting a reception voice corresponding to the adjusted reception voice signal.
Type: Grant
Filed: March 23, 2009
Date of Patent: June 10, 2014
Assignee: Fujitsu Limited
Inventors: Kaori Endo, Yasuji Ota, Takeshi Otani, Taro Togawa
-
Patent number: 8744842
Abstract: A robust method and apparatus to detect voice activity based on the power level of an audio frame. The method may include performing primary active/non-active voice period determination of an input audio frame according to a power level of the audio frame, extracting a noise power prediction value and a signal power prediction value by referring to power levels of current and previous audio frames according to a primary active/non-active voice period determination value, and performing secondary active/non-active voice period determination for the input audio frame by comparing the extracted signal power prediction value with the extracted noise power prediction value.
Type: Grant
Filed: May 28, 2008
Date of Patent: June 3, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventor: Jae-youn Cho
-
Patent number: 8738367
Abstract: A speech signal processing device is equipped with a power acquisition unit, a probability distribution acquisition unit, and a correspondence degree determination unit. The power acquisition unit accepts an inputted speech signal and, based on the accepted speech signal, acquires power representing the intensity of a speech sound represented by the speech signal. The probability distribution acquisition unit acquires a probability distribution using the intensity of the power acquired by the power acquisition unit as a random variable. The correspondence degree determination unit determines whether a correspondence degree representing a degree that power acquired by the power acquisition unit in a case that a predetermined reference speech signal is inputted into the power acquisition unit corresponds with predetermined reference power is higher than a predetermined reference correspondence degree, based on the probability distribution acquired by the probability distribution acquisition unit.
Type: Grant
Filed: February 18, 2010
Date of Patent: May 27, 2014
Assignee: NEC Corporation
Inventor: Tadashi Emori
-
Publication number: 20140142932
Abstract: Embodiments of the present invention provide a method for producing an audio file and a terminal device. The method includes recording a user's voice to obtain audio information, generating a score curve according to the audio information, and displaying the score curve; receiving a polishing instruction that is sent by the user by operating the score curve, and adjusting the audio information according to the polishing instruction, and generating an audio file. The technical solutions provided in the present invention enable the user to create a song of himself or herself on the terminal device, thereby improving functions of the terminal device and meeting an application requirement of the user.
Type: Application
Filed: October 14, 2013
Publication date: May 22, 2014
Applicant: Huawei Technologies Co., Ltd.
Inventor: Rui Li
-
Patent number: 8731912
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for audible alert tones are disclosed. The methods, systems, and apparatus include actions of determining whether audio input data received after ceasing output of a first instance of an audible alert tone includes voice activity and determining whether to delay a successive instance of the audible alert tone based on determining whether the audio input data includes voice activity.
Type: Grant
Filed: March 14, 2013
Date of Patent: May 20, 2014
Assignee: Google Inc.
Inventors: Simon Tickner, Peter J Hodgson, Richard Z. Cohen
-
Patent number: 8725498
Abstract: A computer-implemented method for digital speech processing, including (1) receiving, at a server computer, digital speech data from a computing device, the digital speech data comprising data points sampled at respective time points; (2) computing, by the server computer, a tonal feature of the digital speech data, the tonal feature comprising information encoding fundamental frequencies at the respective time points; (3) computing, by the server computer, a logarithm of the tonal feature at the respective time points; and (4) processing, by the server computer, the logarithm of the tonal feature based on a characterization of the digital speech data at the respective time points.
Type: Grant
Filed: July 24, 2012
Date of Patent: May 13, 2014
Assignee: Google Inc.
Inventors: Yun-hsuan Sung, Meihong Wang, Xin Lei
-
Patent number: 8725506
Abstract: A speech processing engine is provided that, in some embodiments, employs Kalman filtering with a particular speaker's glottal information to clean up an audio speech signal for more efficient automatic speech recognition.
Type: Grant
Filed: June 30, 2010
Date of Patent: May 13, 2014
Assignee: Intel Corporation
Inventors: Willem M. Beltman, Matias Zanartu, Arijit Raychowdhury, Anand P. Rangarajan, Michael E. Deisher
-
Patent number: 8719019
Abstract: Speaker identification techniques are described. In one or more implementations, sample data is received at a computing device of one or more user utterances captured using a microphone. The sample data is processed by the computing device to identify a speaker of the one or more user utterances. The processing involving use of a feature set that includes features obtained using a filterbank having filters that space linearly at higher frequencies and logarithmically at lower frequencies, respectively, features that model the speaker's vocal tract transfer function, and features that indicate a vibration rate of vocal folds of the speaker of the sample data.
Type: Grant
Filed: April 25, 2011
Date of Patent: May 6, 2014
Assignee: Microsoft Corporation
Inventors: Hoang T. Do, Ivan J. Tashev, Alejandro Acero, Jason S. Flaks, Robert N. Heitkamp, Molly R. Suver
-
Patent number: 8712765
Abstract: A parameter decoding apparatus includes a prediction residue decoder that finds a quantized prediction residue based on encoded information included in a current frame subject to decoding and a moving-average predictor produces a predicted parameter by multiplying a predictive coefficient with a past quantized prediction residue. An adder decodes a parameter by adding the quantized prediction residue and the predicted parameter, wherein the prediction residue decoder, when the current frame is erased, finds a current-frame quantized prediction residue from a weighted linear sum of a parameter decoded in the past and a future-frame quantized prediction residue.
Type: Grant
Filed: May 17, 2013
Date of Patent: April 29, 2014
Assignee: Panasonic Corporation
Inventor: Hiroyuki Ehara
-
Patent number: 8712768
Abstract: A method, device, system, and computer program product expand narrowband speech signals to wideband speech signals. The method includes determining signal type information from a signal, obtaining characteristics for forming an upper band signal using the determined signal type information, determining signal noise information, using the determined signal noise information to modify the obtained characteristics for forming the upper band signal, and forming the upper band signal using the modified characteristics.
Type: Grant
Filed: May 25, 2004
Date of Patent: April 29, 2014
Assignee: Nokia Corporation
Inventors: Laura Laaksonen, Päivi Valve
-
Publication number: 20140114653
Abstract: An apparatus comprising an analysis window definer configured to define at least one analysis window for a first audio signal, wherein the at least one analysis window definer is configured to be dependent on the first audio signal and a pitch estimator configured to determine a first pitch estimate for the first audio signal, wherein the pitch estimator is dependent on the first audio signal sample values within the analysis window.
Type: Application
Filed: May 6, 2011
Publication date: April 24, 2014
Applicant: Nokia Corporation
Inventors: Lasse Juhani Laaksonen, Anssi Sakari Rämö, Adriana Vasilache, Mikko Tapio Tammi
-
Patent number: 8706488
Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.
Type: Grant
Filed: February 27, 2013
Date of Patent: April 22, 2014
Assignee: Nuance Communications, Inc.
Inventors: Michael D. Edgington, Laurence Gillick, Jordan R. Cohen
-
Patent number: 8700390
Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
Type: Grant
Filed: October 7, 2013
Date of Patent: April 15, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Bing Chen, James H. James
-
Patent number: 8694326
Abstract: A communication terminal includes a decoder which decodes an input bitstream received from another communication terminal, to generate an output audio signal and outputs the generated output audio signal to a speaker; and an echo canceller which obtains an input audio signal representing sound captured by a microphone placed in a space to which the speaker outputs the sound, and removes, for respective subbands, an echo component included in the obtained input audio signal and corresponding to the output audio signal, to generate an audio signal for transmission. An encoder codes the audio signal for transmission to generate an output bitstream and transmits the generated output bitstream to another communication terminal; and a control unit controls, for the respective subbands, echo cancellation processing according to a reproduction band of at least one of the output audio signal and the audio signal for transmission.
Type: Grant
Filed: August 21, 2012
Date of Patent: April 8, 2014
Assignee: Panasonic Corporation
Inventors: Shuji Miyasaka, Kosuke Nishio, Ichiro Kawashima
-
Patent number: 8682662
Abstract: In accordance with an example embodiment of the invention, there is provided an apparatus for detecting voice activity in an audio signal. The apparatus comprises a first voice activity detector for making a first voice activity detection decision based at least in part on the voice activity of a first audio signal received from a first microphone. The apparatus also comprises a second voice activity detector for making a second voice activity detection decision based at least in part on an estimate of a direction of the first audio signal and an estimate of a direction of a second audio signal received from a second microphone. The apparatus further comprises a classifier for making a third voice activity detection decision based at least in part on the first and second voice activity detection decisions.
Type: Grant
Filed: August 13, 2012
Date of Patent: March 25, 2014
Assignee: Nokia Corporation
Inventors: Riitta Elina Niemistö, Päivi Marianna Valve
-
Publication number: 20140081629Abstract: The quality of encoded signals can be improved by re-classifying AUDIO signals carrying non-speech data as VOICED signals when periodicity parameters of the signal satisfy one or more criteria. In some embodiments, only low or medium bit rate signals are considered for re-classification. The periodicity parameters can include any characteristic or set of characteristics indicative of periodicity. For example, the periodicity parameters may include pitch differences between subframes in the audio signal, a normalized pitch correlation for one or more subframes, an average normalized pitch correlation for the audio signal, or combinations thereof. Audio signals that are re-classified as VOICED signals may be encoded in the time domain, while audio signals that remain classified as AUDIO signals may be encoded in the frequency domain.Type: ApplicationFiled: September 13, 2013Publication date: March 20, 2014Inventor: Yang Gao
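The re-classification test in this abstract can be illustrated with a minimal sketch. It assumes one pitch lag and one normalized pitch correlation per subframe; the function name, the three thresholds, and the bit rate cutoff are hypothetical values chosen for illustration and are not taken from the application.

```python
from statistics import mean
from typing import Sequence


def reclassify_as_voiced(
    subframe_pitches: Sequence[float],
    subframe_pitch_corrs: Sequence[float],
    bit_rate_bps: int,
    max_pitch_diff: float = 2.0,      # hypothetical threshold
    min_avg_corr: float = 0.75,       # hypothetical threshold
    max_bit_rate_bps: int = 24000,    # hypothetical "low or medium" cutoff
) -> bool:
    """Return True if an AUDIO frame looks periodic enough to treat as VOICED."""
    # Only low or medium bit rate signals are considered.
    if bit_rate_bps > max_bit_rate_bps:
        return False
    # Pitch differences between consecutive subframes should be small.
    diffs = [abs(b - a) for a, b in zip(subframe_pitches, subframe_pitches[1:])]
    stable_pitch = all(d <= max_pitch_diff for d in diffs)
    # The average normalized pitch correlation should be high.
    strongly_periodic = mean(subframe_pitch_corrs) >= min_avg_corr
    return stable_pitch and strongly_periodic
```

A frame that passes would go to a time-domain coder, and any other frame would stay with the frequency-domain coder.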
-
Patent number: 8676572Abstract: A computer-implemented system and method for enhancing audio to individuals participating in a conversation is provided. Audio data for individuals participating in one or more conversations is analyzed. Possible conversational configurations of the individuals are generated based on the audio data, and each possible conversational configuration includes one or more subconfigurations of at least two of the individuals. A probability weight is assigned to each of the subconfigurations and includes a likelihood that the individuals of that subconfiguration are participating in one of the conversations. A probability of each possible conversational configuration is determined by combining the probability weights for the subconfigurations of that possible conversational configuration. The possible conversational configuration with the highest probability is selected as a most probable configuration. The individuals participating in the conversations are determined based on the most probable configuration.Type: GrantFiled: March 14, 2013Date of Patent: March 18, 2014Assignee: Palo Alto Research Center IncorporatedInventors: Paul M. Aoki, Margaret H. Szymanski, James D. Thornton, Daniel H. Wilson, Allison G. Woodruff
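The selection step in this abstract can be sketched as follows. The abstract says only that the subconfiguration weights are combined, so the product used here is an assumption, and the function names and data layout (groups as frozensets of participant names) are hypothetical.

```python
from math import prod
from typing import Dict, FrozenSet, Sequence

Group = FrozenSet[str]            # one subconfiguration: at least two individuals
Configuration = Sequence[Group]   # one possible conversational configuration


def configuration_probability(config: Configuration,
                              weights: Dict[Group, float]) -> float:
    """Combine the subconfiguration weights (assumed here: by multiplying them)."""
    return prod(weights[group] for group in config)


def most_probable_configuration(configs: Sequence[Configuration],
                                weights: Dict[Group, float]) -> Configuration:
    """Pick the configuration whose combined weight is highest."""
    return max(configs, key=lambda c: configuration_probability(c, weights))
```

With four participants, for example, the pairing {A, B} and {C, D} would be preferred over {A, C} and {B, D} whenever the product of its two weights is larger.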
-
Patent number: 8670980Abstract: A tone determination device, which determines the tonality of an input signal, is capable of reducing calculation complexity. In the device, a frequency conversion unit (101) converts the frequency of an input signal; a downsampling unit (102) carries out shortening processing which shortens the vector series length of the frequency-converted signal; a constancy determination unit (107) determines the constancy of the input signal; depending on the constancy of the input signal, a vector selection unit (104) selects either the vector series of the post-frequency conversion signal or the vector series after the shortening of the vector series length; a correlation analysis unit (105) uses the vector series selected by the vector selection unit (104) to obtain correlations; and a tone determination unit (106) uses the correlations to determine the tonality of the input signal.Type: GrantFiled: October 26, 2010Date of Patent: March 11, 2014Assignee: Panasonic CorporationInventor: Kaoru Satoh
-
Patent number: 8665311Abstract: Improved methods and apparatus for sharing and collaborating around a video source by maintaining and providing to users a list of a plurality of contacts containing both video source device contacts and interactive message source contacts. This allows for collaboration among users by permitting them to communicate with each other around an automatically-shared video source, to interact with automatically shared video sources, and to make decisions based on these shared video sources.Type: GrantFiled: February 17, 2011Date of Patent: March 4, 2014Assignee: vBrick Systems, Inc.Inventors: Erik Herz, Douglas Uhl
-
Patent number: 8654761Abstract: In one embodiment, a method can include: (i) establishing an internet protocol (IP) connection; (ii) forming a buffered version of a plurality of voice frame slices from received audio packets; and (iii) when an erasure is detected, performing a packet loss concealment (PLC) to provide a synthesized speech signal for the erasure, where the PLC can include: (a) identifying first and second pitches from the buffered version of the plurality of voice frame slices; and (b) forming the synthesized speech signal by using the first and second pitches, and further pitches if needed, followed by an overlap-add (OLA).Type: GrantFiled: December 17, 2012Date of Patent: February 18, 2014Assignee: Cisco Technology, Inc.Inventors: Duanpei Wu, Luke K. Surazski
-
Patent number: 8655320Abstract: A voice messaging system includes a transceiver, an indicator, a microphone, and a speaker. The transceiver is operable to receive a message from the Internet, and the indicator is operable to announce that the message has been received. The microphone is operable to receive a verbal request to play the message, and the speaker is operable to play the recorded message in response to receiving the verbal request.Type: GrantFiled: April 14, 2009Date of Patent: February 18, 2014Assignee: CA, Inc.Inventors: Christopher J. Stakutis, Thomas M. Boyle, Steven L. Greenspan
-
Publication number: 20140046658Abstract: An audio classifier for frame based audio signal classification includes a feature extractor configured to determine, for each of a predetermined number of consecutive frames, feature measures representing at least the following features: auto correlation, frame signal energy, inter-frame signal energy variation. A feature measure comparator is configured to compare each determined feature measure to at least one corresponding predetermined feature interval. A frame classifier is configured to calculate, for each feature interval, a fraction measure representing the total number of corresponding feature measures that fall within the feature interval, and to classify the latest of the consecutive frames as speech if each fraction measure lies within a corresponding fraction interval, and as non-speech otherwise.Type: ApplicationFiled: April 28, 2011Publication date: February 13, 2014Applicant: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL)Inventors: Volodya Grancharov, Sebastian Näslund
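The fraction-based decision rule in this abstract can be sketched as below. For simplicity the sketch assumes exactly one feature interval and one fraction interval per feature and normalizes the count by the number of frames; the function names, dictionary keys, and all interval values are hypothetical.

```python
from typing import Dict, Sequence, Tuple

Interval = Tuple[float, float]  # inclusive (low, high) bounds


def fraction_in_interval(values: Sequence[float], interval: Interval) -> float:
    """Fraction of the per-frame feature measures that fall inside the interval."""
    low, high = interval
    return sum(low <= v <= high for v in values) / len(values)


def latest_frame_is_speech(
    features: Dict[str, Sequence[float]],      # feature name -> one value per frame
    feature_intervals: Dict[str, Interval],
    fraction_intervals: Dict[str, Interval],
) -> bool:
    """Speech only if every fraction measure lies within its fraction interval."""
    for name, values in features.items():
        fraction = fraction_in_interval(values, feature_intervals[name])
        low, high = fraction_intervals[name]
        if not (low <= fraction <= high):
            return False
    return True
```

Here `features` would hold the autocorrelation, frame energy, and inter-frame energy variation measured over the most recent consecutive frames, with the last entry in each list belonging to the frame being classified.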
-
Patent number: 8649283Abstract: A method executed by a packet analysis apparatus for analyzing packets including voice packets and non-voice packets includes: capturing packets in a specific session; storing the captured packets in a storage; screening the stored packets to count up a receipt count of voice packets; determining whether packet loss has occurred in the specific session; and determining whether loss packets are voice packets in accordance with received packets adjacent to the loss packets to count up a loss count of voice packets when the packet loss has occurred.Type: GrantFiled: June 14, 2010Date of Patent: February 11, 2014Assignee: Fujitsu LimitedInventors: Sumiyo Okada, Noriyuki Fukuyama, Masanobu Morinaga, Hideaki Miyazaki
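The counting steps in this abstract can be illustrated with a short sketch. The abstract does not state the exact rule for judging lost packets from their neighbours, so the sketch assumes a gap counts as lost voice only when the received packets on both sides of it are voice packets. The `Packet` type and function name are hypothetical, and sequence number wraparound is ignored.

```python
from typing import List, NamedTuple, Tuple


class Packet(NamedTuple):
    seq: int        # sequence number within the session
    is_voice: bool  # result of screening the stored packet


def count_voice_packets(received: List[Packet]) -> Tuple[int, int]:
    """Return (received voice packets, estimated lost voice packets)."""
    ordered = sorted(received, key=lambda p: p.seq)
    received_voice = sum(p.is_voice for p in ordered)
    lost_voice = 0
    for prev, nxt in zip(ordered, ordered[1:]):
        missing = nxt.seq - prev.seq - 1  # packets lost between two neighbours
        # Assumed rule: a gap bounded by voice packets was itself voice.
        if missing > 0 and prev.is_voice and nxt.is_voice:
            lost_voice += missing
    return received_voice, lost_voice
```

A loss ratio for voice traffic could then be derived from the two counts.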
-
Publication number: 20140006017Abstract: Arrangements are described that may be used to reduce the intelligibility of speech using masker signals which are obfuscated yet correlated versions of the speech. Other applications of pitch analysis and demodulation are also described. A system may be used to drive an array of loudspeakers to produce a sound field that includes a source component, whose energy is concentrated along a first direction relative to the array, and a masking component that is based on an estimated intensity of the source component in a second direction that is different from the first direction.Type: ApplicationFiled: February 28, 2013Publication date: January 2, 2014Applicant: QUALCOMM IncorporatedInventor: Dipanjan Sen
-
Publication number: 20140006018Abstract: In a voice processing apparatus, a processor is configured to adjust a fundamental frequency of a first voice signal corresponding to a voice having target voice characteristics to a fundamental frequency of a second voice signal corresponding to a voice having initial voice characteristics different from the target voice characteristics. The processor is further configured to sequentially generate a processed spectrum based on a spectrum of the first voice signal and a spectrum of the second voice signal by: dividing the spectrum of the first voice signal into a plurality of harmonic band components after the fundamental frequency of the first voice signal has been adjusted; allocating each harmonic band component of the first voice signal to each harmonic frequency associated with the fundamental frequency of the second voice signal; and adjusting an envelope and phase of each harmonic band component according to the spectrum of the second voice signal.Type: ApplicationFiled: June 20, 2013Publication date: January 2, 2014Applicant: Yamaha CorporationInventors: Jordi BONADA, Merlijn BLAAUW, Yuji HISAMINATO