Detection Of Presence Or Absence Of Speech Signals (EPO) Patents (Class 704/E11.003)
  • Patent number: 11972752
    Abstract: Disclosed is a method for detecting a speech segment, which is performed by a computing device. The method may include: detecting a start point of a speech segment in an audio signal; and detecting an end point of the speech segment based on an offset threshold which is dynamically changed, and the dynamically changed offset threshold may be based on a length of the speech segment.
    Type: Grant
    Filed: November 2, 2022
    Date of Patent: April 30, 2024
    Assignee: ActionPower Corp.
    Inventor: Dongchan Shin
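
The abstract above gives no formula for the dynamically changed offset threshold. The following is a minimal sketch under stated assumptions: the offset threshold is read as the number of consecutive non-speech frames needed to close a segment, and it is assumed to shrink as the segment grows. The function name, the parameters, and the shrinking rule are all hypothetical and are not taken from the patent.

```python
def detect_segment(is_speech, base_offset=8, min_offset=3, shrink_every=10):
    """Return (start, end) frame indices of the first speech segment, or None.

    is_speech: one boolean per frame, produced by any frame-level detector.
    base_offset, min_offset, shrink_every: hypothetical tuning parameters.
    """
    start = None
    silence_run = 0
    for i, flag in enumerate(is_speech):
        if start is None:
            if flag:
                start = i  # start point: first frame flagged as speech
            continue
        if flag:
            silence_run = 0  # speech resumed, so the segment continues
            continue
        silence_run += 1
        # Segment length so far, not counting the trailing silence.
        length = (i - silence_run) - start + 1
        # Assumed rule: longer segments need less trailing silence to end.
        offset = max(min_offset, base_offset - length // shrink_every)
        if silence_run >= offset:
            return start, i - silence_run  # index of the last speech frame
    if start is None:
        return None
    return start, len(is_speech) - 1 - silence_run
```

With these assumed defaults, a 4-frame pause does not close a 5-frame segment (8 silent frames would be required), but it does close a 60-frame segment (only 3 are required).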
  • Patent number: 11756563
    Abstract: This disclosure describes, in part, techniques for performing multi-path calculations for energy levels on an electronic device. For instance, the electronic device may include a first circuit and a second circuit, where the first circuit uses less power than the second circuit. As such, when operating in a standby mode, the electronic device may use the first circuit to calculate energy levels at the electronic device, such as speech-energy values and ambient-energy values. Additionally, while operating in an active mode, the electronic device may activate the second circuit and then use the second circuit to calculate the energy levels at the electronic device. The first circuit and the second circuit can send/receive current energy levels between one another so that the electronic device can continually calculate the energy levels even when the electronic device switches between modes of operation.
    Type: Grant
    Filed: June 3, 2019
    Date of Patent: September 12, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Bhupal Kanaiyalal Dharia, Dibyendu Nandy, Marko Bundalo, Hannan Ma
  • Patent number: 11615786
    Abstract: A system to convert phonemes into phonetics-based words that is implemented in one or more computing systems, in association with a system that provides required inputs is disclosed. Said system comprises a phoneme enhancer, a phoneme sequence buffer, a phoneme sequence to phonetics-based word converter that comprises a sliding window phoneme sequence matcher, a phoneme sequence to phonetics-based word custom data memory, a most frequent phonetics-based word predictive memory, a phoneme similarity matrix, and a phonetics-based word output unit.
    Type: Grant
    Filed: March 5, 2020
    Date of Patent: March 28, 2023
    Inventors: Baljit Singh, Praveen Prakash
  • Patent number: 11373637
    Abstract: A processing system operates in a first power domain and includes a first memory, a memory access circuit, and a first processing circuit. The first memory stores sound data detected by a microphone. The memory access circuit transfers the sound data to a second memory according to a first command, in order to store the sound data as voice data. The first processing circuit outputs a second command according to a human voice detection signal. The second command is for enabling a second processing circuit, in order to determine whether the voice data in the second memory matches a predetermined voice command. One of the first and the second processing circuits outputs the first command. The second processing circuit operates in a second power domain. A power consumption to which the first power domain corresponds is lower than a power consumption to which the second power domain corresponds.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: June 28, 2022
    Assignee: REALTEK SEMICONDUCTOR CORPORATION
    Inventor: Ching-Lung Chen
  • Patent number: 9489933
    Abstract: A resonance strength table is prepared, which stores a relation between a pitch difference and a resonance strength, wherein the pitch difference is a difference between a pitch assigned to the key number of a pressed key and a pitch assigned to each of key numbers of a resonance tone. When a key is pressed, the resonance strength table is referred to, and resonance strengths concerning the key numbers of a resonance tone are determined. Then, note-on events of a resonance tone are produced based on the key numbers and the decided resonance strengths and the produced note-on events are sent to a sound source.
    Type: Grant
    Filed: February 19, 2016
    Date of Patent: November 8, 2016
    Assignee: CASIO COMPUTER CO., LTD.
    Inventor: Naoaki Itoh
  • Patent number: 8639519
    Abstract: In a selective signal encoder, an input signal is first encoded using a core layer encoder to produce a core layer encoded signal. The core layer encoded signal is decoded to produce a reconstructed signal and an error signal is generated as the difference between the reconstructed signal and the input signal. The reconstructed signal is compared to the input signal. One of two or more enhancement layer encoders is selected dependent upon the comparison and used to encode the error signal. The core layer encoded signal, the enhancement layer encoded signal and the selection indicator are output to the channel (for transmission or storage, for example).
    Type: Grant
    Filed: April 9, 2008
    Date of Patent: January 28, 2014
    Assignee: Motorola Mobility LLC
    Inventors: James P. Ashley, Jonathan A. Gibbs, Udar Mittal
  • Patent number: 8482410
    Abstract: Method of detecting the operation of a device for transmitting voice signals between two equipment items so that at least one of the equipment items can send a voice signal and the other equipment item can receive this voice signal, the equipment items being linked by a wireless transmission chain, the method comprising, during a voice signal transmission phase, the steps of: detecting on the sending equipment item a presence of a voice signal on transmission, and generating in response a voice presence on transmission signal; detecting on the receiving equipment item a presence or an absence of a voice signal on reception, and generating in response a signal indicating voice on reception; transmitting the signal indicating voice on reception from the receiving equipment item to the sending equipment item; comparing in the sending equipment item the voice presence on transmission signal and the signal indicating voice on reception, and triggering an alarm if the compared signals are not consistent.
    Type: Grant
    Filed: February 15, 2011
    Date of Patent: July 9, 2013
    Inventor: Dominique Retali
  • Patent number: 8380494
    Abstract: The method and system disclosed herein reduces total bandwidth requirement for communication in a voice over Internet protocol application. Sample [101] and convert [102] the analog input audio signal into digital signals and derive sampled frames [103]. Compute spacings of order statistics [104]. Measure the entropy for each of the sampled frames [105]. Set a threshold for entropy [106]. Mark the audio frames as active speech frames or inactive speech frames [107]. Mark an audio frame as an inactive speech frame when the entropy is greater than the threshold, and mark the audio frame as an active speech frame when the entropy is less than the threshold [107]. Transmit only the active speech frames [108].
    Type: Grant
    Filed: January 24, 2007
    Date of Patent: February 19, 2013
    Assignee: P.E.S. Institute of Technology
    Inventors: Muralishankar Rangarao, Vijay Satyanarayana Rao, Venkatesha Prasad Rangarao, Shankar Hebbale Narasimhiah
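
The entropy test in the abstract above can be sketched as follows. The abstract does not name an estimator, so this sketch assumes Vasicek's spacing-based entropy estimate, computed on a frame normalized to zero mean and unit variance. Under that assumption, Gaussian-like background noise scores high and peaky, speech-like frames score lower. The normalization, the window `m`, the threshold of 1.0, and the treatment of constant frames are all assumptions made for illustration.

```python
import math
import random

def spacing_entropy(frame, m=None):
    """Vasicek-style entropy estimate from spacings of order statistics.

    Assumes the frame is not constant (variance greater than zero).
    """
    n = len(frame)
    mean = sum(frame) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in frame) / n)
    xs = sorted((x - mean) / std for x in frame)  # order statistics
    if m is None:
        m = max(1, round(math.sqrt(n)))  # common heuristic window size
    total = 0.0
    for i in range(n):
        lo = xs[max(i - m, 0)]        # clamp indices at the boundaries
        hi = xs[min(i + m, n - 1)]
        total += math.log(n / (2 * m) * (hi - lo) + 1e-12)  # guard log(0)
    return total / n

def is_active(frame, threshold=1.0):
    """Mark a frame active when its entropy is below the threshold."""
    if max(frame) == min(frame):
        return False  # constant frame (digital silence): treat as inactive
    return spacing_entropy(frame) < threshold

random.seed(0)
# Synthetic stand-ins: stationary Gaussian noise vs. a bursty, peaky signal.
noise = [random.gauss(0, 1) for _ in range(400)]
bursty = [random.gauss(0, 1.0 if i % 10 == 0 else 0.01) for i in range(400)]
frames_to_send = [f for f in (noise, bursty) if is_active(f)]
```

Only the frames that pass `is_active` would be transmitted, which is where the bandwidth saving described in the abstract comes from.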
  • Publication number: 20130013318
    Abstract: As part of a communication session, a wireless source device can transmit audio and video data to a wireless sink device, and the wireless sink device can transmit user input data received at the wireless sink device back to the wireless source device. In this manner, a user of the wireless sink device can control the wireless source device and control the content that is being transmitted from the wireless source device to the wireless sink device. The input data received at the wireless sink device can be a voice command.
    Type: Application
    Filed: January 5, 2012
    Publication date: January 10, 2013
    Applicant: QUALCOMM Incorporated
    Inventors: Xiaolong Huang, Vijayalakshmi R. Raveendran, Xiaodong Wang, Fawad Shaukat
  • Publication number: 20120179459
    Abstract: A method of pre-processing an audio signal transmitted to a user terminal via a communication network and an apparatus using the method are provided. The method of pre-processing the audio signal may prevent deterioration of a sound quality of the audio signal transmitted to the user terminal by pre-processing the audio signal, and by enabling a codec module, encoding the audio signal, to determine the audio signal as a speech signal. Also, the method of pre-processing the audio signal may improve a probability that the codec module may determine a corresponding audio signal as a speech when the audio signal is transmitted via the communication network by pre-processing the audio signal using a speech codec.
    Type: Application
    Filed: March 21, 2012
    Publication date: July 12, 2012
    Applicant: REALNETWORKS, INC.
    Inventors: Jae Woong Jeong, Seop Hyeong Park, Jong Kyu Ryu
  • Patent number: 8199928
    Abstract: An apparatus processes an acoustic input signal to provide an output signal with reduced noise. The apparatus weights the input signal based on a frequency-dependent weighting function. A frequency-dependent threshold function bounds the weighting function from below.
    Type: Grant
    Filed: May 9, 2008
    Date of Patent: June 12, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Gerhard Uwe Schmidt, Raymond Brückner, Markus Buck, Ange Tchinda-Pockem, Mohamed Krini
  • Publication number: 20120065966
    Abstract: A voice activity detection method and apparatus, and an electronic device are provided. The method includes: obtaining a time domain parameter and a frequency domain parameter from an audio frame; obtaining a first distance between the time domain parameter and a long-term slip mean of the time domain parameter in a history background noise frame, and obtaining a second distance between the frequency domain parameter and a long-term slip mean of the frequency domain parameter in the history background noise frame; and judging whether the audio frame is a foreground voice frame or a background noise frame according to the first distance, the second distance and a set of decision inequalities based on the first distance and the second distance. The above technical solutions enable the judgment criterion to have an adaptive adjustment capability, thus improving the performance of the voice activity detection.
    Type: Application
    Filed: November 30, 2011
    Publication date: March 15, 2012
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventor: Zhe Wang
  • Publication number: 20110246185
    Abstract: A frame extracting means 71 extracts frames from sample data as voice data in which whether each frame is an active voice frame or a non-active voice frame is already known. A feature quantity calculating means 72 calculates multiple feature quantities of each of the frames. A feature quantity integrating means 73 calculates an integrated feature quantity of the multiple feature quantities. A judgment means 74 judges whether each of the frames is an active voice frame or a non-active voice frame. An erroneous feature quantity calculation value calculating means 75 obtains a first erroneous feature quantity calculation value and a second erroneous feature quantity calculation value by executing prescribed calculations. A weight updating means 76 updates weights used for weighting so that the rate between the first erroneous feature quantity calculation value and the second erroneous feature quantity calculation value approaches a prescribed value.
    Type: Application
    Filed: December 7, 2009
    Publication date: October 6, 2011
    Applicant: NEC CORPORATION
    Inventors: Takayuki Arakawa, Masanori Tsujikawa
  • Publication number: 20110208514
    Abstract: A data embedding device for embedding data in a speech code obtained by encoding a speech in accordance with a speech encoding method based on a voice generation process of a human being, includes an embedding judgment unit, every speech code, judging whether or not data should be embedded in the speech code, and an embedding unit embedding data in two or more parameter codes of a plurality of parameter codes constituting the speech code for which it is judged by the embedding judgment unit that the data should be embedded.
    Type: Application
    Filed: May 3, 2011
    Publication date: August 25, 2011
    Applicant: FUJITSU LIMITED
    Inventors: Yoshiteru Tsuchinaga, Yasuji Ota, Masanao Suzuki, Masakiyo Tanaka, Joe Mizuno
  • Patent number: 8005672
    Abstract: An audio processing system includes a speech detector that receives and processes an audio input signal to determine if the input signal includes components indicative of speech, and provides a control signal indicative of whether or not the audio input signal includes speech. A speech processing device receives the audio input signal and processes the audio input signal to improve its quality if the control signal indicates that the audio input signal includes speech.
    Type: Grant
    Filed: October 11, 2005
    Date of Patent: August 23, 2011
    Assignee: Trident Microsystems (Far East) Ltd.
    Inventors: Matthias Vierthaler, Florian Pfister, Dieter Luecking, Stefan Mueller
  • Patent number: 7966179
    Abstract: A method and apparatus for distinguishing a voice region from a non-voice region in an environment where various types of noise and voice are mixed together are provided. The method includes the steps of converting an input voice signal into a frequency domain signal by preprocessing the input voice signal, performing sigmoid compression on the converted signal, transforming a spectrum vector generated by the sigmoid compression into a voice detection parameter in scalar form, and detecting the voice region using the parameter.
    Type: Grant
    Filed: January 27, 2006
    Date of Patent: June 21, 2011
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Kwang-cheol Oh, Ki-young Park
  • Publication number: 20110103370
    Abstract: A hung call system includes a memory storing samples of voice data from packets for a VoIP call. A voice activity detector detects whether the stored voice data includes a voice from one or more parties to the call. A processing circuit determines whether the voice activity detector detects the voice, and the processing circuit facilitates release of the call if the voice activity detector does not detect the voice for a predetermined period of time.
    Type: Application
    Filed: October 29, 2009
    Publication date: May 5, 2011
    Applicant: General Instruments Corporation
    Inventors: Jacob Igval, Christopher J. Cotignola
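
The release rule in the abstract above reduces to a silence timer driven by per-frame voice activity decisions. This is a minimal sketch: the 20 ms frame length and the 30 s timeout are assumed values, and the function name is hypothetical.

```python
def should_release(vad_flags, frame_ms=20, timeout_ms=30000):
    """Return True once no voice has been detected for timeout_ms.

    vad_flags: one boolean per frame from a voice activity detector.
    frame_ms, timeout_ms: assumed values, not taken from the patent.
    """
    silent_ms = 0
    for voiced in vad_flags:
        # Any voiced frame resets the timer; silent frames accumulate.
        silent_ms = 0 if voiced else silent_ms + frame_ms
        if silent_ms >= timeout_ms:
            return True
    return False
```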
  • Publication number: 20110004468
    Abstract: A hearing aid for improving diminished hearing caused by reduced temporal resolution includes: a speech input unit (201) which receives a speech signal from outside; a speech analysis unit (202) which detects a sound segment and a segment acoustically regarded as soundless from the speech signal received by the speech input unit and detects a consonant segment and a vowel segment within the detected sound segment; and a signal processing unit (204) which temporally increments the consonant segment detected by the speech analysis unit (202) and temporally decrements at least one of the vowel segment and the segment acoustically regarded as soundless detected by the speech analysis unit (202).
    Type: Application
    Filed: January 28, 2010
    Publication date: January 6, 2011
    Inventors: Kazue Fusakawa, Gempo Ito
  • Publication number: 20100280824
    Abstract: Systems and methods to reduce the negative impact of wind on an electronic system include use of a first detector that receives a first signal and a second detector that receives a second signal. A voice activity detector (VAD) coupled to the first detector generates a VAD signal when the first signal corresponds to voiced speech. A wind detector coupled to the second detector correlates signals received at the second detector and derives from the correlation wind metrics that characterize wind noise that is acoustic disturbance corresponding to at least one of air flow and air pressure in the second detector. The wind detector controls a configuration of the second detector according to the wind metrics. The wind detector uses the wind metrics to dynamically control mixing of the first signal and the second signal to generate an output signal for transmission.
    Type: Application
    Filed: May 3, 2010
    Publication date: November 4, 2010
    Inventors: Nicolas Petit, Gregory Burnett, Michael Goertz
  • Publication number: 20100268532
    Abstract: A system for voice detection includes a feature value calculation unit that calculates a feature value from an input signal sliced on a per frame basis, a provisional voice/non-voice decision unit that provisionally decides a voiced interval and a non-voiced interval from the feature value calculated on a per frame basis, and a voice/non-voice decision unit that determines a voiced interval duration threshold value or a non-voiced interval duration threshold value, using a ratio of the feature value found on a per frame basis to a threshold value for the feature value and that re-decides the voiced interval and the non-voiced interval, using the voiced interval duration threshold value determined and the non-voiced interval duration threshold value determined.
    Type: Application
    Filed: November 26, 2008
    Publication date: October 21, 2010
    Inventors: Takayuki Arakawa, Masanori Tsujikawa
  • Patent number: 7809554
    Abstract: An apparatus, method, and medium for detecting a voiced sound and an unvoiced sound. The apparatus includes a blocking unit for dividing an input signal into block units; a parameter calculator for calculating a first parameter to determine the voiced sound and a second parameter to determine the unvoiced sound by using a slope and spectral flatness measure (SFM) of a mel-scaled filter bank spectrum of an input signal existing in a block; and a determiner for determining a voiced sound zone and an unvoiced sound zone in the block by comparing the first and second parameters to predetermined threshold values.
    Type: Grant
    Filed: February 7, 2005
    Date of Patent: October 5, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Kwangcheol Oh
  • Publication number: 20100217584
    Abstract: A speech analysis device which accurately analyzes an aperiodic component included in speech in a practical environment where there is background noise includes: a frequency band division unit which divides, into bandpass signals each associated with a corresponding one of frequency bands, an input signal representing a mixed sound of background noise and speech; a noise interval identification unit which identifies a noise interval and a speech interval of the input signal; an SNR calculation unit which calculates an SN ratio; a correlation function calculation unit which calculates an autocorrelation function of each bandpass signal; a correction amount determination unit which determines a correction amount for an aperiodic component ratio, based on the calculated SN ratio; and an aperiodic component ratio calculation unit which calculates, for each frequency band, an aperiodic component ratio of the aperiodic component, based on the determined correction amount and the calculated autocorrelation function.
    Type: Application
    Filed: May 4, 2010
    Publication date: August 26, 2010
    Inventors: Yoshifumi Hirose, Takahiro Kamai
  • Publication number: 20100174535
    Abstract: A method of filtering a speech signal for speech encoding in a communications network, includes determining a cut off frequency for a filter, wherein a component of the speech signal in a frequency range less than the cut off frequency is to be attenuated by the filter; receiving the speech signal at the filter; determining at least one parameter of the received speech signal, the at least one parameter providing an indication of the energy of the component of the received speech signal that is to be attenuated; and adjusting the cut off frequency in dependence on the at least one parameter, thereby adjusting the frequency range to be attenuated.
    Type: Application
    Filed: June 19, 2009
    Publication date: July 8, 2010
    Applicant: Skype Limited
    Inventors: Koen Bernard Vos, Stefan Strômmer
  • Publication number: 20100145689
    Abstract: An audio signal is received that might include keyboard noise and speech. The audio signal is digitized and transformed from a time domain to a frequency domain. The transformed audio is analyzed to determine whether there is likelihood that keystroke noise is present. If it is determined there is high likelihood that the audio signal contains keystroke noise, a determination is made as to whether a keyboard event occurred around the time of the likely keystroke noise. If it is determined that a keyboard event occurred around the time of the likely keystroke noise, a determination is made as to whether speech is present in the audio signal around the time of the likely keystroke noise. If no speech is present, the keystroke noise is suppressed in the audio signal. If speech is detected in the audio signal or if the keystroke noise abates, the suppression gain is removed from the audio signal.
    Type: Application
    Filed: December 5, 2008
    Publication date: June 10, 2010
    Applicant: Microsoft Corporation
    Inventors: Qin Li, Michael Lewis Seltzer, Chao He
  • Publication number: 20100106491
    Abstract: The present invention is a system and method that improves upon voice activity detection by packetizing actual noise signals, typically background noise. In accordance with the present invention an access network receives an input voice signal (including noise) and converts the input voice signal into a packetized voice signal. The packetized voice signal is transmitted via a network to an egress network. The egress network receives the packetized voice signal, converts the packetized voice signal into an output voice signal, and outputs the output voice signal. The egress network also extracts and stores noise packets from the received packetized voice signal and converts the packetized noise signal into an output noise signal. When the access network ceases to receive the input voice signal while the call is still ongoing, the access network instructs the egress network to continually output the output noise signal.
    Type: Application
    Filed: December 28, 2009
    Publication date: April 29, 2010
    Applicant: AT&T Corp.
    Inventors: James H. James, Joshua Hal Rosenbluth
  • Publication number: 20100098064
    Abstract: A method and apparatus for dynamically enabling the activation and deactivation of comfort noise over a VoIP media path or channel are disclosed. The present method detects all sound levels in the media path and only activates the comfort noise in the absence of sound and when the background noise level or the telephone line noise level is low rather than only in the absence of speech.
    Type: Application
    Filed: December 26, 2009
    Publication date: April 22, 2010
    Inventors: Marian Croak, Hossein Eslambolchi
  • Publication number: 20100070266
    Abstract: Systems and methods for generating performance metrics to monitor and/or enhance the performance of telephone-intensive personnel are disclosed. The method generally includes detecting voice activity on a receive and/or a transmit channel in a communications system, outputting voicing decision outputs based on the detecting, storing the voicing decision outputs over a period of time to memory, and generating voice activity performance metrics based on the voicing decision output stored in the memory. The generating may include generating a running average ratio of duration of voice activity on the transmit channel to duration of voice activity on the receive channel (talk-listen ratio) over a certain period of time for one or more agents. The talk-listen ratio may be compared to a target ratio.
    Type: Application
    Filed: September 26, 2003
    Publication date: March 18, 2010
    Applicant: Plantronics, Inc., A DELAWARE CORPORATION
    Inventors: Iain J. McNeill, J. Stephen Graham
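
The talk-listen ratio described above can be computed directly from stored voicing decisions. This sketch assumes one boolean decision per fixed-length frame on each channel and computes a single ratio over the whole stored period rather than a running average. The function name and the handling of an all-silent receive channel are assumptions.

```python
def talk_listen_ratio(tx_flags, rx_flags):
    """Ratio of voiced frames on the transmit channel to the receive channel.

    tx_flags, rx_flags: stored per-frame voicing decisions (booleans).
    """
    talk = sum(1 for f in tx_flags if f)
    listen = sum(1 for f in rx_flags if f)
    if listen == 0:
        # Assumed convention when the far end never spoke.
        return float("inf") if talk else 0.0
    return talk / listen

# Example: the agent talks in 30 of 100 frames, the caller in 60 of 100.
ratio = talk_listen_ratio([True] * 30 + [False] * 70,
                          [True] * 60 + [False] * 40)
meets_target = ratio <= 1.0  # hypothetical target ratio
```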
  • Publication number: 20090125301
    Abstract: The technology disclosed relates to audio signal processing. It includes a series of modules that individually are useful to solve audio signal processing problems. Among the problems addressed are buzz removal, selecting a pitch candidate among pitch candidates based on local continuity of pitch and regional octave consistency, making small adjustments in pitch, ensuring that a selected pitch is consistent with harmonic peaks, determining whether a given frame or region of frames includes harmonic, voiced signal, extracting harmonics from voice signals and detecting vibrato. One environment in which these modules are useful is transcribing singing or humming into a symbolic melody. Another environment that would usefully employ some of these modules is speech processing. Some of the modules, such as buzz removal, are useful in many other environments as well.
    Type: Application
    Filed: November 3, 2008
    Publication date: May 14, 2009
    Applicant: Melodis Inc.
    Inventors: Aaron Master, Seyed Majid Emami
  • Publication number: 20090089053
    Abstract: Voice activity detection using multiple microphones can be based on a relationship between an energy at each of a speech reference microphone and a noise reference microphone. The energy output from each of the speech reference microphone and the noise reference microphone can be determined. A speech to noise energy ratio can be determined and compared to a predetermined voice activity threshold. In another embodiment, the absolute values of the autocorrelations of the speech and noise reference signals are determined and a ratio based on the autocorrelation values is determined. Ratios that exceed the predetermined threshold can indicate the presence of a voice signal. The speech and noise energies or autocorrelations can be determined using a weighted average or over a discrete frame size.
    Type: Application
    Filed: September 28, 2007
    Publication date: April 2, 2009
    Applicant: QUALCOMM INCORPORATED
    Inventors: Song Wang, Samir Kumar Gupta, Eddie L. T. Choy
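
The energy-ratio test in the abstract above can be sketched for a single frame pair. This sketch uses mean squared amplitude over a discrete frame as the energy measure. The threshold of 4.0 is an assumed value, and both function names are hypothetical.

```python
def frame_energy(samples):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in samples) / len(samples)

def voice_active(speech_frame, noise_frame, threshold=4.0):
    """True when the speech-to-noise energy ratio exceeds the threshold.

    speech_frame: samples from the speech reference microphone.
    noise_frame: samples from the noise reference microphone.
    """
    e_speech = frame_energy(speech_frame)
    e_noise = frame_energy(noise_frame)
    # Compare by multiplication so a silent noise frame cannot divide by zero.
    return e_speech > threshold * e_noise
```

For example, `voice_active([0.5, -0.5, 0.5, -0.5], [0.1, -0.1, 0.1, -0.1])` is true because the energy ratio is about 25, while two frames of equal energy give false.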
  • Publication number: 20090055170
    Abstract: A sound source signal from a target sound source is allowed to be separated from a mixed sound which consists of sound source signals emitted from a plurality of sound sources without being affected by uneven sensitivity of microphone elements. A beamformer section 3 of a source separation device 1 performs beamforming processing for attenuating sound source signals arriving from directions symmetrical with respect to a perpendicular line to a straight line connecting two microphones 10 and 11 respectively by multiplying output signals from the microphones 10 and 11 after spectrum analysis by weighted coefficients which are complex conjugate to each other. Power computation sections 40 and 41 compute power spectrum information, and target sound spectrum extraction sections 50 and 51 extract spectrum information of a target sound source based on a difference between the power spectrum information.
    Type: Application
    Filed: August 11, 2006
    Publication date: February 26, 2009
    Inventor: Katsumasa Nagahama
  • Publication number: 20090030690
    Abstract: A speech analysis apparatus analyzing prosodic characteristics of speech information and outputting a prosodic discrimination result includes an input unit inputting speech information, an acoustic analysis unit calculating relative pitch variation and a discrimination unit performing speech discrimination processing, in which the acoustic analysis unit calculates a current template relative pitch difference, determining whether a difference absolute value between the current template relative pitch difference and a previous template relative pitch difference is equal to or less than a predetermined threshold or not, when the value is not less than the threshold, calculating an adjacent relative pitch difference, and when the adjacent relative pitch difference is equal to or less than a previously set margin value, executing correction processing of adding or subtracting an octave of the current template relative pitch difference to calculate the relative pitch variation by applying the relative pitch differe
    Type: Application
    Filed: July 21, 2008
    Publication date: January 29, 2009
    Inventor: Keiichi Yamada
  • Publication number: 20080306735
    Abstract: Included are systems and methods for indicating presence of data. At least one embodiment of a method includes receiving communications data associated with a communication session and determining at least one point of audio silence in the communications session. Some embodiments include creating tagging data configured to indicate the at least one point of audio silence in the communications session.
    Type: Application
    Filed: March 31, 2008
    Publication date: December 11, 2008
    Inventors: Kenneth Richard Brodhagen, Mark Alan Goodall, Damian Smith, Jamie Richard Williams
  • Publication number: 20080215325
    Abstract: An apparatus, method and program for dividing a conversational dialog into utterances. The apparatus includes a computer processor; a word database for storing spellings and pronunciations of words; a grammar database for storing syntactic rules on words; a pause detecting section which detects a pause location in a channel making a main speech among conversational dialogs inputted in at least two channels; an acknowledgement detecting section which detects an acknowledgement location in a channel not making the main speech; a boundary-candidate extracting section which extracts boundary candidates in the main speech, by extracting pauses existing within a predetermined range before and after a base point that is the acknowledgement location; and a recognizing unit which outputs a word string of the main speech segmented by one of the extracted boundary candidates after dividing the segmented speech into optimal utterances in reference to the word database and grammar database.
    Type: Application
    Filed: December 27, 2007
    Publication date: September 4, 2008
    Inventors: Hiroshi Horii, Hideki Tai, Gaku Yamamoto
  • Publication number: 20080167863
    Abstract: The present invention relates to an apparatus and method of improving intelligibility of a voice signal. A method of improving intelligibility of a voice signal according to an embodiment of the present invention includes analyzing a background noise signal on a call receiving side, classifying a received voice signal into a silence signal, an unvoiced sound signal, and a voiced sound signal, and intensifying the classified unvoiced sound signal and voiced sound signal on the basis of the analyzed background noise signal on the call receiving side.
    Type: Application
    Filed: November 16, 2007
    Publication date: July 10, 2008
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Chang-kyu Choi, Kwang-il Hwang, Sun-gi Hong, Young-hun Sung, Yeun-bae Kim, Yong Kim, Sang-hoon Lee, Hong Jeong
  • Publication number: 20080091422
    Abstract: A speech recognition method includes inputting an audio signal including a speech signal and a non-speech signal, discriminating a signal mode of the audio signal, processing the audio signal according to a discrimination result of the discriminating to separate substantially the speech signal from the audio signal, and subjecting the separated speech signal to speech recognition.
    Type: Application
    Filed: December 6, 2007
    Publication date: April 17, 2008
    Inventors: Koichi Yamamoto, Yasuyuki Masai, Makoto Yajima, Kohei Momosaki, Kazuhiko Abe, Munehiko Sasajima
  • Publication number: 20080086308
    Abstract: An audio conversation apparatus (1) comprises: an assignment section (15) for individually assigning units of spatial information which are different from each other, either to parties-to-talk-with each belonging to one of a plurality of predetermined groups, respectively, or to the plurality of predetermined groups, respectively; and a localization section (16) for localizing, in accordance with the units of spatial information assigned by the assignment section (15), audio data transmitted from outside, and one of a reproduction section connected to the audio conversation apparatus and a reproduction section (17) included in the audio conversation apparatus outputs audio in accordance with the audio data having been localized by the localization section (16).
    Type: Application
    Filed: November 22, 2005
    Publication date: April 10, 2008
    Inventors: Tsuyoshi Kindo, Noboru Katta, Takashi Akita
  • Publication number: 20080052079
    Abstract: An electronic appliance includes a speaker which outputs a first sound wave based on a first voice signal generated from the electronic appliance, and a microphone to detect a second sound wave on which a sound wave generated for control of the electronic appliance is superimposed to output a second voice signal. A first waveform generator generates a first waveform signal based on the first voice signal, and a second waveform generator generates a second waveform signal based on the second voice signal. A waveform shaping unit outputs a third waveform signal in which the first waveform signal is enlarged in a time axis direction, and a subtracter subtracts the third waveform signal from the second waveform signal.
    Type: Application
    Filed: August 24, 2007
    Publication date: February 28, 2008
    Applicant: VICTOR COMPANY OF JAPAN, LIMITED
    Inventors: Hirokazu Ohguri, Masahiro Kitaura
  • Publication number: 20080046233
    Abstract: A technique for concealing the effect of a lost frame in a series of frames representing an encoded audio signal in a sub-band predictive coding system is provided. In accordance with the technique, one or more received frames in the series of frames are decoded to generate a full-band output audio signal, wherein the full-band output audio signal comprises a combination of at least a first sub-band decoded audio signal and a second sub-band decoded audio signal. The full-band output audio signal corresponding to the one or more received frames is stored. Then, a full-band output audio signal corresponding to the lost frame is synthesized, wherein synthesizing the full-band output audio signal corresponding to the lost frame comprises performing waveform extrapolation based on the stored full-band output audio signal corresponding to the one or more received frames.
    Type: Application
    Filed: August 15, 2007
    Publication date: February 21, 2008
    Applicant: BROADCOM CORPORATION
    Inventors: Juin-Hwey Chen, Jes Thyssen, Robert W. Zopf
  • Publication number: 20080040109
    Abstract: A Yule-Walker based, low-complexity voice activity detector (VAD) is disclosed. An input signal is typically noisy speech (i.e., corrupted with, for example, babble noise). In one embodiment, a first initialization stage of the VAD computes an occurrence of a silent period within the input signal and the AR parameters. The VAD could accordingly compute a tentative adaptive threshold and output hypothesis H1 (which means speech is present) during this stage. During the second initialization stage, the VAD generally builds a database of associated values and computes the adaptive threshold accordingly. The second initialization stage could also output tentative VAD decisions based on the tentative threshold computed in the first initialization stage. Finally, the VAD periodically retrains or updates AR parameters, threshold values and/or the database and outputs VAD decisions accordingly.
    Type: Application
    Filed: August 3, 2007
    Publication date: February 14, 2008
    Applicant: STMICROELECTRONICS ASIA PACIFIC PTE LTD
    Inventors: Karthik Muralidhar, Anoop Krishna
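As a rough illustration of the Yule-Walker idea in the abstract above, the following Python sketch estimates AR parameters from a frame's autocorrelation with the Levinson-Durbin recursion, a standard way of solving the Yule-Walker equations. The abstract does not state which statistic is compared against the adaptive threshold, so the decision rule in `vad_decision` (prediction-error energy versus a margin times the error measured on a silent frame) and the `margin` parameter are assumptions, not the disclosed method.

```python
def autocorr(x, order):
    """Autocorrelation r[0..order] of one frame (autocorrelation method)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations.

    Returns (a, err) where a[0] == 1 and x[n] + sum_k a[k] * x[n-k] = e[n],
    and err is the energy of the prediction error e.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err == 0:
            break  # degenerate (all-zero) frame
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def vad_decision(frame, noise_err, margin=3.0, order=2):
    """Return True (hypothesis H1, speech present) under the assumed rule."""
    _, err = levinson_durbin(autocorr(frame, order), order)
    return err > margin * noise_err
```

In use, `noise_err` would come from a frame identified as silence during initialization and could be refreshed periodically, mirroring the retraining step the abstract mentions.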
  • Publication number: 20080027716
    Abstract: Disclosed configurations include systems, methods, and apparatus arranged to generate a sequence of spectral tilt values that is based on inactive frames of a speech signal. For each of a plurality of inactive frames of the speech signal, a transmit decision is made according to a change calculated among at least two corresponding values of the sequence. The outcome of the transmit decision determines whether a silence description is transmitted for the corresponding inactive frame.
    Type: Application
    Filed: July 30, 2007
    Publication date: January 31, 2008
    Inventors: Vivek Rajendran, Ananthapadmanabhan A. Kandhadai
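The transmit decision described in the abstract above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the tilt measure (first normalized autocorrelation coefficient), the comparison against the tilt of the last transmitted frame, and the `threshold` value are all hypothetical choices, since the abstract only says that the decision depends on a change calculated among at least two values of the tilt sequence.

```python
def spectral_tilt(frame):
    """First normalized autocorrelation coefficient, one common tilt measure."""
    r0 = sum(x * x for x in frame)
    if r0 == 0.0:
        return 0.0
    r1 = sum(a * b for a, b in zip(frame, frame[1:]))
    return r1 / r0


def transmit_decisions(inactive_frames, threshold=0.2):
    """For each inactive frame, decide whether to send a silence description.

    Assumed rule: send when the tilt has moved by more than `threshold`
    since the last description that was sent (the first frame always sends).
    """
    decisions = []
    last_sent = None
    for frame in inactive_frames:
        tilt = spectral_tilt(frame)
        send = last_sent is None or abs(tilt - last_sent) > threshold
        if send:
            last_sent = tilt
        decisions.append(send)
    return decisions
```

With this rule, a run of inactive frames whose background noise keeps the same spectral shape triggers a single silence description, and a new one is sent only when the tilt shifts noticeably.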
  • Publication number: 20080027717
    Abstract: Speech encoders and methods of speech encoding are disclosed that encode inactive frames at different rates. Apparatus and methods for processing an encoded speech signal are disclosed that calculate a decoded frame based on a description of a spectral envelope over a first frequency band and the description of a spectral envelope over a second frequency band, in which the description for the first frequency band is based on information from a corresponding encoded frame and the description for the second frequency band is based on information from at least one preceding encoded frame. Calculation of the decoded frame may also be based on a description of temporal information for the second frequency band that is based on information from at least one preceding encoded frame.
    Type: Application
    Filed: July 30, 2007
    Publication date: January 31, 2008
    Inventors: Vivek Rajendran, Ananthapadmanabhan A. Kandhadai