Voiced-unvoiced Decision (epo) Patents (Class 704/E11.007)
-
Publication number: 20130231928
Abstract: A sound synthesizing apparatus includes a waveform storing section which stores a plurality of unit waveforms extracted from different positions, on a time axis, of a sound waveform indicating a voiced sound, and a waveform generating section which generates a synthesized waveform by arranging the plurality of unit waveforms on the time axis.
Type: Application
Filed: August 30, 2012
Publication date: September 5, 2013
Applicant: Yamaha Corporation
Inventor: Hiraku KAYAMA
-
Publication number: 20130041658
Abstract: A system and method may be configured to process an audio signal. The system and method may track pitch, chirp rate, and/or harmonic envelope across the audio signal, may reconstruct sound represented in the audio signal, and/or may segment or classify the audio signal. A transform may be performed on the audio signal to place the audio signal in a frequency chirp domain that enhances the sound parameter tracking, reconstruction, and/or classification.
Type: Application
Filed: August 8, 2011
Publication date: February 14, 2013
Applicant: The Intellisis Corporation
Inventors: David C. BRADLEY, Daniel S. GOLDIN, Robert N. HILTON, Nicholas K. FISHER, Rodney GATEAU, Derrick R. ROOS, Eric WIEWIORA
-
Publication number: 20130024192
Abstract: Disclosed is an information display system provided with: a signal analyzing unit which analyzes the audio signals obtained from a predetermined location and which generates ambient sound information regarding the sound generated at the predetermined location; and an ambient expression selection unit which selects an ambient expression which expresses the content of what a person is feeling from the sound generated at the predetermined location on the basis of the ambient sound information.
Type: Application
Filed: March 28, 2011
Publication date: January 24, 2013
Applicant: NEC CORPORATION
Inventors: Toshiyuki Nomura, Yuzo Senda, Kyota Higa, Takayuki Arakawa, Yasuyuki Mitsui
-
Publication number: 20120278068
Abstract: A voice activity detection method and apparatus, and an electronic device are provided. The method includes: obtaining a time domain parameter and a frequency domain parameter from an audio frame; obtaining a first distance between the time domain parameter and a long-term slip mean of the time domain parameter in a history background noise frame, and obtaining a second distance between the frequency domain parameter and a long-term slip mean of the frequency domain parameter in the history background noise frame; and determining whether the audio frame is a foreground voice frame or a background noise frame according to the first distance, the second distance and a set of decision inequalities based on the first distance and the second distance. The above technical solutions enable the determination criterion to have an adaptive adjustment capability, thus improving the performance of the voice activity detection.
Type: Application
Filed: July 11, 2012
Publication date: November 1, 2012
Applicant: Huawei Technologies Co., Ltd.
Inventor: Zhe Wang
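The abstract names a time-domain parameter and a frequency-domain parameter without specifying them, so the sketch below assumes zero-crossing rate and spectral centroid as stand-ins, a single exponential smoothing factor for the long-term background-noise means, and a simple "either distance too large" decision rule. The parameter choices, the thresholds, and the smoothing factor alpha are illustrative assumptions, not the patent's actual values.

```python
import numpy as np

def frame_params(frame, fs):
    """Illustrative parameters: zero-crossing rate (time domain)
    and spectral centroid in Hz (frequency domain)."""
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0
    spec = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    centroid = np.sum(freqs * spec) / (np.sum(spec) + 1e-12)
    return np.array([zcr, centroid])

class SimpleVad:
    def __init__(self, fs, alpha=0.95, thresholds=(0.08, 300.0)):
        self.fs = fs
        self.alpha = alpha                       # long-term smoothing factor (assumed)
        self.noise_mean = None                   # long-term means over background-noise frames
        self.thresholds = np.array(thresholds)   # per-parameter distance thresholds (assumed)

    def classify(self, frame):
        p = frame_params(frame, self.fs)
        if self.noise_mean is None:
            self.noise_mean = p.copy()           # bootstrap from the first frame
            return "noise"
        d = np.abs(p - self.noise_mean)          # the "first" and "second" distances
        # Decision inequalities: foreground voice if either distance is large (assumed form).
        is_speech = bool(np.any(d > self.thresholds))
        if not is_speech:
            # Update the long-term means only on frames judged to be background noise.
            self.noise_mean = self.alpha * self.noise_mean + (1 - self.alpha) * p
        return "speech" if is_speech else "noise"
```

Updating the noise means only on frames classified as background noise is what gives the decision criterion the adaptive character the abstract refers to.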
-
Publication number: 20120253796
Abstract: A sound is picked up by a microphone. A speech waveform signal is generated based on the picked up sound. A speech segment or a non-speech segment is detected based on the speech waveform signal. The speech segment corresponds to a voice input period during which a voice is input. The non-speech segment corresponds to a non-voice input period during which no voice is input. A determination signal is generated that indicates whether the picked up sound is the speech segment or the non-speech segment. A detected state of the speech segment is indicated based on the determination signal.
Type: Application
Filed: March 29, 2012
Publication date: October 4, 2012
Applicant: JVC KENWOOD Corporation, a corporation of Japan
Inventor: Taichi MAJIMA
-
Publication number: 20120239389
Abstract: Disclosed is an audio signal processing method comprising the steps of: receiving an audio signal containing current frame data; generating a first temporary output signal for the current frame when an error occurs in the current frame data, by carrying out frame error concealment with respect to the current frame data using a random codebook; generating a parameter by carrying out one or more of short-term prediction, long-term prediction and a fixed codebook search based on the first temporary output signal; and memory updating the parameter for the next frame; wherein the parameter comprises one or more of pitch gain, pitch delay, fixed codebook gain and a fixed codebook.
Type: Application
Filed: November 24, 2010
Publication date: September 20, 2012
Applicant: LG ELECTRONICS INC.
Inventors: Hye Jeong Jeon, Dae Hwan Kim, Hong Goo Kang, Min Ki Lee, Byung Suk Lee, Gyu Hyeok Jeong
-
Publication number: 20120158401
Abstract: In one embodiment, a music detection (MD) module accumulates sets of one or more frames and performs FFT processing on each set to recover a set of coefficients, each corresponding to a different frequency k. For each frame, the module identifies candidate musical tones by searching for peak values in the set of coefficients. If a coefficient corresponds to a peak, then a variable TONE[k] corresponding to the coefficient is set equal to one. Otherwise, the variable is set equal to zero. For each variable TONE[k] having a value of one, a corresponding accumulator A[k] is increased. Candidate musical tones that are short in duration are filtered out by comparing each accumulator A[k] to a minimum duration threshold. A determination is made as to whether or not music is present based on a number of candidate musical tones and a sum of candidate musical tone durations using a state machine.
Type: Application
Filed: August 9, 2011
Publication date: June 21, 2012
Applicant: LSI Corporation
Inventors: Ivan Leonidovich Mazurenko, Dmitry Nikolaevich Babin, Alexander Markovic, Denis Vladimirovich Parkhomenko, Alexander Alexandrovich Petyushko
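A minimal sketch of the tone-accumulator idea described above, in Python/NumPy. The FFT length, the duration thresholds, the final decision rule (a simple count-and-sum test standing in for the patent's state machine), and resetting an accumulator when its tone disappears are all assumptions.

```python
import numpy as np

FFT_LEN = 512
MIN_DURATION = 20        # frames a candidate tone must persist (assumed)
MIN_TONES = 3            # assumed decision thresholds
MIN_TOTAL_DURATION = 100

def detect_music(frames):
    """frames: iterable of 1-D NumPy arrays of FFT_LEN samples each."""
    accum = np.zeros(FFT_LEN // 2 + 1, dtype=int)          # accumulators A[k]
    for frame in frames:
        coeffs = np.abs(np.fft.rfft(frame, n=FFT_LEN))      # coefficients per frequency k
        # TONE[k] = 1 where the coefficient is a local peak, else 0.
        tone = np.zeros_like(accum)
        tone[1:-1] = (coeffs[1:-1] > coeffs[:-2]) & (coeffs[1:-1] > coeffs[2:])
        accum += tone                                        # grow A[k] for active tones
        accum[tone == 0] = 0                                 # drop tones that vanished (assumption)
    # Filter out short candidates, then apply a simplified count-and-sum decision.
    durations = accum[accum >= MIN_DURATION]
    return len(durations) >= MIN_TONES and durations.sum() >= MIN_TOTAL_DURATION
```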
-
Publication number: 20120101813
Abstract: A mixed time-domain/frequency-domain coding device and method for coding an input sound signal, wherein a time-domain excitation contribution is calculated in response to the input sound signal. A cut-off frequency for the time-domain excitation contribution is also calculated in response to the input sound signal, and a frequency extent of the time-domain excitation contribution is adjusted in relation to this cut-off frequency. Following calculation of a frequency-domain excitation contribution in response to the input sound signal, the adjusted time-domain excitation contribution and the frequency-domain excitation contribution are added to form a mixed time-domain/frequency-domain excitation constituting a coded version of the input sound signal. In the calculation of the time-domain excitation contribution, the input sound signal may be processed in successive frames of the input sound signal and a number of sub-frames to be used in a current frame may be calculated.
Type: Application
Filed: October 25, 2011
Publication date: April 26, 2012
Applicant: VOICEAGE CORPORATION
Inventors: Tommy Vaillancourt, Milan Jelinek
-
Publication number: 20120089391
Abstract: Methods for estimating speech model parameters are disclosed. For pulsed parameter estimation, a speech signal is divided into multiple frequency bands or channels using bandpass filters. Channel processing reduces sensitivity to pole magnitudes and frequencies and reduces impulse response time duration to improve pulse location and strength estimation performance. These methods are useful for high quality speech coding and reproduction at various bit rates for applications such as satellite and cellular voice communication.
Type: Application
Filed: October 7, 2011
Publication date: April 12, 2012
Applicant: Digital Voice Systems, Inc.
Inventor: Daniel W. Griffin
-
Publication number: 20120084080
Abstract: The present invention provides a novel system and method for monitoring audio signals, analyzing selected audio signal components, comparing the results of the analysis with a threshold value, and enabling or disabling the noise reduction capability of a communication device.
Type: Application
Filed: April 8, 2011
Publication date: April 5, 2012
Applicant: ALON KONCHITSKY
Inventors: Alon Konchitsky, Alberto D Berstein, Sandeep Kulakcherla
-
Publication number: 20120084082
Abstract: Encoding audio signals by selecting an encoding mode for encoding the signal, categorizing the signal into active segments having voice activity and non-active segments having substantially no voice activity using categorization parameters that depend on the selected encoding mode, and encoding at least the active segments using the selected encoding mode.
Type: Application
Filed: September 29, 2011
Publication date: April 5, 2012
Applicant: Nokia Corporation
Inventors: Kari Järvinen, Pasi Ojala, Ari Lakaniemi
-
Publication number: 20110282658
Abstract: The present invention relates to co-channel audio source separation. In one embodiment a first frequency-related representation of plural regions of the acoustic signal is prepared over time, and a two-dimensional transform of plural two-dimensional localized regions of the first frequency-related representation, each less than an entire frequency range of the first frequency related representation, is obtained to provide a two-dimensional compressed frequency-related representation with respect to each two dimensional localized region. For each of the plural regions, at least one pitch is identified. The pitch from the plural regions is processed to provide multiple pitch estimates over time. In another embodiment, a mixed acoustic signal is processed by localizing multiple time-frequency regions of a spectrogram of the mixed acoustic signal to obtain one or more acoustic properties.
Type: Application
Filed: September 3, 2010
Publication date: November 17, 2011
Applicant: Massachusetts Institute of Technology
Inventors: Tianyu Wang, Thomas R. Quatieri, JR.
-
Publication number: 20110264447
Abstract: Implementations and applications are disclosed for detection of a transition in a voice activity state of an audio signal, based on a change in energy that is consistent in time across a range of frequencies of the signal.
Type: Application
Filed: April 22, 2011
Publication date: October 27, 2011
Applicant: QUALCOMM Incorporated
Inventors: Erik Visser, Ian Ernan Liu, Jongwon Shin
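One way to read the idea above is to flag a transition only when the frame-to-frame energy change agrees in direction across most frequency bands. The sketch below does exactly that with per-band log energies; the band edges, the 6 dB step, and the 75% agreement fraction are illustrative assumptions.

```python
import numpy as np

BAND_EDGES_HZ = [100, 500, 1000, 2000, 4000]   # assumed band layout
DELTA_DB = 6.0                                  # assumed per-band change threshold
MIN_FRACTION = 0.75                             # fraction of bands that must agree

def band_energies_db(frame, fs):
    spec = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    energies = []
    for lo, hi in zip(BAND_EDGES_HZ[:-1], BAND_EDGES_HZ[1:]):
        band = spec[(freqs >= lo) & (freqs < hi)]
        energies.append(10 * np.log10(band.sum() + 1e-12))
    return np.array(energies)

def detect_transition(prev_frame, cur_frame, fs):
    """Return 'onset', 'offset', or None for a pair of consecutive frames."""
    delta = band_energies_db(cur_frame, fs) - band_energies_db(prev_frame, fs)
    if np.mean(delta > DELTA_DB) >= MIN_FRACTION:
        return "onset"        # consistent energy increase across the bands
    if np.mean(delta < -DELTA_DB) >= MIN_FRACTION:
        return "offset"       # consistent energy decrease across the bands
    return None
```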
-
Publication number: 20110246189
Abstract: An audio quality feedback system and method is provided. The system receives audio from a client via a communication device such as a microphone. The audio quality feedback system compares the received audio against one or more parameters regarding audio quality. The parameters include, for example, clipping, periods of silence, and signal-to-noise ratios. Based on the comparison, feedback is generated to allow adjustment of the communication device or use of the communication device to improve the quality of the audio.
Type: Application
Filed: March 21, 2011
Publication date: October 6, 2011
Applicant: nVoq Incorporated
Inventors: Peter Fox, Michael Clark, Jarek Foltynski
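The abstract lists clipping, periods of silence, and signal-to-noise ratio as example quality parameters. The sketch below computes those three measures for a recorded utterance and returns textual feedback; the clip level, silence level, SNR floor, and the feedback wording are all illustrative assumptions.

```python
import numpy as np

def audio_quality_feedback(x, fs, clip_level=0.99, silence_db=-45.0, min_snr_db=15.0):
    """x: mono float signal scaled to [-1, 1]. Thresholds are illustrative assumptions."""
    feedback = []

    # Clipping: fraction of samples at or above the clip level.
    clip_ratio = np.mean(np.abs(x) >= clip_level)
    if clip_ratio > 0.001:
        feedback.append("Reduce input gain or move away from the microphone (clipping detected).")

    # Silence: fraction of 20 ms frames below a level threshold.
    frame = int(0.02 * fs)
    n = len(x) // frame
    frames = x[: n * frame].reshape(n, frame)
    rms_db = 10 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    silent = rms_db < silence_db
    if np.mean(silent) > 0.8:
        feedback.append("Speak louder or check the microphone (mostly silence).")

    # Crude SNR estimate: mean active-frame level minus mean silent-frame level.
    if silent.any() and (~silent).any():
        snr_db = rms_db[~silent].mean() - rms_db[silent].mean()
        if snr_db < min_snr_db:
            feedback.append("Background noise is high relative to speech (low SNR).")

    return feedback or ["Audio quality acceptable."]
```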
-
Publication number: 20110218801
Abstract: The invention relates to a method for outputting a speech signal. Speech signal frames are received and are used in a predetermined sequence in order to produce a speech signal to be output. If one speech signal frame to be received is not received, then a substitute speech signal frame is used in its place, which is produced as a function of a previously received speech signal frame. According to the invention, in the situation in which the previously received speech signal frame has a voiceless speech signal, the substitute speech signal frame is produced by means of a noise signal.
Type: Application
Filed: September 28, 2009
Publication date: September 8, 2011
Applicant: ROBERT BOSCH GMBH
Inventors: Peter Vary, Frank Mertz
-
Publication number: 20110196675
Abstract: A VAD/SS system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
Type: Application
Filed: March 17, 2011
Publication date: August 11, 2011
Applicant: AT&T CORPORATION
Inventors: Bing Chen, James H. James
-
Publication number: 20110153317
Abstract: An apparatus for wireless communications includes a processing system. The processing system is configured to receive an input sound stream of a user, split the input sound stream into a plurality of frames, classify each of the frames as one selected from the group consisting of a non-speech frame and a speech frame, determine a pitch of each of the frames in a subset of the speech frames, and identify a gender of the user from the determined pitch. To determine the pitch, the processing system is configured to filter the speech frames to compute an error signal, compute an autocorrelation of the error signal, find a maximum autocorrelation value, and set the pitch to an index of the maximum autocorrelation value.
Type: Application
Filed: December 23, 2009
Publication date: June 23, 2011
Applicant: QUALCOMM INCORPORATED
Inventors: Yinian Mao, Gene Marsh
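A sketch of the pitch/gender path the abstract describes: inverse-filter the frame with LPC coefficients to obtain an error (residual) signal, autocorrelate it, take the lag of the maximum autocorrelation value as the pitch period, and map the pitch to a gender with a single threshold. The LPC order, the 60-400 Hz search range, and the 165 Hz gender split are illustrative assumptions; the frame should be at least a few pitch periods long (e.g., 30-40 ms).

```python
import numpy as np
from scipy.signal import lfilter

def lpc_coeffs(frame, order):
    """Autocorrelation-method LPC coefficients via Levinson-Durbin."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0] + 1e-12
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        for j in range(1, i + 1):
            a[j] = prev[j] + k * prev[i - j]
        err *= (1.0 - k * k)
    return a

def pitch_and_gender(frame, fs, lpc_order=10, fmin=60.0, fmax=400.0, gender_split_hz=165.0):
    """Estimate pitch from the LPC residual autocorrelation; the gender split
    frequency is an illustrative assumption."""
    a = lpc_coeffs(frame, lpc_order)
    residual = lfilter(a, [1.0], frame)           # "error signal" from inverse filtering
    ac = np.correlate(residual, residual, mode="full")[len(residual) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + int(np.argmax(ac[lo:hi]))          # index of the maximum autocorrelation value
    pitch_hz = fs / lag
    return pitch_hz, ("female" if pitch_hz > gender_split_hz else "male")
```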
-
Publication number: 20110153318
Abstract: There is provided a method or a device for extending a bandwidth of a first band speech signal to generate a second band speech signal wider than the first band speech signal and including the first band speech signal. The method comprises receiving a segment of the first band speech signal having a low cut off frequency and a high cut off frequency; determining the high cut off frequency of the segment; determining whether the segment is voiced or unvoiced; if the segment is voiced, applying a first bandwidth extension function to the segment to generate a first bandwidth extension in high frequencies; if the segment is unvoiced, applying a second bandwidth extension function to the segment to generate a second bandwidth extension in the high frequencies; using the first bandwidth extension and the second bandwidth extension to extend the first band speech signal beyond the high cut off frequency.
Type: Application
Filed: March 15, 2010
Publication date: June 23, 2011
Applicant: MINDSPEED TECHNOLOGIES, INC.
Inventors: Norbert Rossello, Fabien Klein
-
Publication number: 20110106531
Abstract: This invention relates to retrieval for multimedia content, and provides a program endpoint time detection apparatus for detecting an endpoint time of a program by performing processing on audio signals of said program, comprising an audio classification unit for classifying said audio signals into a speech signal portion and a non-speech signal portion; a keyword retrieval unit for retrieving, as a candidate endpoint keyword, an endpoint keyword indicating start or end of the program from said speech signal portion; a content analysis unit for performing content analysis on context of the candidate endpoint keyword retrieved by the keyword retrieval unit to determine whether the candidate endpoint keyword is a valid endpoint keyword; and a program endpoint time determination unit for performing statistics analysis based on the retrieval result of said keyword retrieval unit and the determination result of said content analysis unit, and determining the endpoint time of the program.
Type: Application
Filed: October 28, 2010
Publication date: May 5, 2011
Applicants: SONY CORPORATION; Institute of Acoustics, Chinese Academy of Sciences
Inventors: Kun LIU, Weiguo Wu, Li Lu, Qingwei Zhao, Yonghong Yan, Hongbin Suo
-
Publication number: 20100268531
Abstract: A DTX decision method includes: obtaining sub-band signal(s) according to an input signal; obtaining a variation of characteristic information of each of the sub-band signals; and performing DTX decision according to the variation of the characteristic information of each of the sub-band signals. With the invention, a complete and appropriate DTX decision result is obtained by making full use of the noise characteristic in the speech encoding/decoding bandwidth and using band-splitting and layered processing. As a result, the SID encoding/CNG decoding may closely follow the characteristic variation of the actual noise.
Type: Application
Filed: April 20, 2010
Publication date: October 21, 2010
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Jinliang Dai, Eyal Shlomot, Deming Zhang
-
Patent number: 7809554
Abstract: An apparatus, method, and medium for detecting a voiced sound and an unvoiced sound. The apparatus includes a blocking unit for dividing an input signal into block units; a parameter calculator for calculating a first parameter to determine the voiced sound and a second parameter to determine the unvoiced sound by using a slope and spectral flatness measure (SFM) of a mel-scaled filter bank spectrum of an input signal existing in a block; and a determiner for determining a voiced sound zone and an unvoiced sound zone in the block by comparing the first and second parameters to predetermined threshold values.
Type: Grant
Filed: February 7, 2005
Date of Patent: October 5, 2010
Assignee: Samsung Electronics Co., Ltd.
Inventor: Kwangcheol Oh
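A sketch of a voiced/unvoiced decision driven by the slope and spectral flatness measure (SFM) of a mel-scaled filter bank spectrum, as in the abstract above. The abstract does not say how the two parameters are formed from the slope and SFM, so the sketch compares them directly against thresholds; the filter count, the threshold values, and the "voiced = low flatness plus falling slope" rule are assumptions. The frame length should be even (e.g., a power of two).

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, fs, fmin=0.0, fmax=None):
    """Triangular mel filters over an rFFT power spectrum of length n_fft//2 + 1."""
    fmax = fmax or fs / 2.0
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    inv_mel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(mel(fmin), mel(fmax), n_filters + 2)
    bins = np.floor((n_fft + 1) * inv_mel(mel_pts) / fs).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising edge
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling edge
    return fb

def classify_block(frame, fs, n_filters=24, sfm_voiced=0.35, sfm_unvoiced=0.6, slope_voiced=-0.05):
    """Voiced/unvoiced decision from SFM and slope of the mel-band spectrum
    (all threshold values are illustrative assumptions)."""
    n_fft = len(frame)
    power = np.abs(np.fft.rfft(frame)) ** 2
    bands = mel_filterbank(n_filters, n_fft, fs) @ power + 1e-12
    # Spectral flatness measure: geometric mean over arithmetic mean, in (0, 1].
    sfm = np.exp(np.mean(np.log(bands))) / np.mean(bands)
    # Slope: least-squares line fit through the log band energies.
    slope = np.polyfit(np.arange(n_filters), np.log(bands), 1)[0]
    if sfm < sfm_voiced and slope < slope_voiced:
        return "voiced"        # harmonic peaks and downward spectral tilt
    if sfm > sfm_unvoiced:
        return "unvoiced"      # flat, noise-like spectrum
    return "other"
```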
-
Publication number: 20100250246
Abstract: A speech signal evaluation apparatus includes: an acquisition unit that acquires, as a first frame, a speech signal of a specified length from speech signals; a first detection unit that detects, on the basis of a speech condition, whether the first frame is voiced or unvoiced; a variation calculation unit that, when the first frame is unvoiced, calculates a variation in a spectrum associated with the first frame on the basis of a spectrum of the first frame and a spectrum of a second frame that is unvoiced and precedes the first frame in time; and a second detection unit that detects, on the basis of a non-stationary condition based on the variation in spectrum, whether the variation of the first frame satisfies the non-stationary condition.
Type: Application
Filed: March 24, 2010
Publication date: September 30, 2010
Applicant: FUJITSU LIMITED
Inventor: Chikako MATSUMOTO
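A sketch of the unvoiced-frame non-stationarity check described above: frames judged unvoiced have their spectra compared against the previous unvoiced frame, and a frame is flagged when the variation exceeds a threshold. The zero-crossing-based voiced/unvoiced stand-in, the mean-dB distance measure, and the threshold value are assumptions.

```python
import numpy as np

def is_voiced(frame, zcr_threshold=0.12):
    """Rough stand-in for the patent's 'speech condition':
    low zero-crossing rate plus non-trivial energy => voiced."""
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0
    return zcr < zcr_threshold and np.mean(frame ** 2) > 1e-6

def nonstationary_unvoiced(frames, variation_threshold=3.0):
    """Yield True for unvoiced frames whose spectrum differs strongly (in dB)
    from the preceding unvoiced frame."""
    prev_unvoiced_spec = None
    for frame in frames:
        if is_voiced(frame):
            yield False
            continue
        spec = np.abs(np.fft.rfft(frame)) + 1e-12
        if prev_unvoiced_spec is None:
            prev_unvoiced_spec = spec
            yield False
            continue
        variation = np.mean(np.abs(20 * np.log10(spec / prev_unvoiced_spec)))
        prev_unvoiced_spec = spec
        yield variation > variation_threshold
```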
-
Publication number: 20100208918
Abstract: A volume correction device includes: a variable gain means for controlling a gain, given to an input audio signal, according to a gain control signal; a consecutive relevant sounds interval detection means for detecting a consecutive relevant sounds interval, during which a group of temporally adjoining consecutive relevant sounds is present, in the input audio signal; a mean level detection means for detecting the mean level of the input audio signal attained during the consecutive relevant sounds interval, and whose time constant for mean level detection is set to a smaller value during the leading period of the consecutive relevant sounds interval than during the remaining period; a gain control signal production means for producing the gain control signal, so that the mean level will be equal to a reference level, and feeding the gain control signal to the variable gain means.
Type: Application
Filed: February 8, 2010
Publication date: August 19, 2010
Applicant: Sony Corporation
Inventor: Masayoshi Noguchi
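A rough sketch of the volume-correction loop: a mean-level detector with a short time constant during the leading period of a relevant-sound interval and a longer one afterwards, and a gain that drives the mean level toward a reference level. The onset gate, the time constants, and the reference level are illustrative assumptions, and the per-sample loop is written for clarity rather than speed.

```python
import numpy as np

def volume_correct(x, fs, ref_level=0.1, lead_ms=200.0,
                   tc_lead_ms=20.0, tc_normal_ms=400.0, gate=1e-3):
    """x: mono float NumPy array. All constants are illustrative assumptions."""
    y = np.empty_like(x)
    mean_level = ref_level
    lead_samples = int(lead_ms * fs / 1000.0)
    since_onset = None                            # samples since the interval began
    for n, s in enumerate(x):
        active = abs(s) > gate                    # crude "relevant sound" detection
        if active:
            since_onset = 0 if since_onset is None else since_onset + 1
        else:
            since_onset = None
        if since_onset is not None:
            # Shorter time constant during the leading period, longer afterwards.
            tc_ms = tc_lead_ms if since_onset < lead_samples else tc_normal_ms
            alpha = np.exp(-1.0 / (tc_ms * fs / 1000.0))
            mean_level = alpha * mean_level + (1 - alpha) * abs(s)
        gain = ref_level / max(mean_level, 1e-6)  # drive the mean level toward the reference
        y[n] = gain * s
    return y
```

The short leading-period time constant lets the gain settle quickly at the start of a loud or quiet passage, while the longer constant afterwards avoids pumping within the passage.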
-
Publication number: 20100204985
Abstract: A warping factor estimation system comprises label information generation unit that outputs voice/non-voice label information, warp model storage unit in which a probability model representing voice and non-voice occurrence probabilities is stored, and warp estimation unit that calculates a warping factor in the frequency axis direction using the probability model representing voice and non-voice occurrence probabilities, voice and non-voice labels, and a cepstrum.
Type: Application
Filed: September 22, 2008
Publication date: August 12, 2010
Inventor: Tadashi Emori
-
Publication number: 20100145690
Abstract: A sound signal generating method includes: generating, using a computer, a plurality of unit waveform signals by dividing the original sound signal having a periodic length of repeating similar waveforms by the length of the waveform; generating, using a computer, a repetitive waveform signal for each of the generated unit waveform signals by repeating the waveform of the unit waveform signal a given number of times; and generating, using a computer, an output sound signal by shifting each of the repetitive waveform signals in each length with a sequence in which the unit waveform signals form the original sound signal and then superimposing on one another.
Type: Application
Filed: February 10, 2010
Publication date: June 10, 2010
Applicant: FUJITSU LIMITED
Inventor: Kazuhiro Watanabe
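A simplified sketch of the generation method: cut a periodic signal into one-period unit waveforms, repeat each a given number of times, and lay the repeated blocks out in the original sequence. Abutting the blocks, rather than the patent's shift-and-superimpose with overlap, is a simplification; the effect is similar in spirit, stretching the duration by the repetition factor without changing the pitch.

```python
import numpy as np

def stretch_by_repetition(signal, period, repeats):
    """Cut `signal` into unit waveforms of `period` samples, repeat each
    `repeats` times, and place the repeated blocks in the original sequence."""
    n_units = len(signal) // period
    out = np.zeros(n_units * period * repeats)
    for i in range(n_units):
        unit = signal[i * period:(i + 1) * period]
        block = np.tile(unit, repeats)              # the "repetitive waveform signal"
        start = i * period * repeats                # shift according to the original sequence
        out[start:start + len(block)] += block      # superimpose (here: simple placement)
    return out

# Example: repeats=2 roughly doubles the duration of a voiced sound while keeping
# its pitch, because each unit waveform still spans exactly one period.
```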
-
Publication number: 20100145688
Abstract: An apparatus and a method to encode and decode a speech signal using an encoding mode are provided. An encoding apparatus may select an encoding mode of a frame included in an input speech signal, and encode a frame having an unvoiced mode for an unvoiced speech as the selected encoding mode.
Type: Application
Filed: December 4, 2009
Publication date: June 10, 2010
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Ho Sang Sung, Ki Hyun Choo, Jung Hoe Kim, Eun Mi Oh
-
Publication number: 20100125452
Abstract: A method of refining a pitch period estimation of a signal, the method comprising: for each of a plurality of portions of the signal, scanning over a predefined range of time offsets to find an estimate of the pitch period of the portion within the predefined range of time offsets; identifying the average pitch period of the estimated pitch periods of the portions; determining a refined range of time offsets in dependence on the average pitch period, the refined range of time offsets being narrower than the predefined range of time offsets; and for a subsequent portion of the signal, scanning over the refined range of time offsets to find an estimate of the pitch period of the subsequent portion.
Type: Application
Filed: November 19, 2008
Publication date: May 20, 2010
Applicant: Cambridge Silicon Radio Limited
Inventor: Xuejing Sun
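A sketch of the refinement idea: estimate the pitch period of the first few portions over the full predefined lag range, average those estimates, then narrow the search to a band around that average for subsequent portions. The autocorrelation-based period estimator, the number of initial portions, and the +/-20% margin are assumptions; each portion should be at least as long as the largest lag searched.

```python
import numpy as np

def estimate_period(portion, lag_range):
    """Pick the lag with the strongest autocorrelation inside lag_range (in samples)."""
    lo, hi = lag_range
    ac = np.correlate(portion, portion, mode="full")[len(portion) - 1:]
    return lo + int(np.argmax(ac[lo:hi]))

def refine_pitch_search(portions, full_range=(40, 400), history=8, margin=0.2):
    """Scan the predefined range for the first `history` portions, then narrow the
    scan to +/- `margin` around their average period for the remaining portions."""
    estimates, periods = [], []
    lag_range = full_range
    for i, portion in enumerate(portions):
        p = estimate_period(portion, lag_range)
        periods.append(p)
        estimates.append(p)
        if i + 1 == history:                       # after the initial portions, refine the range
            avg = int(np.mean(estimates))
            lag_range = (max(full_range[0], int(avg * (1 - margin))),
                         min(full_range[1], int(avg * (1 + margin)) + 1))
    return periods, lag_range
```

Narrowing the range both cuts the search cost and reduces the chance of octave errors once the talker's typical period is known.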
-
Publication number: 20100108065
Abstract: A breathing mask adapted to be placed over a wearer's face, comprises a mask body including a gas inlet port to be disposed in flow communication with the wearer's breathing passage for flow of a gas in a predetermined flow stream there through upon inhalation by the wearer; a communications microphone (30) mounted to said mask body to capture the voice of the wearer, said communications microphone generating sound signals; an attenuation device (34) for attenuating said sound signals; a sound monitor (36) for monitoring the intensity of sound near the communications microphone in a predetermined frequency range, connected to a controller device (38) for activating the attenuation device when the sound intensity monitored by the sound monitor is in a predetermined level range.
Type: Application
Filed: January 4, 2007
Publication date: May 6, 2010
Inventors: Paul Zimmerman, Przemyslaw Gostkiewicz, Leopoldine Bachelard
-
Publication number: 20100094621
Abstract: A method and system for assessing script running time are disclosed. According to one embodiment, a computer implemented method comprises receiving a first input from a client, wherein the first input comprises a script, the script comprising spoken audio and video elements. A second input is received from the client, wherein the second input comprises a value of words per minute. A third input is received from the client, wherein the third input comprises a non-verbal duration. A duration of the spoken audio of the script is calculated by multiplying the value of words per minute and the spoken content of the script. A duration of the script is calculated by summing the duration of the spoken audio and the non-verbal duration and the duration of the script is returned to the client.
Type: Application
Filed: September 17, 2009
Publication date: April 15, 2010
Inventors: Seth Kenvin, Neal Clark, Jeremy Gailor, Michael Dungan
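The running-time arithmetic above, sketched directly. The abstract says the spoken duration is calculated by "multiplying the value of words per minute and the spoken content"; this is taken here to mean combining the word count with the words-per-minute rate, i.e. dividing the count by the rate, which is the dimensionally consistent reading.

```python
def script_running_time(script_text, words_per_minute, non_verbal_seconds):
    """Estimate total running time in seconds: spoken words at the given rate
    plus the client-supplied non-verbal duration."""
    word_count = len(script_text.split())
    spoken_seconds = word_count / words_per_minute * 60.0
    return spoken_seconds + non_verbal_seconds

# Example: a 450-word script read at 150 words per minute with 30 s of
# non-verbal material runs about 450 / 150 * 60 + 30 = 210 seconds.
```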
-
Publication number: 20100017202
Abstract: Provided is a method and apparatus for determining a signal coding mode. The signal coding mode may be determined or changed according to whether a current frame corresponds to a silence period and by using a history of speech or music presence possibilities.
Type: Application
Filed: July 9, 2009
Publication date: January 21, 2010
Applicant: SAMSUNG ELECTRONICS Co., LTD
Inventors: Ho-sang Sung, Jie Zhan, Ki-hyun Choo
-
Publication number: 20090292533
Abstract: Streaming voice signals, such as might be received at a contact center or similar operation, are analyzed to detect the occurrence of one or more unprompted, predetermined utterances. The predetermined utterances preferably constitute a vocabulary of words and/or phrases having particular meaning within the context in which they are uttered. Detection of one or more of the predetermined utterances during a call causes a determination of response-determinative significance of the detected utterance(s). Based on the response-determinative significance of the detected utterance(s), a responsive action may be further determined. Additionally, long term storage of the call corresponding to the detected utterance may also be initiated. Conversely, calls in which no predetermined utterances are detected may be deleted from short term storage.
Type: Application
Filed: May 22, 2009
Publication date: November 26, 2009
Applicant: Accenture Global Services GmbH
Inventors: Thomas J. Ryan, Biji K. Janan
-
Publication number: 20090271197
Abstract: Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a region of the signal representing speech. The region can comprise a portion of a frame of the signal representing speech classified as a voiced frame. The region can be marked based on one or more pitch estimates for the region. A cord can be identified within the region based on occurrence of one or more events within the region of the signal. For example, the one or more events can comprise one or more glottal pulses. In such cases, the cord can begin with the onset of a first glottal pulse and extend to a point prior to an onset of a second glottal pulse. The cord may exclude a portion of the region of the signal prior to the onset of the second glottal pulse.
Type: Application
Filed: October 23, 2008
Publication date: October 29, 2009
Applicant: Red Shift Company, LLC
Inventors: Joel K. Nyquist, Erik N. Reckase, Matthew D. Robinson, John F. Remillard
-
Publication number: 20090271198
Abstract: Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a first frame of the signal, the first frame comprising a voiced frame. One or more cords can be extracted from the voiced frame based on occurrence of one or more events within the frame. For example, the one or more events can comprise one or more glottal pulses. The one or more cords can collectively comprise less than all of the frame. For example, each of the cords can begin with onset of a glottal pulse and extend to a point prior to an onset of neighboring glottal pulse but may exclude a portion of the frame prior to the onset of the neighboring glottal pulse. A phoneme for the voiced frame can be determined based on at least one of the extracted cords.
Type: Application
Filed: October 23, 2008
Publication date: October 29, 2009
Applicant: Red Shift Company, LLC
Inventors: Joel K. Nyquist, Erik N. Reckase, Matthew D. Robinson, John F. Remillard
-
Publication number: 20090259460
Abstract: A client for silence-based adaptive real-time voice and video (SAVV) transmission methods and systems, detects the activity of a voice stream of conversational speech and aggressively transmits the corresponding video frames if silence in the sending or receiving voice stream has been detected, and adaptively generates and transmits key frames of the video stream according to characteristics of the conversational speech. In one aspect, a coordination management module generates video frames, segmentation and transmission strategies according to feedback from a voice encoder of the SAVV client and the user's instructions. In another aspect, the coordination management module generates video frames, segmentation and transmission strategies according to feedback from a voice decoder of the SAVV client and the user's instructions. In one example, the coordination management module adaptively generates a key video frame when silence is detected in the receiving voice stream.
Type: Application
Filed: April 10, 2008
Publication date: October 15, 2009
Applicant: CITY UNIVERSITY OF HONG KONG
Inventors: Weijia JIA, Lizhuo ZHANG, Huan LI, Wenyan LU
-
Publication number: 20090182556
Abstract: Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, a method of processing a signal representing speech can comprise receiving a frame of the signal representing speech, classifying the frame as a voiced frame, and parsing the voiced frame into one or more regions based on occurrence of one or more events within the voiced frame. For example, the one or more events can comprise one or more glottal pulses. The one or more regions may collectively represent less than all of the voiced frame.
Type: Application
Filed: October 23, 2008
Publication date: July 16, 2009
Applicant: Red Shift Company, LLC
Inventors: Erik N. Reckase, John F. Remillard
-
Publication number: 20080243495
Abstract: Packetized CELP-encoded speech playout with frame truncation during silence and a frame expansion method dependent upon voicing classification, with voiced frame expansion maintaining phase alignment.
Type: Application
Filed: June 10, 2008
Publication date: October 2, 2008
Inventors: Krishnasamy Anandakumar, Alan V. McCree, Erdal Paksoy
-
Publication number: 20080243494
Abstract: A speech receiving unit receives a user ID, a speech obtained at a terminal, and an utterance duration, from the terminal. A proximity determining unit calculates a correlation value expressing a correlation between speeches received from plural terminals, compares the correlation value with a first threshold value, and determines that the plural terminals that receive the speeches whose correlation value is calculated are close to each other, when the correlation value is larger than the first threshold value. A dialog detecting unit determines whether a relationship between the utterance durations received from the plural terminals that are determined to be close to each other within an arbitrarily target period during which a dialog is to be detected fits a rule. When the relationship is determined to fit the rule, the dialog detecting unit detects dialog information containing the target period and the user ID.
Type: Application
Filed: March 11, 2008
Publication date: October 2, 2008
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Masayuki Okamoto, Naoki Iketani, Hideo Umeki, Sogo Tsuboi, Kenta Cho, Keisuke Nishimura, Masanori Hattori
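A sketch of the proximity test described above: compute a normalized cross-correlation between the speech received from two terminals and compare it with a first threshold value; terminals whose recordings correlate strongly are presumed to be close to each other. The normalization and the threshold value are illustrative assumptions.

```python
import numpy as np

def proximity_score(speech_a, speech_b):
    """Normalized maximum cross-correlation between two recordings, in [0, 1]."""
    a = speech_a - np.mean(speech_a)
    b = speech_b - np.mean(speech_b)
    xcorr = np.correlate(a, b, mode="full")      # scan over relative time shifts
    denom = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2)) + 1e-12
    return float(np.max(np.abs(xcorr)) / denom)

def terminals_are_close(speech_a, speech_b, first_threshold=0.5):
    """Terminals picking up strongly correlated audio are assumed to be near
    each other; the threshold value is an illustrative assumption."""
    return proximity_score(speech_a, speech_b) > first_threshold
```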
-
Publication number: 20080109217
Abstract: An apparatus for providing control of voicing in processed speech includes a spectra approximation element and a comparing element. The spectra approximation element may be configured to compute a voiced contribution and an unvoiced contribution for each of a reference speech sample and a processed speech sample. The comparing element may be configured to compare indications of voiced and unvoiced contributions of the reference speech sample and indications of voiced and unvoiced contributions of the processed speech sample, and to determine whether to correct at least one of the voiced or unvoiced contributions of the processed speech sample based on the comparison.
Type: Application
Filed: November 8, 2006
Publication date: May 8, 2008
Inventor: Jani K. Nurminen