Voiced-unvoiced Decision (epo) Patents (Class 704/E11.007)
  • Publication number: 20130231928
    Abstract: A sound synthesizing apparatus includes a waveform storing section which stores a plurality of unit waveforms extracted from different positions, on a time axis, of a sound waveform indicating a voiced sound, and a waveform generating section which generates a synthesized waveform by arranging the plurality of unit waveforms on the time axis.
    Type: Application
    Filed: August 30, 2012
    Publication date: September 5, 2013
    Applicant: Yamaha Corporation
    Inventor: Hiraku KAYAMA
  • Publication number: 20130041658
    Abstract: A system and method may be configured to process an audio signal. The system and method may track pitch, chirp rate, and/or harmonic envelope across the audio signal, may reconstruct sound represented in the audio signal, and/or may segment or classify the audio signal. A transform may be performed on the audio signal to place the audio signal in a frequency chirp domain that enhances the sound parameter tracking, reconstruction, and/or classification.
    Type: Application
    Filed: August 8, 2011
    Publication date: February 14, 2013
    Applicant: The Intellisis Corporation
    Inventors: David C. BRADLEY, Daniel S. GOLDIN, Robert N. HILTON, Nicholas K. FISHER, Rodney GATEAU, Derrick R. ROOS, Eric WIEWIORA
  • Publication number: 20130024192
    Abstract: Disclosed is an information display system provided with: a signal analyzing unit which analyzes the audio signals obtained from a predetermined location and which generates ambient sound information regarding the sound generated at the predetermined location; and an ambient expression selection unit which selects an ambient expression which expresses the content of what a person is feeling from the sound generated at the predetermined location on the basis of the ambient sound information.
    Type: Application
    Filed: March 28, 2011
    Publication date: January 24, 2013
    Applicant: NEC CORPORATION
    Inventors: Toshiyuki Nomura, Yuzo Senda, Kyota Higa, Takayuki Arakawa, Yasuyuki Mitsui
  • Publication number: 20120278068
    Abstract: A voice activity detection method and apparatus, and an electronic device are provided. The method includes: obtaining a time domain parameter and a frequency domain parameter from an audio frame; obtaining a first distance between the time domain parameter and a long-term slip mean of the time domain parameter in a history background noise frame, and obtaining a second distance between the frequency domain parameter and a long-term slip mean of the frequency domain parameter in the history background noise frame; and determining whether the audio frame is a foreground voice frame or a background noise frame according to the first distance, the second distance and a set of decision inequalities based on the first distance and the second distance. The above technical solutions enable the determination criterion to have an adaptive adjustment capability, thus improving the performance of the voice activity detection.
    Type: Application
    Filed: July 11, 2012
    Publication date: November 1, 2012
    Applicant: Huawei Technologies Co., Ltd.
    Inventor: Zhe Wang
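The two-distance decision above could be sketched as follows. This is a minimal illustration, not the patent's actual formulas: the function names, the weighted-sum inequality, and all constants (`a`, `b`, `threshold`, `alpha`) are assumptions for demonstration.

```python
# Sketch of a two-distance VAD decision: a frame is declared foreground
# voice when its time-domain and frequency-domain parameters are both far
# from long-term means tracked over background-noise frames.

def vad_decision(time_param, freq_param, noise_time_mean, noise_freq_mean,
                 a=1.0, b=1.0, threshold=3.0):
    """Return True (voice) when a combined distance exceeds a threshold.
    The weights and threshold are illustrative, not from the patent."""
    d1 = abs(time_param - noise_time_mean)   # first distance (time domain)
    d2 = abs(freq_param - noise_freq_mean)   # second distance (frequency domain)
    # One inequality of the decision set: a weighted sum of the distances.
    return a * d1 + b * d2 > threshold

def update_noise_mean(mean, value, alpha=0.95):
    """Exponential long-term mean update, applied only on noise frames,
    which gives the criterion its adaptive character."""
    return alpha * mean + (1.0 - alpha) * value
```

Because the means are updated only on frames already judged to be noise, the decision boundary drifts with the background, which is the adaptive behavior the abstract claims.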
  • Publication number: 20120253796
    Abstract: A sound is picked up by a microphone. A speech waveform signal is generated based on the picked up sound. A speech segment or a non-speech segment is detected based on the speech waveform signal. The speech segment corresponds to a voice input period during which a voice is input. The non-speech segment corresponds to a non-voice input period during which no voice is input. A determination signal is generated that indicates whether the picked up sound is the speech segment or the non-speech segment. A detected state of the speech segment is indicated based on the determination signal.
    Type: Application
    Filed: March 29, 2012
    Publication date: October 4, 2012
    Applicant: JVC KENWOOD Corporation, a corporation of Japan
    Inventor: Taichi MAJIMA

  • Publication number: 20120239389
    Abstract: Disclosed is an audio signal processing method comprising the steps of: receiving an audio signal containing current frame data; generating a first temporary output signal for the current frame when an error occurs in the current frame data, by carrying out frame error concealment with respect to the current frame data using a random codebook; generating a parameter by carrying out one or more of short-term prediction, long-term prediction and a fixed codebook search based on the first temporary output signal; and updating a memory with the parameter for the next frame; wherein the parameter comprises one or more of pitch gain, pitch delay, fixed codebook gain and a fixed codebook.
    Type: Application
    Filed: November 24, 2010
    Publication date: September 20, 2012
    Applicant: LG ELECTRONICS INC.
    Inventors: Hye Jeong Jeon, Dae Hwan Kim, Hong Goo Kang, Min Ki Lee, Byung Suk Lee, Gyu Hyeok Jeong
  • Publication number: 20120158401
    Abstract: In one embodiment, a music detection (MD) module accumulates sets of one or more frames and performs FFT processing on each set to recover a set of coefficients, each corresponding to a different frequency k. For each frame, the module identifies candidate musical tones by searching for peak values in the set of coefficients. If a coefficient corresponds to a peak, then a variable TONE[k] corresponding to the coefficient is set equal to one. Otherwise, the variable is set equal to zero. For each variable TONE[k] having a value of one, a corresponding accumulator A[k] is increased. Candidate musical tones that are short in duration are filtered out by comparing each accumulator A[k] to a minimum duration threshold. A determination is made as to whether or not music is present based on a number of candidate musical tones and a sum of candidate musical tone durations using a state machine.
    Type: Application
    Filed: August 9, 2011
    Publication date: June 21, 2012
    Applicant: LSI Corporation
    Inventors: Ivan Leonidovich Mazurenko, Dmitry Nikolaevich Babin, Alexander Markovic, Denis Vladimirovich Parkhomenko, Alexander Alexandrovich Petyushko
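The accumulator scheme above could be sketched roughly as follows. The FFT size, the magnitude floor for peak picking, and the duration/count thresholds are all illustrative assumptions; the patent's state machine is reduced here to a single count test.

```python
import numpy as np

def update_tone_accumulators(frames, n_fft=64, peak_floor=1e-6):
    """Per frame: FFT, mark local spectral peaks as candidate tones
    (TONE[k] = 1) and bump the matching accumulator A[k]."""
    acc = np.zeros(n_fft // 2 + 1)
    for frame in frames:
        spec = np.abs(np.fft.rfft(frame, n_fft))
        tone = np.zeros_like(acc)
        for k in range(1, len(spec) - 1):
            # A coefficient is a candidate tone if it is a local peak
            # above a small magnitude floor (the floor is illustrative).
            if spec[k] > peak_floor and spec[k] > spec[k - 1] and spec[k] > spec[k + 1]:
                tone[k] = 1
        acc += tone   # A[k] grows while tone k persists across frames
    return acc

def music_present(acc, min_duration=3, min_tones=1):
    """Filter out short-lived candidates, then decide from the count of
    persistent tones (thresholds are illustrative, not from the patent)."""
    return int(np.sum(acc >= min_duration)) >= min_tones
```

A sustained sinusoid accumulates at its bin across frames and survives the minimum-duration filter; transient peaks do not.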
  • Publication number: 20120101813
    Abstract: A mixed time-domain/frequency-domain coding device and method for coding an input sound signal, wherein a time-domain excitation contribution is calculated in response to the input sound signal. A cut-off frequency for the time-domain excitation contribution is also calculated in response to the input sound signal, and a frequency extent of the time-domain excitation contribution is adjusted in relation to this cut-off frequency. Following calculation of a frequency-domain excitation contribution in response to the input sound signal, the adjusted time-domain excitation contribution and the frequency-domain excitation contribution are added to form a mixed time-domain/frequency-domain excitation constituting a coded version of the input sound signal. In the calculation of the time-domain excitation contribution, the input sound signal may be processed in successive frames of the input sound signal and a number of sub-frames to be used in a current frame may be calculated.
    Type: Application
    Filed: October 25, 2011
    Publication date: April 26, 2012
    Applicant: VOICEAGE CORPORATION
    Inventors: Tommy Vaillancourt, Milan Jelinek
  • Publication number: 20120089391
    Abstract: Methods for estimating speech model parameters are disclosed. For pulsed parameter estimation, a speech signal is divided into multiple frequency bands or channels using bandpass filters. Channel processing reduces sensitivity to pole magnitudes and frequencies and reduces impulse response time duration to improve pulse location and strength estimation performance. These methods are useful for high quality speech coding and reproduction at various bit rates for applications such as satellite and cellular voice communication.
    Type: Application
    Filed: October 7, 2011
    Publication date: April 12, 2012
    Applicant: Digital Voice Systems, Inc.
    Inventor: Daniel W. Griffin
  • Publication number: 20120084080
    Abstract: The present invention provides a novel system and method for monitoring audio signals, analyzing selected audio signal components, comparing the results of the analysis with a threshold value, and enabling or disabling the noise reduction capability of a communication device.
    Type: Application
    Filed: April 8, 2011
    Publication date: April 5, 2012
    Applicant: ALON KONCHITSKY
    Inventors: Alon Konchitsky, Alberto D Berstein, Sandeep Kulakcherla
  • Publication number: 20120084082
    Abstract: Audio signals are encoded by selecting an encoding mode for encoding the signal, categorizing the signal into active segments having voice activity and non-active segments having substantially no voice activity using categorization parameters that depend on the selected encoding mode, and encoding at least the active segments using the selected encoding mode.
    Type: Application
    Filed: September 29, 2011
    Publication date: April 5, 2012
    Applicant: Nokia Corporation
    Inventors: Kari Järvinen, Pasi Ojala, Ari Lakaniemi
  • Publication number: 20110282658
    Abstract: The present invention relates to co-channel audio source separation. In one embodiment a first frequency-related representation of plural regions of the acoustic signal is prepared over time, and a two-dimensional transform of plural two-dimensional localized regions of the first frequency-related representation, each less than an entire frequency range of the first frequency-related representation, is obtained to provide a two-dimensional compressed frequency-related representation with respect to each two-dimensional localized region. For each of the plural regions, at least one pitch is identified. The pitch from the plural regions is processed to provide multiple pitch estimates over time. In another embodiment, a mixed acoustic signal is processed by localizing multiple time-frequency regions of a spectrogram of the mixed acoustic signal to obtain one or more acoustic properties.
    Type: Application
    Filed: September 3, 2010
    Publication date: November 17, 2011
    Applicant: Massachusetts Institute of Technology
    Inventors: Tianyu Wang, Thomas R. Quatieri, JR.
  • Publication number: 20110264447
    Abstract: Implementations and applications are disclosed for detection of a transition in a voice activity state of an audio signal, based on a change in energy that is consistent in time across a range of frequencies of the signal.
    Type: Application
    Filed: April 22, 2011
    Publication date: October 27, 2011
    Applicant: QUALCOMM Incorporated
    Inventors: Erik Visser, Ian Ernan Liu, Jongwon Shin
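The transition test described above could be sketched as below. The per-band energy-change threshold in dB and the required fraction of agreeing bands are assumed values; the patent only states that the change must be consistent in time across a range of frequencies.

```python
import numpy as np

def transition_detected(prev_band_energies, band_energies,
                        delta_db=6.0, frac=0.8):
    """Flag a voice-activity transition when the per-band energy change
    has the same sign and sufficient size in most bands simultaneously.
    delta_db and frac are illustrative values, not from the patent."""
    prev = np.asarray(prev_band_energies, dtype=float)
    cur = np.asarray(band_energies, dtype=float)
    change = 10.0 * np.log10((cur + 1e-12) / (prev + 1e-12))
    rising = np.mean(change > delta_db)    # fraction of bands jumping up
    falling = np.mean(change < -delta_db)  # fraction of bands dropping
    return bool(rising >= frac or falling >= frac)
```

Requiring agreement across bands rejects narrowband disturbances (a single tone or band-limited noise burst) that would fool a single broadband energy detector.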
  • Publication number: 20110246189
    Abstract: An audio quality feedback system and method is provided. The system receives audio from a client via a communication device such as a microphone. The audio quality feedback system compares the received audio to one or more parameters regarding the quality of the audio. The parameters include, for example, clipping, periods of silence, and signal-to-noise ratios. Based on the comparison, feedback is generated to allow adjustment of the communication device or use of the communication device to improve the quality of the audio.
    Type: Application
    Filed: March 21, 2011
    Publication date: October 6, 2011
    Applicant: nVoq Incorporated
    Inventors: Peter Fox, Michael Clark, Jarek Foltynski
  • Publication number: 20110218801
    Abstract: The invention relates to a method for outputting a speech signal. Speech signal frames are received and are used in a predetermined sequence in order to produce a speech signal to be output. If one speech signal frame to be received is not received, then a substitute speech signal frame is used in its place, which is produced as a function of a previously received speech signal frame. According to the invention, in the situation in which the previously received speech signal frame has a voiceless speech signal, the substitute speech signal frame is produced by means of a noise signal.
    Type: Application
    Filed: September 28, 2009
    Publication date: September 8, 2011
    Applicant: ROBERT BOSCH GMBH
    Inventors: Peter Vary, Frank Mertz
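The substitution rule above could be sketched as follows. The energy-matched uniform noise, the simple repetition for voiced frames, and all names here are illustrative assumptions; the patent only specifies that a noise signal replaces a lost frame when the preceding frame was voiceless.

```python
import random

def substitute_frame(prev_frame, prev_was_voiceless, seed=None):
    """When a frame is lost: repeat the previous frame for voiced speech,
    but synthesize a noise frame (matched in energy here) when the previous
    frame was voiceless, avoiding the buzzy artifact of repeating noise."""
    rng = random.Random(seed)
    if not prev_was_voiceless:
        return list(prev_frame)            # simple repetition for voiced speech
    energy = sum(s * s for s in prev_frame) / max(len(prev_frame), 1)
    amp = (3.0 * energy) ** 0.5            # uniform noise on [-amp, amp]
                                           # has variance amp**2 / 3 = energy
    return [rng.uniform(-amp, amp) for _ in prev_frame]
```

Repeating a voiceless (noise-like) frame verbatim introduces an audible periodicity; fresh noise at the same power does not, which is the motivation the abstract implies.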
  • Publication number: 20110196675
    Abstract: A VAD/SS system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
    Type: Application
    Filed: March 17, 2011
    Publication date: August 11, 2011
    Applicant: AT&T CORPORATION
    Inventors: Bing Chen, James H. James
  • Publication number: 20110153317
    Abstract: An apparatus for wireless communications includes a processing system. The processing system is configured to receive an input sound stream of a user, split the input sound stream into a plurality of frames, classify each of the frames as one selected from the group consisting of a non-speech frame and a speech frame, determine a pitch of each of the frames in a subset of the speech frames, and identify a gender of the user from the determined pitch. To determine the pitch, the processing system is configured to filter the speech frames to compute an error signal, compute an autocorrelation of the error signal, find a maximum autocorrelation value, and set the pitch to an index of the maximum autocorrelation value.
    Type: Application
    Filed: December 23, 2009
    Publication date: June 23, 2011
    Applicant: QUALCOMM INCORPORATED
    Inventors: Yinian Mao, Gene Marsh
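The pitch and gender steps above could be sketched as below. The lag search range, the sampling rate, and the 160 Hz gender split are illustrative assumptions; the patent computes the error signal by LPC filtering, which is abstracted here to "any residual-like input".

```python
import numpy as np

def estimate_pitch_lag(error_signal, min_lag=20, max_lag=160):
    """Pitch lag = index of the maximum autocorrelation of the error
    signal, searched over a plausible lag range (range is illustrative)."""
    x = np.asarray(error_signal, dtype=float)
    corr = [np.dot(x[:-lag], x[lag:]) for lag in range(min_lag, max_lag)]
    return min_lag + int(np.argmax(corr))

def gender_from_pitch(lag, fs=8000, split_hz=160.0):
    """Toy classification: higher fundamental -> 'female'. The 160 Hz
    split is an assumed value, not taken from the patent."""
    f0 = fs / lag
    return "female" if f0 > split_hz else "male"
```

On a periodic impulse train (a crude stand-in for the LPC residual of voiced speech), the autocorrelation peaks at the period, so the argmax recovers the pitch lag directly.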
  • Publication number: 20110153318
    Abstract: There is provided a method or a device for extending a bandwidth of a first band speech signal to generate a second band speech signal wider than the first band speech signal and including the first band speech signal. The method comprises receiving a segment of the first band speech signal having a low cut off frequency and a high cut off frequency; determining the high cut off frequency of the segment; determining whether the segment is voiced or unvoiced; if the segment is voiced, applying a first bandwidth extension function to the segment to generate a first bandwidth extension in high frequencies; if the segment is unvoiced, applying a second bandwidth extension function to the segment to generate a second bandwidth extension in the high frequencies; using the first bandwidth extension and the second bandwidth extension to extend the first band speech signal beyond the high cut off frequency.
    Type: Application
    Filed: March 15, 2010
    Publication date: June 23, 2011
    Applicant: MINDSPEED TECHNOLOGIES, INC.
    Inventors: Norbert Rossello, Fabien Klein
  • Publication number: 20110106531
    Abstract: This invention relates to retrieval for multimedia content, and provides a program endpoint time detection apparatus for detecting an endpoint time of a program by performing processing on audio signals of said program, comprising an audio classification unit for classifying said audio signals into a speech signal portion and a non-speech signal portion; a keyword retrieval unit for retrieving, as a candidate endpoint keyword, an endpoint keyword indicating start or end of the program from said speech signal portion; a content analysis unit for performing content analysis on context of the candidate endpoint keyword retrieved by the keyword retrieval unit to determine whether the candidate endpoint keyword is a valid endpoint keyword; and a program endpoint time determination unit for performing statistics analysis based on the retrieval result of said keyword retrieval unit and the determination result of said content analysis unit, and determining the endpoint time of the program.
    Type: Application
    Filed: October 28, 2010
    Publication date: May 5, 2011
    Applicants: SONY CORPORATION, Institute of Acoustics, Chinese Academy of Sciences
    Inventors: Kun LIU, Weiguo Wu, Li Lu, Qingwei Zhao, Yonghong Yan, Hongbin Suo
  • Publication number: 20100268531
    Abstract: A DTX decision method includes: obtaining sub-band signal(s) according to an input signal; obtaining a variation of characteristic information of each of the sub-band signals; and performing DTX decision according to the variation of the characteristic information of each of the sub-band signals. With the invention, a complete and appropriate DTX decision result is obtained by making full use of the noise characteristic in the speech encoding/decoding bandwidth and using band-splitting and layered processing. As a result, the SID encoding/CNG decoding may closely follow the characteristic variation of the actual noise.
    Type: Application
    Filed: April 20, 2010
    Publication date: October 21, 2010
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Jinliang Dai, Eyal Shlomot, Deming Zhang
  • Patent number: 7809554
    Abstract: An apparatus, method, and medium for detecting a voiced sound and an unvoiced sound. The apparatus includes a blocking unit for dividing an input signal into block units; a parameter calculator for calculating a first parameter to determine the voiced sound and a second parameter to determine the unvoiced sound by using a slope and spectral flatness measure (SFM) of a mel-scaled filter bank spectrum of an input signal existing in a block; and a determiner for determining a voiced sound zone and an unvoiced sound zone in the block by comparing the first and second parameters to predetermined threshold values.
    Type: Grant
    Filed: February 7, 2005
    Date of Patent: October 5, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Kwangcheol Oh
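The spectral flatness measure (SFM) at the core of the decision above could be sketched as follows. The thresholds are illustrative, and this sketch omits the patent's slope parameter and mel-scaled filter bank, applying SFM to a plain power spectrum instead.

```python
import numpy as np

def spectral_flatness(spectrum):
    """SFM = geometric mean / arithmetic mean of the power spectrum.
    Near 1 for flat (noise-like, unvoiced) spectra, near 0 for peaky
    (harmonic, voiced) spectra."""
    p = np.asarray(spectrum, dtype=float) + 1e-12   # avoid log(0)
    return np.exp(np.mean(np.log(p))) / np.mean(p)

def classify_block(spectrum, voiced_thresh=0.2, unvoiced_thresh=0.6):
    """Illustrative thresholds; the patent compares two parameters
    (slope and SFM of a mel-scaled filter-bank spectrum) against
    predetermined threshold values."""
    sfm = spectral_flatness(spectrum)
    if sfm < voiced_thresh:
        return "voiced"
    if sfm > unvoiced_thresh:
        return "unvoiced"
    return "undetermined"
```

A perfectly flat spectrum gives SFM = 1 (unvoiced-like); a spectrum with all energy in one bin gives SFM near 0 (voiced-like).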
  • Publication number: 20100250246
    Abstract: A speech signal evaluation apparatus includes: an acquisition unit that acquires, as a first frame, a speech signal of a specified length from speech signals; a first detection unit that detects, on the basis of a speech condition, whether the first frame is voiced or unvoiced; a variation calculation unit that, when the first frame is unvoiced, calculates a variation in a spectrum associated with the first frame on the basis of a spectrum of the first frame and a spectrum of a second frame that is unvoiced and precedes the first frame in time; and a second detection unit that detects, on the basis of a non-stationary condition based on the variation in spectrum, whether the variation of the first frame satisfies the non-stationary condition.
    Type: Application
    Filed: March 24, 2010
    Publication date: September 30, 2010
    Applicant: FUJITSU LIMITED
    Inventor: Chikako MATSUMOTO
  • Publication number: 20100208918
    Abstract: A volume correction device includes: a variable gain means for controlling a gain, given to an input audio signal, according to a gain control signal; a consecutive relevant sounds interval detection means for detecting a consecutive relevant sounds interval, during which a group of temporally adjoining consecutive relevant sounds is present, in the input audio signal; a mean level detection means for detecting the mean level of the input audio signal attained during the consecutive relevant sounds interval, and whose time constant for mean level detection is set to a smaller value during the leading period of the consecutive relevant sounds interval than during the remaining period; a gain control signal production means for producing the gain control signal, so that the mean level will be equal to a reference level, and feeding the gain control signal to the variable gain means.
    Type: Application
    Filed: February 8, 2010
    Publication date: August 19, 2010
    Applicant: Sony Corporation
    Inventor: Masayoshi Noguchi
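The gain loop above could be sketched as follows. All constants (reference level, the two smoothing factors standing in for the two time constants, and the leading-period length) are assumed for illustration.

```python
def agc_gain(samples, ref_level=0.25, fast_alpha=0.5, slow_alpha=0.99,
             lead_len=100):
    """Minimal sketch of the volume-correction loop: track the mean level
    with a fast time constant during the leading period of a sound interval
    and a slow one afterwards, then set gain = reference / mean level."""
    mean = 1e-6
    gains = []
    for n, s in enumerate(samples):
        # Smaller time constant (faster tracking) during the leading period.
        alpha = fast_alpha if n < lead_len else slow_alpha
        mean = alpha * mean + (1.0 - alpha) * abs(s)
        gains.append(ref_level / max(mean, 1e-6))
    return gains
```

The fast time constant lets the gain settle quickly at the onset of a group of related sounds; the slow one afterwards prevents the gain from pumping on short-term level fluctuations.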
  • Publication number: 20100204985
    Abstract: A warping factor estimation system comprises label information generation unit that outputs voice/non-voice label information, warp model storage unit in which a probability model representing voice and non-voice occurrence probabilities is stored, and warp estimation unit that calculates a warping factor in the frequency axis direction using the probability model representing voice and non-voice occurrence probabilities, voice and non-voice labels, and a cepstrum.
    Type: Application
    Filed: September 22, 2008
    Publication date: August 12, 2010
    Inventor: Tadashi Emori
  • Publication number: 20100145688
    Abstract: An apparatus and a method to encode and decode a speech signal using an encoding mode are provided. An encoding apparatus may select an encoding mode of a frame included in an input speech signal, and encode a frame having an unvoiced mode for an unvoiced speech as the selected encoding mode.
    Type: Application
    Filed: December 4, 2009
    Publication date: June 10, 2010
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ho Sang Sung, Ki Hyun Choo, Jung Hoe Kim, Eun Mi Oh
  • Publication number: 20100145690
    Abstract: A sound signal generating method includes: generating, using a computer, a plurality of unit waveform signals by dividing the original sound signal having a periodic length of repeating similar waveforms by the length of the waveform; generating, using a computer, a repetitive waveform signal for each of the generated unit waveform signals by repeating the waveform of the unit waveform signal a given number of times; and generating, using a computer, an output sound signal by shifting each of the repetitive waveform signals in each length with a sequence in which the unit waveform signals form the original sound signal and then superimposing them on one another.
    Type: Application
    Filed: February 10, 2010
    Publication date: June 10, 2010
    Applicant: FUJITSU LIMITED
    Inventor: Kazuhiro Watanabe
  • Publication number: 20100125452
    Abstract: A method of refining a pitch period estimation of a signal, the method comprising: for each of a plurality of portions of the signal, scanning over a predefined range of time offsets to find an estimate of the pitch period of the portion within the predefined range of time offsets; identifying the average pitch period of the estimated pitch periods of the portions; determining a refined range of time offsets in dependence on the average pitch period, the refined range of time offsets being narrower than the predefined range of time offsets; and for a subsequent portion of the signal, scanning over the refined range of time offsets to find an estimate of the pitch period of the subsequent portion.
    Type: Application
    Filed: November 19, 2008
    Publication date: May 20, 2010
    Applicant: Cambridge Silicon Radio Limited
    Inventor: Xuejing Sun
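The range-narrowing step above could be sketched as below. The full lag range and the half-window width are assumed values; the patent only requires that the refined range be centered on the average of earlier estimates and narrower than the full range.

```python
def refine_pitch_search(estimates, full_range=(20, 160), width=10):
    """After estimating the pitch period over the full lag range for
    several portions of the signal, narrow the search window around the
    running average for subsequent portions (width is illustrative)."""
    avg = sum(estimates) / len(estimates)
    lo = max(full_range[0], int(avg) - width)
    hi = min(full_range[1], int(avg) + width)
    return lo, hi
```

Scanning only the refined window for later portions cuts the cost of each pitch search and reduces octave errors, since lags far from the established pitch are never considered.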
  • Publication number: 20100108065
    Abstract: A breathing mask adapted to be placed over a wearer's face, comprises a mask body including a gas inlet port to be disposed in flow communication with the wearer's breathing passage for flow of a gas in a predetermined flow stream there through upon inhalation by the wearer; a communications microphone (30) mounted to said mask body to capture the voice of the wearer, said communications microphone generating sound signals; an attenuation device (34) for attenuating said sound signals; a sound monitor (36) for monitoring the intensity of sound near the communications microphone in a predetermined frequency range, connected to a controller device (38) for activating the attenuation device when the sound intensity monitored by the sound monitor is in a predetermined level range.
    Type: Application
    Filed: January 4, 2007
    Publication date: May 6, 2010
    Inventors: Paul Zimmerman, Przemyslaw Gostkiewicz, Leopoldine Bachelard
  • Publication number: 20100094621
    Abstract: A method and system for assessing script running time are disclosed. According to one embodiment, a computer implemented method comprises receiving a first input from a client, wherein the first input comprises a script, the script comprising spoken audio and video elements. A second input is received from the client, wherein the second input comprises a value of words per minute. A third input is received from the client, wherein the third input comprises a non-verbal duration. A duration of the spoken audio of the script is calculated by multiplying the value of words per minute and the spoken content of the script. A duration of the script is calculated by summing the duration of the spoken audio and the non-verbal duration and the duration of the script is returned to the client.
    Type: Application
    Filed: September 17, 2009
    Publication date: April 15, 2010
    Inventors: Seth Kenvin, Neal Clark, Jeremy Gailor, Michael Dungan
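The arithmetic described above amounts to the following one-liner. Note the abstract says the spoken duration is obtained by "multiplying" the words-per-minute value and the spoken content; the natural reading, assumed here, is dividing the word count by the rate.

```python
def script_duration_seconds(word_count, words_per_minute, non_verbal_seconds):
    """Total running time = spoken duration (word count at the client's
    words-per-minute rate) plus the client-supplied non-verbal duration."""
    spoken_seconds = 60.0 * word_count / words_per_minute
    return spoken_seconds + non_verbal_seconds
```

For example, a 300-word script read at 150 words per minute with 30 seconds of non-verbal material runs 150 seconds.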
  • Publication number: 20100017202
    Abstract: Provided is a method and apparatus for determining a signal coding mode. The signal coding mode may be determined or changed according to whether a current frame corresponds to a silence period and by using a history of speech or music presence possibilities.
    Type: Application
    Filed: July 9, 2009
    Publication date: January 21, 2010
    Applicant: SAMSUNG ELECTRONICS Co., LTD
    Inventors: Ho-sang Sung, Jie Zhan, Ki-hyun Choo
  • Publication number: 20090292533
    Abstract: Streaming voice signals, such as might be received at a contact center or similar operation, are analyzed to detect the occurrence of one or more unprompted, predetermined utterances. The predetermined utterances preferably constitute a vocabulary of words and/or phrases having particular meaning within the context in which they are uttered. Detection of one or more of the predetermined utterances during a call causes a determination of response-determinative significance of the detected utterance(s). Based on the response-determinative significance of the detected utterance(s), a responsive action may be further determined. Additionally, long term storage of the call corresponding to the detected utterance may also be initiated. Conversely, calls in which no predetermined utterances are detected may be deleted from short term storage.
    Type: Application
    Filed: May 22, 2009
    Publication date: November 26, 2009
    Applicant: Accenture Global Services GmbH
    Inventors: Thomas J. Ryan, Biji K. Janan
  • Publication number: 20090271198
    Abstract: Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a first frame of the signal, the first frame comprising a voiced frame. One or more cords can be extracted from the voiced frame based on occurrence of one or more events within the frame. For example, the one or more events can comprise one or more glottal pulses. The one or more cords can collectively comprise less than all of the frame. For example, each of the cords can begin with onset of a glottal pulse and extend to a point prior to an onset of neighboring glottal pulse but may exclude a portion of the frame prior to the onset of the neighboring glottal pulse. A phoneme for the voiced frame can be determined based on at least one of the extracted cords.
    Type: Application
    Filed: October 23, 2008
    Publication date: October 29, 2009
    Applicant: Red Shift Company, LLC
    Inventors: Joel K. Nyquist, Erik N. Reckase, Matthew D. Robinson, John F. Remillard
  • Publication number: 20090271197
    Abstract: Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a region of the signal representing speech. The region can comprise a portion of a frame of the signal representing speech classified as a voiced frame. The region can be marked based on one or more pitch estimates for the region. A cord can be identified within the region based on occurrence of one or more events within the region of the signal. For example, the one or more events can comprise one or more glottal pulses. In such cases, cord can begin with onset of a first glottal pulse and extend to a point prior to an onset of a second glottal pulse. The cord may exclude a portion of the region of the signal prior to the onset of the second glottal pulse.
    Type: Application
    Filed: October 23, 2008
    Publication date: October 29, 2009
    Applicant: Red Shift Company, LLC
    Inventors: Joel K. Nyquist, Erik N. Reckase, Matthew D. Robinson, John F. Remillard
  • Publication number: 20090259460
    Abstract: A client for silence-based adaptive real-time voice and video (SAVV) transmission methods and systems, detects the activity of a voice stream of conversational speech and aggressively transmits the corresponding video frames if silence in the sending or receiving voice stream has been detected, and adaptively generates and transmits key frames of the video stream according to characteristics of the conversational speech. In one aspect, a coordination management module generates video frames, segmentation and transmission strategies according to feedback from a voice encoder of the SAVV client and the user's instructions. In another aspect, the coordination management module generates video frames, segmentation and transmission strategies according to feedback from a voice decoder of the SAVV client and the user's instructions. In one example, the coordination management module adaptively generates a key video frame when silence is detected in the receiving voice stream.
    Type: Application
    Filed: April 10, 2008
    Publication date: October 15, 2009
    Applicant: CITY UNIVERSITY OF HONG KONG
    Inventors: Weijia JIA, Lizhuo ZHANG, Huan LI, Wenyan LU
  • Publication number: 20090182556
    Abstract: Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, a method of processing a signal representing speech can comprise receiving a frame of the signal representing speech, classifying the frame as a voiced frame, and parsing the voiced frame into one or more regions based on occurrence of one or more events within the voiced frame. For example, the one or more events can comprise one or more glottal pulses. The one or more regions may collectively represent less than all of the voiced frame.
    Type: Application
    Filed: October 23, 2008
    Publication date: July 16, 2009
    Applicant: Red Shift Company, LLC
    Inventors: Erik N. Reckase, John F. Remillard
  • Publication number: 20080243494
    Abstract: A speech receiving unit receives a user ID, a speech obtained at a terminal, and an utterance duration, from the terminal. A proximity determining unit calculates a correlation value expressing a correlation between speeches received from plural terminals, compares the correlation value with a first threshold value, and determines that the plural terminals that receive the speeches whose correlation value is calculated are close to each other, when the correlation value is larger than the first threshold value. A dialog detecting unit determines whether a relationship between the utterance durations received from the plural terminals that are determined to be close to each other within an arbitrarily target period during which a dialog is to be detected fits a rule. When the relationship is determined to fit the rule, the dialog detecting unit detects dialog information containing the target period and the user ID.
    Type: Application
    Filed: March 11, 2008
    Publication date: October 2, 2008
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masayuki Okamoto, Naoki Iketani, Hideo Umeki, Sogo Tsuboi, Kenta Cho, Keisuke Nishimura, Masanori Hattori
  • Publication number: 20080243495
    Abstract: Packetized CELP-encoded speech playout with frame truncation during silence and frame expansion method dependent upon voicing classification, with voiced frame expansion maintaining phase alignment.
    Type: Application
    Filed: June 10, 2008
    Publication date: October 2, 2008
    Inventors: Krishnasamy Anandakumar, Alan V. McCree, Erdal Paksoy
  • Publication number: 20080109217
    Abstract: An apparatus for providing control of voicing in processed speech includes a spectra approximation element and a comparing element. The spectra approximation element may be configured to compute a voiced contribution and an unvoiced contribution for each of a reference speech sample and a processed speech sample. The comparing element may be configured to compare indications of voiced and unvoiced contributions of the reference speech sample and indications of voiced and unvoiced contributions of the processed speech sample, and to determine whether to correct at least one of the voiced or unvoiced contributions of the processed speech sample based on the comparison.
    Type: Application
    Filed: November 8, 2006
    Publication date: May 8, 2008
    Inventor: Jani K. Nurminen