Pitch Determination Of Speech Signals (epo) Patents (Class 704/E11.006)
-
Publication number: 20110054886
Abstract: An effect device may be configured such that when an input audio signal switches from a consonant to a vowel and an input level of the switched vowel is greater than a threshold value Lc (and a variable t is greater than time Ts), an audio effect signal A may be generated. Such an effect device may allow for increasing the occurrences when portamento is simulated, while still sounding natural. In general, a detecting module detects whether an audio signal is a vowel sound or a consonant sound and whether the audio signal changed from a consonant sound to a vowel sound; and a pitch change module changes a pitch of the audio signal and changes, based on a prescribed function, an amount the pitch is changed to produce a modified audio signal, when the audio signal changed from a consonant sound to a vowel sound.
Type: Application
Filed: August 30, 2010
Publication date: March 3, 2011
Inventor: Takahiro Ae
-
Publication number: 20110004467
Abstract: Systems, methods, and computer program products are provided for producing audio and/or visual effects according to a correlation between reference data and estimated note data derived from an input acoustic audio waveform. Some embodiments calculate a pitch score as a function of a pitch estimate derived from the input waveform, a reference pitch, and a real-time-adjustable pitch gating window. Other embodiments calculate the pitch score as a function of pitch and timing estimates derived from the input waveform, reference pitch and note timing data, an adjustable rhythm gating window, and an adjustable pitch gating window. The audio and/or visual effects are produced according to the pitch score, and may be used to generate outputs (e.g., in real time) for affecting a live performance, an audio mix, a video gaming environment, an educational feedback environment, etc.
Type: Application
Filed: June 30, 2010
Publication date: January 6, 2011
Applicant: MuseAmi, Inc.
Inventors: Robert D. Taub, J. Alexander Cabanilla, Jonathan Sheldrick, George Tourtellot
-
Publication number: 20100305944
Abstract: A method of estimating a pitch period of a first portion of a signal wherein the first portion overlaps a previous portion. The method comprises computing a first autocorrelation value for part of the first portion not overlapping the previous portion. The method further comprises retrieving a stored second autocorrelation value for part of the first portion overlapping the previous portion, the second autocorrelation value having been computed during estimation of a pitch period of the previous portion. The method further comprises forming a combined autocorrelation value using the first and second autocorrelation values, and selecting the estimated pitch period in dependence on the combined autocorrelation value.
Type: Application
Filed: May 28, 2009
Publication date: December 2, 2010
Applicant: Cambridge Silicon Radio Limited
Inventor: Xuejing Sun
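
The reuse of autocorrelation values across overlapping portions can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes that each portion (frame) is made of equal-sized blocks, that consecutive frames advance by one block, and that partial autocorrelation sums are cached per block. The names `block_autocorr` and `estimate_pitch_periods`, and all parameters, are hypothetical.

```python
import math


def block_autocorr(x, start, length, lags):
    """Partial autocorrelation sums over x[start:start+length], one per lag."""
    return {
        k: sum(x[n] * x[n - k] for n in range(start, start + length) if n - k >= 0)
        for k in lags
    }


def estimate_pitch_periods(x, block, blocks_per_frame, min_lag, max_lag):
    """Estimate one pitch period (in samples) per overlapping frame.

    Assumption: frames advance by one block, so all but the newest block of
    each frame overlap the previous frame and their sums are read from a cache.
    """
    lags = range(min_lag, max_lag + 1)
    cache = {}  # block index -> partial sums computed for an earlier frame
    periods = []
    for last in range(blocks_per_frame - 1, len(x) // block):
        first = last - blocks_per_frame + 1
        combined = dict.fromkeys(lags, 0.0)
        for b in range(first, last + 1):
            if b not in cache:  # only the non-overlapping part is computed
                cache[b] = block_autocorr(x, b * block, block, lags)
            for k in lags:      # combine stored and newly computed values
                combined[k] += cache[b][k]
        cache.pop(first, None)  # the oldest block is not part of the next frame
        periods.append(max(lags, key=lambda k: combined[k]))
    return periods


# Usage: a sine wave with a period of 40 samples.
signal = [math.sin(2 * math.pi * n / 40) for n in range(800)]
periods = estimate_pitch_periods(signal, block=80, blocks_per_frame=4,
                                 min_lag=20, max_lag=60)
```

Caching per block keeps the per-frame cost proportional to the hop size rather than the frame length, which is the saving the abstract describes.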
-
Publication number: 20100241424
Abstract: There is provided a speech encoder for performing an algorithm that comprises obtaining (205) a plurality of open-loop pitch candidates from a current frame of a speech signal, the plurality of open-loop pitch candidates including a first open-loop pitch candidate and a second open-loop pitch candidate; obtaining (205) voicing information from one or more previous frames; and selecting (280) one of the plurality of open-loop pitch candidates as a final pitch of the current frame using the voicing information from the one or more previous frames. In one aspect, the voicing information from the one or more previous frames includes a previous pitch of the one or more previous frames. In a further aspect, selecting the final pitch of the current frame includes selecting (210) an initial open-loop pitch that has the maximum long-term correlation value.
Type: Application
Filed: October 27, 2006
Publication date: September 23, 2010
Applicant: MINDSPEED TECHNOLOGIES, INC.
Inventor: Yang Gao
-
Publication number: 20100241423
Abstract: Embodiments of a system and method for encoding audio data have been described. In one embodiment, the method includes transforming frequency domain data in a plurality of signal windows of an audio dataset from a cosine/sine format to a magnitude/cosine/sine format. The magnitude/cosine/sine format disproportionately represents a magnitude of the frequency domain data over a phase of the frequency domain data. The above transformation may be a pre-processing stage of vector quantization usable to produce a codebook.
Type: Application
Filed: March 18, 2009
Publication date: September 23, 2010
Inventors: Stanley Wayne Jackson, Jay T. Dresser
-
Publication number: 20100235166
Abstract: A method of audio processing comprises composing one or more transformation profiles for transforming audio characteristics of an audio recording and then generating for the or each transformation profile, a metadata set comprising transformation profile data and location data indicative of where in the recording the transformation profile data is to be applied; the or each metadata set is then stored in association with the corresponding recording. A corresponding method of audio reproduction comprises reading a recording and a meta-data set associated with that recording from storage, applying transformations to the recording data in accordance with the metadata set transformation profile; and then outputting the transformed recording.
Type: Application
Filed: October 17, 2007
Publication date: September 16, 2010
Applicant: SONY COMPUTER ENTERTAINMENT EUROPE LIMITED
Inventors: Daniele Giuseppe Bardino, Richard James Griffiths
-
Publication number: 20100211384
Abstract: A pitch detection method and apparatus are disclosed. The method includes: performing pitch detection on an input signal in a signal domain, and obtaining a candidate pitch; performing linear prediction (LP) on the input signal, and obtaining an LP residual signal; setting a candidate pitch range that includes the candidate pitch; searching the candidate pitch range for the LP residual signal, and obtaining a selected pitch.
Type: Application
Filed: April 9, 2010
Publication date: August 19, 2010
Inventors: Fengyan Qi, Dejun Zhang, Lei Miao, Jianfeng Xu, Herve Marcel Taddei, Qing Zhang, Yang Gao
-
Publication number: 20100191524
Abstract: A non-speech section detecting device generating a plurality of frames having a given time length on the basis of sound data obtained by sampling sound, and detecting a non-speech section having a frame not containing voice data based on speech uttered by a person, the device including: a calculating part calculating a bias of a spectrum obtained by converting sound data of each frame into components on a frequency axis; a judging part judging whether the bias is greater than or equal to a given threshold or alternatively smaller than or equal to a given threshold; a counting part counting the number of consecutive frames judged as having a bias greater than or equal to the threshold or alternatively smaller than or equal to the threshold; a count judging part judging whether the obtained number of consecutive frames is greater than or equal to a given value.
Type: Application
Filed: April 5, 2010
Publication date: July 29, 2010
Applicant: FUJITSU LIMITED
Inventors: Nobuyuki Washio, Shoji Hayakawa
-
Publication number: 20100185440
Abstract: The embodiments of a transcoding method, a transcoding device, and a communication apparatus are provided. The embodiment of a method includes: receiving a bit stream input from a sending end; determining an attribute of discontinuous transmission (DTX) used by a receiving end and a frame type of the input bit stream; and transcoding the input bit stream in a corresponding processing manner according to a determination result. Thereby, a corresponding transcoding operation is performed on the input bit stream according to the attribute of DTX used by the receiving end and the frame type of the input bit stream. In such a manner, input bit streams of various types can be processed, and the input bit streams can be correspondingly transcoded according to the requirements of the receiving end. Therefore, the average computational complexity and peak computational complexity can be effectively decreased without decreasing the quality of the synthesized speech.
Type: Application
Filed: January 21, 2010
Publication date: July 22, 2010
Inventors: Changchun Bao, Hao Xu, Fanrong Tang, Xiangyu Hu
-
Publication number: 20100185441
Abstract: A method of updating a state of a decoder that decodes successive portions of a data stream representing an encoded voice signal in dependence on its state, the method comprising: at the decoder, decoding portions of the data stream to form decoded portions; storing the decoded portions; storing respective decoder states held by the decoder after forming each decoded portion; identifying that a portion of the data stream is degraded; estimating a pitch period of a stored decoded portion formed by decoding a portion of the data stream that precedes the degraded portion of the data stream; selecting a stored decoder state held by the decoder after decoding a portion of the data stream that precedes the degraded portion by a multiple of the estimated pitch period; and updating the state of the decoder with the selected decoder state.
Type: Application
Filed: January 23, 2009
Publication date: July 22, 2010
Applicant: CAMBRIDGE SILICON RADIO LIMITED
Inventors: Xuejing Sun, Kuan-Chieh Yen
-
Publication number: 20100174534
Abstract: A method of encoding speech, the method comprising: receiving a signal representative of speech to be encoded; at each of a plurality of intervals during the encoding, determining a pitch lag between portions of the signal having a degree of repetition; selecting for a set of said intervals a pitch lag vector from a pitch lag codebook of such vectors, each pitch lag vector comprising a set of offsets corresponding to the offset between the pitch lag determined for each said interval and an average pitch lag for said set of intervals, and transmitting an indication of the selected vector and said average over a transmission medium as part of the encoded signal representative of said speech.
Type: Application
Filed: June 5, 2009
Publication date: July 8, 2010
Inventor: Koen Bernard Vos
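
The idea of sending an average pitch lag plus a codebook index for the per-interval offsets can be sketched as follows. This is an illustration only: the four-entry codebook, the squared-error selection rule, and the function names `encode_pitch_lags` and `decode_pitch_lags` are assumptions made for the example and are not taken from the application.

```python
# Hypothetical codebook of offset vectors, one offset per interval.
CODEBOOK = [
    (0, 0, 0, 0),
    (-1, 0, 0, 1),
    (-2, -1, 1, 2),
    (1, 0, 0, -1),
]


def encode_pitch_lags(lags, codebook=CODEBOOK):
    """Return (average lag, codebook index) for a set of per-interval lags."""
    average = round(sum(lags) / len(lags))
    offsets = [lag - average for lag in lags]

    def distance(vector):
        # Squared error between the measured offsets and a codebook vector
        # (an assumed selection criterion).
        return sum((o - v) ** 2 for o, v in zip(offsets, vector))

    index = min(range(len(codebook)), key=lambda i: distance(codebook[i]))
    return average, index


def decode_pitch_lags(average, index, codebook=CODEBOOK):
    """Rebuild the per-interval lags from the transmitted average and index."""
    return [average + offset for offset in codebook[index]]


# Usage: a lag contour that rises steadily across four intervals.
average, index = encode_pitch_lags([80, 81, 83, 84])
decoded = decode_pitch_lags(average, index)
```

Only the pair `(average, index)` would need to be transmitted in this sketch; the decoder holds the same codebook.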
-
Publication number: 20100174536
Abstract: A pitch search method and device for digitally encoding a wideband signal, in particular but not exclusively a speech signal, in view of transmitting, or storing, and synthesizing this wideband sound signal. The new method and device, which achieve efficient modeling of the harmonic structure of the speech spectrum, use several forms of low-pass filters applied to a pitch codevector; the one yielding the higher prediction gain (i.e., the lowest pitch prediction error) is selected, and the associated pitch codebook parameters are forwarded.
Type: Application
Filed: November 17, 2009
Publication date: July 8, 2010
Inventors: Bruno Bessette, Redwan Salami, Roch Lefebvre
-
Publication number: 20100169084
Abstract: The present invention relates to a method and apparatus for pitch search. One method includes: obtaining a characteristic function value of a residual signal, where the residual signal is a result of removing a Long-Term Prediction (LTP) contribution signal from input speech signals; and obtaining a pitch according to the characteristic function value of the residual signal.
Type: Application
Filed: December 23, 2009
Publication date: July 1, 2010
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Dejun Zhang, Jianfeng Xu, Lei Miao, Fengyan Qi, Qing Zhang, Herve Marcel Taddei, Lixiong Li, Fuwei Ma, Yang Gao
-
Publication number: 20100161323
Abstract: Provided is an audio encoding device capable of preventing audio quality degradation of a decoded signal. In the audio encoding device, a noise characteristic analysis unit (118) analyzes a noise characteristic of a higher range of an input spectrum. A filter coefficient decision unit (119) decides a filter coefficient in accordance with the noise characteristic information from the noise characteristic analysis unit (118). A filtering unit (113) includes a multi-tap pitch filter for filtering a first-layer decoded spectrum according to a filter state set by a filter state setting unit (112), a pitch coefficient outputted from a pitch coefficient setting unit (115), and a filter coefficient outputted from the filter coefficient decision unit (119), and calculates an estimated spectrum of the input spectrum. An optimal pitch coefficient can be decided by the process of a closed loop formed by the filtering unit (113), a search unit (114), and the pitch coefficient setting unit (115).
Type: Application
Filed: April 26, 2007
Publication date: June 24, 2010
Applicant: PANASONIC CORPORATION
Inventor: Masahiro Oshikiri
-
Publication number: 20100145688
Abstract: An apparatus and a method to encode and decode a speech signal using an encoding mode are provided. An encoding apparatus may select an encoding mode of a frame included in an input speech signal, and encode a frame having an unvoiced mode for an unvoiced speech as the selected encoding mode.
Type: Application
Filed: December 4, 2009
Publication date: June 10, 2010
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Ho Sang Sung, Ki Hyun Choo, Jung Hoe Kim, Eun Mi Oh
-
Publication number: 20100125452
Abstract: A method of refining a pitch period estimation of a signal, the method comprising: for each of a plurality of portions of the signal, scanning over a predefined range of time offsets to find an estimate of the pitch period of the portion within the predefined range of time offsets; identifying the average pitch period of the estimated pitch periods of the portions; determining a refined range of time offsets in dependence on the average pitch period, the refined range of time offsets being narrower than the predefined range of time offsets; and for a subsequent portion of the signal, scanning over the refined range of time offsets to find an estimate of the pitch period of the subsequent portion.
Type: Application
Filed: November 19, 2008
Publication date: May 20, 2010
Applicant: Cambridge Silicon Radio Limited
Inventor: Xuejing Sun
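
The narrowing of the search range around an average pitch period can be sketched in a few lines of Python. The sketch assumes a plain (unnormalised) autocorrelation peak as the per-portion estimator, a fixed number of warm-up portions, and a symmetric margin around the average; `best_lag`, `track_pitch`, `warmup`, and `margin` are hypothetical names, not terms from the application.

```python
import math


def best_lag(frame, lo, hi):
    """Lag in [lo, hi] with the largest autocorrelation sum (assumed estimator)."""
    def score(k):
        return sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
    return max(range(lo, hi + 1), key=score)


def track_pitch(frames, lo, hi, warmup=3, margin=5):
    """Scan the full range for `warmup` frames, then a narrower refined range."""
    estimates, ranges = [], []
    for i, frame in enumerate(frames):
        if i < warmup:
            search = (lo, hi)  # predefined range of time offsets
        else:
            average = round(sum(estimates[:warmup]) / warmup)
            # Refined range: centred on the average and clipped to the
            # predefined range, so it is never wider than the original.
            search = (max(lo, average - margin), min(hi, average + margin))
        ranges.append(search)
        estimates.append(best_lag(frame, *search))
    return estimates, ranges


# Usage: five frames of a sine wave with a period of 50 samples.
frame = [math.sin(2 * math.pi * n / 50) for n in range(400)]
estimates, ranges = track_pitch([frame] * 5, lo=20, hi=120)
```

In this sketch the refined range covers 11 candidate lags instead of 101, so later portions need roughly a tenth of the correlation work.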
-
Publication number: 20100106489
Abstract: Method and processing system for establishing the impact of time response distortion of an input signal which is applied to an audio transmission system (10) having an input and an output. A processor (11) is connected to the audio transmission system (10) for receiving the input signal (X(t)) and the output signal (Y(t)), and the processor (11) is arranged for outputting a time response degradation impact quality score. The processor (11) executes preprocessing of the input signal (X(t)) and output signal (Y(t)) to obtain pitch power densities (PPX(f)n, PPY(f)n) comprising pitch power density values for cells in the frequency (f) and time (n) domain, calculating a pitch power ratio function (PPR(f)n) of the pitch power densities for each cell, and determining a time response distortion quality score (MOSTD) indicative of the transmission quality of the system (10) from the pitch power ratio function (PPR(f)n).
Type: Application
Filed: March 28, 2008
Publication date: April 29, 2010
Applicant: KONINKLIJKE KPN N.V.
Inventors: John Gerard Bereends, Jeroen Martijn Van Vugt, Menno Bangma, Omar Aziz Niamut, Bartosz Busz
-
Publication number: 20100094620
Abstract: First encoded voice bits are transcoded into second encoded voice bits by dividing the first encoded voice bits into one or more received frames, with each received frame containing multiple ones of the first encoded voice bits. First parameter bits for at least one of the received frames are generated by applying error control decoding to one or more of the encoded voice bits contained in the received frame, speech parameters are computed from the first parameter bits, and the speech parameters are quantized to produce second parameter bits. Finally, a transmission frame is formed by applying error control encoding to one or more of the second parameter bits, and the transmission frame is included in the second encoded voice bits.
Type: Application
Filed: December 14, 2009
Publication date: April 15, 2010
Applicant: DIGITAL VOICE SYSTEMS, INC.
Inventor: John C. Hardwick
-
Publication number: 20100088089
Abstract: Synthesizing a set of digital speech samples corresponding to a selected voicing state includes dividing speech model parameters into frames, with a frame of speech model parameters including pitch information, voicing information determining the voicing state in one or more frequency regions, and spectral information. First and second digital filters are computed using, respectively, first and second frames of speech model parameters, with the frequency responses of the digital filters corresponding to the spectral information in frequency regions for which the voicing state equals the selected voicing state. A set of pulse locations are determined, and sets of first and second signal samples are produced using the pulse locations and, respectively, the first and second digital filters. Finally, the sets of first and second signal samples are combined to produce a set of digital speech samples corresponding to the selected voicing state.
Type: Application
Filed: August 21, 2009
Publication date: April 8, 2010
Applicant: DIGITAL VOICE SYSTEMS, INC.
Inventor: John C. Hardwick
-
Publication number: 20100070270
Abstract: In one embodiment, a method of receiving a decoded audio signal that has a transmitted pitch lag is disclosed. The method includes estimating pitch correlations of possible short pitch lags that are smaller than a minimum pitch limitation and have an approximated multiple relationship with the transmitted pitch lag, checking if one of the pitch correlations of the possible short pitch lags is large enough compared to a pitch correlation estimated with the transmitted pitch lag, and selecting a short pitch lag as a corrected pitch lag if a corresponding pitch correlation is large enough. The postprocessing is performed using the corrected pitch lag. In another embodiment, when the existence of irregular harmonics or wrong pitch lag is detected, a code-excited linear prediction (CELP) postfilter is made more aggressive.
Type: Application
Filed: September 15, 2009
Publication date: March 18, 2010
Applicant: GH INNOVATION, INC.
Inventor: Yang Gao
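
The short-pitch-lag check in the first embodiment can be sketched as below. This is a simplified illustration under stated assumptions: candidates are taken as the transmitted lag divided by small integers, the comparison uses a normalised correlation, and "large enough" is modelled by a fixed ratio. The names `norm_corr` and `correct_pitch_lag` and the parameters `ratio` and `max_divisor` are hypothetical.

```python
import math


def norm_corr(x, lag):
    """Normalised correlation between x[n] and x[n - lag]."""
    idx = range(lag, len(x))
    num = sum(x[n] * x[n - lag] for n in idx)
    den = math.sqrt(sum(x[n] ** 2 for n in idx) *
                    sum(x[n - lag] ** 2 for n in idx))
    return num / den if den > 0 else 0.0


def correct_pitch_lag(x, transmitted_lag, min_lag, ratio=0.9, max_divisor=4):
    """Return a shorter lag if its correlation is close enough to the reference.

    Assumptions: `min_lag` stands for the codec's minimum pitch limitation,
    and `ratio` is an arbitrary threshold chosen for this sketch.
    """
    reference = norm_corr(x, transmitted_lag)
    best_lag, best_corr = transmitted_lag, None
    for divisor in range(2, max_divisor + 1):
        candidate = round(transmitted_lag / divisor)  # approximate sub-multiple
        if candidate < 2 or candidate >= min_lag:
            continue  # only lags below the minimum limitation are considered
        corr = norm_corr(x, candidate)
        if corr >= ratio * reference and (best_corr is None or corr > best_corr):
            best_lag, best_corr = candidate, corr
    return best_lag


# Usage: the true period is 20 samples, but the codec transmitted 40.
short_period = [math.sin(2 * math.pi * n / 20) for n in range(400)]
corrected = correct_pitch_lag(short_period, transmitted_lag=40, min_lag=34)

# A signal whose true period really is 40 keeps the transmitted lag.
long_period = [math.sin(2 * math.pi * n / 40) for n in range(400)]
unchanged = correct_pitch_lag(long_period, transmitted_lag=40, min_lag=34)
```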
-
Publication number: 20100070269
Abstract: In an embodiment, a method of transmitting an input audio signal is disclosed. A first coding error of the input audio signal with a scalable codec having a first enhancement layer is encoded, and a second coding error is encoded using a second enhancement layer after the first enhancement layer. Encoding the second coding error includes coding fine spectrum coefficients of the second coding error to produce coded fine spectrum coefficients, and coding a spectral envelope of the second coding error to produce a coded spectral envelope. The coded fine spectrum coefficients and the coded spectral envelope are transmitted.
Type: Application
Filed: September 15, 2009
Publication date: March 18, 2010
Applicant: Huawei Technologies Co., Ltd.
Inventor: Yang Gao
-
Publication number: 20100063804
Abstract: Provided is an adaptive sound source vector quantization device which can always perform a pitch cycle search with a resolution appropriate for any section of the pitch cycle search range of a second sub-frame when a pitch cycle search range of the second sub-frame changes in accordance with a pitch cycle of a first sub-frame. The device includes a first pitch cycle instruction unit (111), a search range calculation unit (112), and a second pitch cycle instruction unit (113). The first pitch cycle instruction unit (111) successively instructs pitch cycle search candidates in a predetermined search range having a search resolution which transits over a predetermined pitch cycle candidate for the first sub-frame. The search range calculation unit (112) calculates a predetermined range before and after the pitch cycle of the first sub-frame as the pitch cycle search range for the second sub-frame, if the predetermined range includes the predetermined pitch cycle search candidate.
Type: Application
Filed: February 29, 2008
Publication date: March 11, 2010
Applicant: PANASONIC CORPORATION
Inventors: Kaoru Sato, Toshiyuki Morii
-
Publication number: 20100063806
Abstract: Low bit rate audio coding, such as a BWE algorithm, often encounters the conflicting goals of achieving high time resolution and high frequency resolution at the same time. To achieve the best possible quality, the input signal can first be classified as a fast signal or a slow signal. This invention focuses on classifying a signal as fast or slow based on at least one of the following parameters, or a combination of them: spectral sharpness, temporal sharpness, pitch correlation (pitch gain), and/or spectral envelope variation. This classification information can help to choose different BWE algorithms, different coding algorithms, and different postprocessing algorithms for fast and slow signals, respectively.
Type: Application
Filed: September 4, 2009
Publication date: March 11, 2010
Inventor: Yang Gao
-
Publication number: 20100063805
Abstract: A decoder arrangement comprising a receiver input for parameters of frame-based coded signals and a decoder arranged to provide frames of decoded audio signals based on the parameters. The receiver input and/or the decoder is arranged to establish a time difference between the occasion when parameters of a first frame are available at the receiver input and the occasion when a decoded audio signal of the first frame is available at an output of the decoder, which time difference corresponds to at least one frame. A postfilter is connected to the output of the decoder and to the receiver input. The postfilter is arranged to provide a filtering of the frames of decoded audio signals into an output signal in response to parameters of a respective subsequent frame.
Type: Application
Filed: December 14, 2007
Publication date: March 11, 2010
Inventor: Stefan Bruhn
-
Publication number: 20100049506
Abstract: A method, device, and system for concealing lost packets are provided. The provided method, device, and system recover the lost frame according to the data before and after the lost frame and enhance the correlation between the recovered lost-frame data and the data after the lost frame. A method and device for estimating a pitch period are also provided; they select the final estimated pitch period from among the initial pitch period and the pitch periods corresponding to frequencies that are one or more times higher than the frequency corresponding to the initial pitch period, which may improve the handling of frequency multiplication when estimating the pitch period. In addition, by fine-tuning the pitch period through waveform matching, the error in the estimated pitch period may be reduced and the quality of the audio data may be improved.
Type: Application
Filed: November 2, 2009
Publication date: February 25, 2010
Inventors: Wuzhou Zhan, Dongqi Wang
-
Publication number: 20100049505
Abstract: A method, device, and system for concealing lost packets are provided. The provided method, device, and system recover the lost frame according to the data before and after the lost frame and enhance the correlation between the recovered lost-frame data and the data after the lost frame. A method and device for estimating a pitch period are also provided; they select the final estimated pitch period from among the initial pitch period and the pitch periods corresponding to frequencies that are one or more times higher than the frequency corresponding to the initial pitch period, which may improve the handling of frequency multiplication when estimating the pitch period. In addition, by fine-tuning the pitch period through waveform matching, the error in the estimated pitch period may be reduced and the quality of the audio data may be improved.
Type: Application
Filed: November 2, 2009
Publication date: February 25, 2010
Inventors: Wuzhou Zhan, Dongqi Wang
-
Publication number: 20100036657
Abstract: The speech estimation system of the present invention includes a transmitter (2) for transmitting a test signal, a receiver (3) for receiving the test signal, and a speech estimation unit (4) for estimating speech from a received signal. Transmitter (2) transmits the test signal toward speech organs, receiver (3) receives the test signal that has been reflected by the speech organs, and speech estimation unit (4) estimates speech or speech waveforms based on the waveform of the reflection wave of the test signal that was received by the receiver (3).
Type: Application
Filed: November 20, 2007
Publication date: February 11, 2010
Inventors: Mitsunori Morisaki, Kenichi Ishii
-
Publication number: 20100017200
Abstract: Disclosed is an encoding device which can accurately specify a band having a large error among all the bands by using a small calculation amount. The device includes: a first position identification unit (201) which uses a first layer error conversion coefficient indicating an error of decoding signal for an input signal so as to search for a band having a large error in a relatively wide bandwidth in all the bands of the input signal and generates first position information indicating the identified band; a second position identification unit (202) which searches for a target frequency band having a large error in a relatively narrow bandwidth in the band identified by the first position identification unit (201) and generates second position information indicating the identified target frequency band; and an encoding unit (203) which encodes a first layer decoding error conversion coefficient contained in the target frequency band.
Type: Application
Filed: February 29, 2008
Publication date: January 21, 2010
Applicant: PANASONIC CORPORATION
Inventors: Masahiro Oshikiri, Tomofumi Yamanashi, Toshiyuki Morii
-
Publication number: 20100014840
Abstract: An information processing apparatus includes a viewer information input unit that receives input of information about a viewer viewing reproduced program content as viewer information by watching video displayed on a monitor or listening to a voice output from a speaker, an upsurge degree acquisition unit that acquires a degree of upsurge of the viewer based on the viewer information whose input is received by the viewer information input unit, and a highlight extraction unit that extracts highlights of the program content based on the degree of upsurge acquired by the upsurge degree acquisition unit.
Type: Application
Filed: June 30, 2009
Publication date: January 21, 2010
Applicant: Sony Corporation
Inventor: Tsutomu Nagai
-
Publication number: 20100010810
Abstract: When a decoding audio signal is to be acquired by pitch-filtering a combined signal of a sub-frame length, a decoding audio signal is continuously changed at the boundary between sub-frames. The post filter includes: a first filter coefficient calculation unit (306) which obtains a pitch filter coefficient gP(0) of a current frame so as to asymptotically approach the intensity g of the pitch filter from an initial value 0; a second filter coefficient calculation unit (307) which obtains a pitch filter coefficient gP(−1) of a preceding frame so as to asymptotically approach 0 by setting the initial value to the value of the pitch filter coefficient obtained by the first filter coefficient calculation unit (306); a filter state setting unit (308) which sets a pitch filter state fsi for each of the sub-frames; and a pitch filter (309) which pitch-filters the combined signal xi by using the pitch filter coefficients gP(−1), gP(0), and past demodulation audio signals yi−P(−1), yi−P(0).
Type: Application
Filed: December 13, 2007
Publication date: January 14, 2010
Applicant: PANASONIC CORPORATION
Inventor: Toshiyuki Morii
-
Publication number: 20090326951
Abstract: Ratios of powers at the peaks of respective formants of the spectrum of a pitch-cycle waveform and powers at boundaries between the formants are obtained and, when the ratios are large, the bandwidths of the window functions are widened and the formant waveforms are generated by multiplying sinusoidal waveforms, generated from the formant parameter sets on the basis of pitch-cycle waveform generating data, by the window functions of the widened bandwidth, whereby a pitch-cycle waveform is generated as the sum of these formant waveforms.
Type: Application
Filed: April 14, 2009
Publication date: December 31, 2009
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Ryo Morinaka, Takehiko Kagoshima
-
Publication number: 20090326930
Abstract: Provided is an audio decoding device capable of suppressing an information amount for a lost frame compensation process and encoding efficiency. In this device, a decoded sound source generation unit (203) generates a lost frame decoded sound source signal; a pitch pulse information decoding unit (204) decodes the pitch pulse position information and the pitch pulse amplitude information; a pitch pulse waveform learning unit (205) learns a pitch pulse learning waveform in the past frame in advance from the lost frame; a convolution unit (206) amplitude-adjusts the pitch pulse learning waveform according to the pitch pulse amplitude information, and convolutes the pitch pulse waveform into a time axis which has been amplitude-adjusted according to the pitch pulse position information; a sound source signal correction unit (207) adds or replaces the pitch pulse waveform convoluted into the time axis to the lost frame decoded sound source signal.
Type: Application
Filed: July 11, 2007
Publication date: December 31, 2009
Applicant: PANASONIC CORPORATION
Inventors: Takuya Kawashima, Hiroyuki Ehara, Koji Yoshida
-
Publication number: 20090299736
Abstract: To provide a speech coding technology that realizes a low bit rate and can suppress distortion of reproduction speech as compared with a conventional technology. There are provided pitch detecting means 5 that detects a pitch frequency of an input speech signal, residual calculating means 6 that calculates the difference (residual frequency) between the pitch frequency and a reference frequency, a frequency shifter 4 that shifts a frequency of the input speech signal in proportion to the residual frequency in a direction for being close to the reference frequency and equalizes a pitch period, orthogonal transforming means that orthogonally transforms the speech signal (pitch-equalizing speech signal) output by the frequency shifter 4 by a constant number of the pitch intervals and generates transforming coefficient data, and waveform coding means that encodes the transforming coefficient data.
Type: Application
Filed: March 24, 2006
Publication date: December 3, 2009
Applicant: KYUSHU INSTITUTE OF TECHNOLOGY
Inventor: Yasushi Sato
-
Publication number: 20090281797
Abstract: A bit error concealment (BEC) system and method is described herein that detects and conceals the presence of click-like artifacts in an audio signal caused by bit errors introduced during transmission of the audio signal within an audio communications system. A particular embodiment of the present invention utilizes a low-complexity design that introduces no added delay and that is particularly well-suited for applications such as Bluetooth® wireless audio devices which have low cost and low power dissipation requirements.
Type: Application
Filed: April 28, 2009
Publication date: November 12, 2009
Applicant: BROADCOM CORPORATION
Inventors: Robert W. Zopf, Vivek Kumar, Juin-Hwey Chen
-
Publication number: 20090240490
Abstract: A method and apparatus for concealing frame loss and an apparatus for transmitting and receiving a speech signal that are capable of reducing speech quality degradation caused by packet loss are provided. In the method, when loss of a current received frame occurs, a random excitation signal having the highest correlation with a periodic excitation signal (i.e., a pitch excitation signal) decoded from a previous frame received without loss is used as a noise excitation signal to recover an excitation signal of a current lost frame. Furthermore, a third, new attenuation constant (AS) is obtained by summing a first attenuation constant (NS) obtained based on the number of continuously lost frames and a second attenuation constant (PS) predicted in consideration of change in amplitude of previously received frames to adjust the amplitude of the recovered excitation signal for the current lost frame.
Type: Application
Filed: January 9, 2009
Publication date: September 24, 2009
Applicant: Gwangju Institute of Science and Technology
Inventors: Hong Kook Kim, Choong Sang Cho
-
Publication number: 20090222260
Abstract: A method and system for multi-channel detection of pitch may comprise one or more of the following steps and/or means therefor: (a) sampling an audio input stream including at least a first channel and a second channel; (b) setting a search frequency for each of the first channel and the second channel; and (c) detecting a pitch of the first channel and a pitch of the second channel.
Type: Application
Filed: March 2, 2009
Publication date: September 3, 2009
Inventor: David W. Petr
-
Publication number: 20090222259
Abstract: A feature extraction apparatus includes a spectrum calculating unit that calculates, based on an input speech signal, a frequency spectrum having frequency components obtained at regular intervals on a logarithmic frequency scale for each of frames that are defined by regular time intervals, and thereby generates a time series of the frequency spectrum; a cross-correlation coefficients calculating unit that calculates, for each target frame of the frames, cross-correlation coefficients between frequency spectra calculated for two different frames that are in the vicinity of the target frame and a predetermined frame width apart from each other; and a shift amount predicting unit that predicts a shift amount of the frequency spectra on the logarithmic frequency scale with respect to the predetermined frame width by use of the cross-correlation coefficients.
Type: Application
Filed: February 5, 2009
Publication date: September 3, 2009
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Yusuke Kida, Takashi Masuko
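The shift prediction in this abstract can be illustrated with a minimal Python sketch. It assumes the predicted shift is simply the lag that maximizes the cross-correlation between two log-frequency spectra; the abstract does not say how the coefficients are turned into a shift, so that rule, the function names, and the toy spectra are all hypothetical.

```python
def cross_correlation(spec_a, spec_b, shift):
    """Sum of spec_a[i] * spec_b[i + shift] over the bins where both exist."""
    n = len(spec_a)
    return sum(spec_a[i] * spec_b[i + shift]
               for i in range(n) if 0 <= i + shift < n)


def predict_shift(spec_a, spec_b, max_shift):
    """Return the bin shift in [-max_shift, max_shift] with the largest
    cross-correlation (assumed decision rule, not taken from the abstract)."""
    return max(range(-max_shift, max_shift + 1),
               key=lambda s: cross_correlation(spec_a, spec_b, s))


# Toy log-frequency spectra: the second frame is the first moved up by 2 bins.
earlier = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
later = [0, 0, 0, 0, 1, 3, 1, 0, 0, 0]
shift_bins = predict_shift(earlier, later, max_shift=4)
```

On a logarithmic frequency axis a pitch change moves the whole harmonic pattern by the same number of bins, which is why a single shift value can summarize it.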
-
Publication number: 20090210220
Abstract: A speech analyzer includes a speech acquiring section, a frequency converting section, an autocorrelation section, and a pitch detection section. The frequency converting section converts the speech signal acquired by the speech acquiring section into a frequency spectrum. The autocorrelation section determines an autocorrelation waveform by shifting the frequency spectrum along the frequency axis. The pitch detection section determines the pitch frequency from the distance between two local crests or troughs of the autocorrelation waveform.
Type: Application
Filed: June 2, 2006
Publication date: August 20, 2009
Inventors: Shunji Mitsuyoshi, Kaoru Ogata, Fumiaki Monma
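As a rough illustration of the idea in this abstract, the following Python sketch autocorrelates a magnitude spectrum along the frequency axis and reads the pitch from the spacing of the first two local crests. The function names, the 10 Hz bin width, and the synthetic spectrum are assumptions made for the example, not details from the application.

```python
def spectral_autocorrelation(spectrum, max_shift):
    """Autocorrelate a magnitude spectrum along the frequency axis."""
    n = len(spectrum)
    return [sum(spectrum[i] * spectrum[i + s] for i in range(n - s))
            for s in range(max_shift + 1)]


def local_crests(values):
    """Indices of interior local maxima."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] >= values[i + 1]]


def pitch_from_spectrum(spectrum, bin_hz, max_shift):
    """Pitch estimate: distance between the first two crests, in Hz."""
    crests = local_crests(spectral_autocorrelation(spectrum, max_shift))
    if len(crests) < 2:
        return None  # not enough harmonic structure to measure a spacing
    return (crests[1] - crests[0]) * bin_hz


# Synthetic spectrum: 200 bins of 10 Hz, with harmonics of 200 Hz.
spectrum = [0.0] * 200
for harmonic_bin in range(20, 200, 20):
    spectrum[harmonic_bin] = 1.0
pitch_hz = pitch_from_spectrum(spectrum, bin_hz=10.0, max_shift=100)
```

Because the harmonics of a voiced sound are spaced by the fundamental, the crest spacing of the shifted-spectrum correlation gives the pitch even when the fundamental itself is weak.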
-
Publication number: 20090204396
Abstract: The present disclosure relates to a decoding method and apparatus. The method includes: receiving data frames from the coder; if any erroneous frame appears, calculating a pitch lag parameter of the erroneous frame; decoding the data frames according to the calculated pitch lag parameter of the erroneous frame, and obtaining decoded data. The process of determining the pitch lag parameter includes: determining the number of continuous erroneous frames and the pitch lag parameter of the previous frame; adjusting the pitch lag parameter of the previous frame according to the number of the continuous erroneous frames and a preset adjustment policy, and calculating and determining the pitch lag parameter of a current erroneous frame, wherein the preset adjustment policy is adjusting the determined pitch lag parameter of the current erroneous frame within a preset value range according to the number of the continuous erroneous frames.
Type: Application
Filed: April 20, 2009
Publication date: August 13, 2009
Inventors: Jianfeng Xu, Lijing Xu, Qing Zhang, Wei Li, Shenghu Sang, Zhengzhong Du, Chen Hu
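One way to picture the adjustment policy described here is the Python sketch below. The abstract does not give the policy itself, so the linear step per erroneous frame, the function name, and the default range of 20 to 143 samples are purely illustrative assumptions.

```python
def conceal_pitch_lag(prev_lag, num_erroneous, step=1, lag_min=20, lag_max=143):
    """Derive a pitch lag for the current erroneous frame.

    Hypothetical policy: move the previous frame's lag by `step` samples per
    consecutive erroneous frame, then clamp it to the preset range
    [lag_min, lag_max].
    """
    if num_erroneous < 1:
        raise ValueError("num_erroneous must be at least 1")
    adjusted = prev_lag + step * num_erroneous
    return max(lag_min, min(lag_max, adjusted))


first_loss = conceal_pitch_lag(prev_lag=60, num_erroneous=1)
clamped = conceal_pitch_lag(prev_lag=142, num_erroneous=3)
```

Clamping keeps the concealed lag inside the range the decoder can represent, however long the run of erroneous frames becomes.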
-
Publication number: 20090192789
Abstract: Provided are a method and apparatus for effectively encoding/decoding remaining difference signals excluding sinusoidal components, from input audio signals. In the method and apparatus for encoding audio signals, sinusoidal analysis is performed on low frequency signals of less than a predetermined critical frequency in order to extract sinusoidal signals and then, an encoding operation is performed on the remaining difference signals excluding the sinusoidal signals, from input audio signals, by using linear prediction coding (LPC) analysis.
Type: Application
Filed: January 29, 2009
Publication date: July 30, 2009
Applicant: Samsung Electronics Co., Ltd.
Inventors: Geon-hyoung LEE, Chul-woo Lee, Jong-hoon Jeong, Nam-suk Lee, Han-gil Moon
-
Publication number: 20090192788
Abstract: In a sound processing device, a modulation spectrum specifier specifies a modulation spectrum of an input sound for each of a plurality of unit intervals. An index calculator calculates an index value corresponding to a magnitude of components of modulation frequencies belonging to a predetermined range of the modulation spectrum. A determinator determines whether the input sound of each of the unit intervals is a vocal sound or a non-vocal sound based on the index value.
Type: Application
Filed: January 23, 2009
Publication date: July 30, 2009
Applicant: Yamaha Corporation
Inventor: Yasuo YOSHIOKA
-
Publication number: 20090182556
Abstract: Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, a method of processing a signal representing speech can comprise receiving a frame of the signal representing speech, classifying the frame as a voiced frame, and parsing the voiced frame into one or more regions based on occurrence of one or more events within the voiced frame. For example, the one or more events can comprise one or more glottal pulses. The one or more regions may collectively represent less than all of the voiced frame.
Type: Application
Filed: October 23, 2008
Publication date: July 16, 2009
Applicant: Red Shift Company, LLC
Inventors: Erik N. Reckase, John F. Remillard
-
Publication number: 20090177464
Abstract: A speech encoder that analyzes and classifies each frame of speech as being periodic-like speech or non-periodic-like speech, where the speech encoder performs a different gain quantization process depending on whether the speech is periodic or not. If the speech is periodic, the improved speech encoder obtains the pitch gains from the unquantized weighted speech signal and performs a pre-vector quantization of the adaptive codebook gain GP for each subframe of the frame before subframe processing begins and a closed-loop delayed decision vector quantization of the fixed codebook gain GC. If the frame of speech is non-periodic, the speech encoder may use any known method of gain quantization.
Type: Application
Filed: March 6, 2009
Publication date: July 9, 2009
Applicants: MINDSPEED TECHNOLOGIES, INC.
Inventors: Yang Gao, Adil Benyassine
-
Publication number: 20090171656
Abstract: The invention concerns a method and apparatus for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder that does not have a built-in or standard FEC process. A receiver with a decoder receives encoded frames of compressed speech information transmitted from an encoder. A lost frame detector at the receiver determines if an encoded frame has been lost or corrupted in transmission, or erased. If the encoded frame is not erased, the encoded frame is decoded by a decoder and a temporary memory is updated with the decoder's output. A predetermined delay period is applied and the audio frame is then output. If the lost frame detector determines that the encoded frame is erased, an FEC module applies a frame concealment process to the signal. The FEC processing produces natural sounding synthetic speech for the erased frames.
Type: Application
Filed: February 20, 2009
Publication date: July 2, 2009
Inventor: David A. Kapilow
-
Publication number: 20090157395
Abstract: In accordance with one aspect of the invention, a selector supports the selection of a first encoding scheme or a second encoding scheme based upon the detection or absence of a triggering characteristic in an interval of the input speech signal. The first encoding scheme has a pitch pre-processing procedure for processing the input speech signal to form a revised speech signal biased toward an ideal voiced and stationary characteristic. The pre-processing procedure allows the encoder to fully capture the benefits of a bandwidth-efficient, long-term predictive procedure for a greater amount of speech components of an input speech signal than would otherwise be possible. In accordance with another aspect of the invention, the second encoding scheme entails a long-term prediction mode for encoding the pitch on a sub-frame by sub-frame basis.
Type: Application
Filed: January 26, 2009
Publication date: June 18, 2009
Applicants: MINDSPEED TECHNOLOGIES, INC.
Inventors: Huan-Yu Su, Yang Gao
-
Publication number: 20090125298
Abstract: The technology disclosed relates to audio signal processing. It includes a series of modules that individually are useful to solve audio signal processing problems. Among the problems addressed are buzz removal, selecting a pitch candidate among pitch candidates based on local continuity of pitch and regional octave consistency, making small adjustments in pitch, ensuring that a selected pitch is consistent with harmonic peaks, determining whether a given frame or region of frames includes a harmonic, voiced signal, extracting harmonics from voice signals, and detecting vibrato. One environment in which these modules are useful is transcribing singing or humming into a symbolic melody. Another environment that would usefully employ some of these modules is speech processing. Some of the modules, such as buzz removal, are useful in many other environments as well.
Type: Application
Filed: November 3, 2008
Publication date: May 14, 2009
Applicant: Melodis Inc.
Inventors: Aaron Master, Seyed Majid Emami
-
Publication number: 20090125300
Abstract: A scalable encoding apparatus capable of reducing the bit rates of encoded parameters and also capable of efficiently encoding even audio signals in which a plurality of harmonic structures are coexistent. In the apparatus, an MDCT analyzing part (111) MDCT analyzes an audio signal (S15) for converting/encoding processes. A pitch frequency converting part (112) determines the inverse of a pitch period to calculate a pitch frequency. A selecting part (113) selects spectra located at frequencies that are integral multiples of the pitch frequency. A second layer encoding part (106) encodes the selected spectra.
Type: Application
Filed: October 26, 2005
Publication date: May 14, 2009
Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.
Inventor: Masahiro Oshikiri
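The two steps named in this abstract, inverting the pitch period and picking spectra at integer multiples of the pitch frequency, can be sketched in Python as follows. The function names, the sample rate, and the uniform bin spacing are assumptions made for illustration; how the real apparatus maps frequencies onto MDCT coefficients is not stated in the abstract.

```python
def pitch_frequency(pitch_period_samples, sample_rate_hz):
    """Pitch frequency as the inverse of the pitch period."""
    if pitch_period_samples <= 0:
        raise ValueError("pitch period must be positive")
    return sample_rate_hz / pitch_period_samples


def harmonic_bins(pitch_hz, bin_hz, num_bins):
    """Indices of the spectral bins nearest to integer multiples of pitch_hz."""
    if pitch_hz <= 0 or bin_hz <= 0:
        raise ValueError("pitch_hz and bin_hz must be positive")
    bins = []
    k = 1
    while True:
        index = round(k * pitch_hz / bin_hz)
        if index >= num_bins:
            break
        bins.append(index)
        k += 1
    return bins


# Hypothetical values: 80-sample pitch period at 16 kHz, 25 Hz per bin.
f0 = pitch_frequency(80, 16000)
selected = harmonic_bins(f0, bin_hz=25.0, num_bins=40)
```

Encoding only the selected bins concentrates the bits on the harmonic structure, which is where most of the energy of a voiced signal lies.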
-
Publication number: 20090119096
Abstract: A system enhances the quality of a digital speech signal that may include noise. The system identifies vocal expressions that correspond to the digital speech signal. A signal-to-noise ratio of the digital speech signal is measured before a portion of the digital speech signal is synthesized. The selected portion of the digital speech signal may have a signal-to-noise ratio below a predetermined level and the synthesis of the digital speech signal may be based on speaker identification.
Type: Application
Filed: October 20, 2008
Publication date: May 7, 2009
Inventors: Franz Gerl, Tobias Herbig, Mohamed Krini, Gerhard Uwe Schmidt
-
Publication number: 20090119097
Abstract: The technology disclosed relates to audio signal processing. It includes a series of modules that individually are useful to solve audio signal processing problems. Among the problems addressed are buzz removal, selecting a pitch candidate among pitch candidates based on local continuity of pitch and regional octave consistency, making small adjustments in pitch, ensuring that a selected pitch is consistent with harmonic peaks, determining whether a given frame or region of frames includes a harmonic, voiced signal, extracting harmonics from voice signals, and detecting vibrato. One environment in which these modules are useful is transcribing singing or humming into a symbolic melody. Another environment that would usefully employ some of these modules is speech processing. Some of the modules, such as buzz removal, are useful in many other environments as well.
Type: Application
Filed: November 3, 2008
Publication date: May 7, 2009
Applicant: Melodis Inc.
Inventors: Aaron Master, Seyed Majid Emami
-
Publication number: 20090076805
Abstract: The present invention discloses a method for performing frame erasure concealment on a higher-band signal, including: calculating a periodic intensity of a higher-band signal with respect to a lower-band signal; judging whether the periodic intensity of the higher-band signal is higher than or equal to a preconfigured threshold; if the periodic intensity of the higher-band signal is higher than or equal to the preconfigured threshold, using a pitch period repetition method to perform the frame erasure concealment on the higher-band signal of a current lost frame; and if the periodic intensity of the higher-band signal is lower than the preconfigured threshold, using a previous frame data repetition method to perform the frame erasure concealment on the higher-band signal of the current lost frame. The present invention further discloses a device for performing frame erasure concealment on a higher-band signal and a speech decoder. Degradation of the voice signal quality is thereby avoided.
Type: Application
Filed: May 29, 2008
Publication date: March 19, 2009
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Jianfeng Xu, Lei Miao, Chen Hu, Qing Zhang, Lijing Xu, Wei Li, Zhengzhong Du, Yi Yang, Fengyan Qi, Wuzhou Zhan, Dongqi Wang
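The threshold decision in this abstract can be sketched in Python as below. The abstract does not define "periodic intensity", so the sketch assumes it is the normalized correlation of the higher-band history with itself delayed by a pitch lag taken from the lower band; the function names and the 0.5 threshold are likewise hypothetical.

```python
def periodic_intensity(history, lag):
    """Normalized correlation of the signal with itself delayed by `lag`
    (assumed measure; requires 0 < lag < len(history))."""
    current = history[lag:]
    delayed = history[:-lag]
    num = sum(a * b for a, b in zip(current, delayed))
    den = (sum(a * a for a in current) * sum(b * b for b in delayed)) ** 0.5
    return num / den if den > 0 else 0.0


def conceal_lost_frame(history, lag, frame_len, threshold=0.5):
    """Fill a lost frame from past higher-band samples.

    At or above the threshold: repeat the last pitch period.
    Below the threshold: repeat the previous frame's data.
    """
    if periodic_intensity(history, lag) >= threshold:
        period = history[-lag:]
        return [period[i % lag] for i in range(frame_len)]
    return list(history[-frame_len:])


periodic_history = [1, 2, 3, 4] * 5
concealed = conceal_lost_frame(periodic_history, lag=4, frame_len=6)
```

Pitch period repetition preserves the waveform's periodicity across the gap, while plain frame repetition is the safer fallback when the higher band shows little periodic structure.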