Pitch Determination Of Speech Signals (epo) Patents (Class 704/E11.006)
  • Publication number: 20110054886
    Abstract: An effect device may be configured such that when an input audio signal switches from a consonant to a vowel and an input level of the switched vowel is greater than a threshold value Lc (and a variable t is greater than time Ts), an audio effect signal A may be generated. Such an effect device may allow for increasing the occurrences when portamento is simulated, while still sounding natural. In general, a detecting module detects whether an audio signal is a vowel sound or a consonant sound and whether the audio signal changed from a consonant sound to a vowel sound; and a pitch change module changes a pitch of the audio signal and changes, based on a prescribed function, an amount the pitch is changed to produce a modified audio signal, when the audio signal changed from a consonant sound to a vowel sound.
    Type: Application
    Filed: August 30, 2010
    Publication date: March 3, 2011
    Inventor: Takahiro Ae
  • Publication number: 20110004467
    Abstract: Systems, methods, and computer program products are provided for producing audio and/or visual effects according to a correlation between reference data and estimated note data derived from an input acoustic audio waveform. Some embodiments calculate a pitch score as a function of a pitch estimate derived from the input waveform, a reference pitch, and a real-time-adjustable pitch gating window. Other embodiments calculate the pitch score as a function of pitch and timing estimates derived from the input waveform, reference pitch and note timing data, an adjustable rhythm gating window, and an adjustable pitch gating window. The audio and/or visual effects are produced according to the pitch score, and may be used to generate outputs (e.g., in real time) for affecting a live performance, an audio mix, a video gaming environment, an educational feedback environment, etc.
    Type: Application
    Filed: June 30, 2010
    Publication date: January 6, 2011
    Applicant: MuseAmi, Inc.
    Inventors: Robert D. Taub, J. Alexander Cabanilla, Jonathan Sheldrick, George Tourtellot
  • Publication number: 20100305944
    Abstract: A method of estimating a pitch period of a first portion of a signal wherein the first portion overlaps a previous portion. The method comprises computing a first autocorrelation value for part of the first portion not overlapping the previous portion. The method further comprises retrieving a stored second autocorrelation value for part of the first portion overlapping the previous portion, the second autocorrelation value having been computed during estimation of a pitch period of the previous portion. The method further comprises forming a combined autocorrelation value using the first and second autocorrelation values, and selecting the estimated pitch period in dependence on the combined autocorrelation value.
    Type: Application
    Filed: May 28, 2009
    Publication date: December 2, 2010
    Applicant: Cambridge Silicon Radio Limited
    Inventor: Xuejing Sun
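
The autocorrelation reuse described in publication 20100305944 can be illustrated with a minimal sketch. It is not the patented implementation: the 50% frame overlap, the unnormalized autocorrelation sum, and the names `segment_autocorr` and `pitch_periods` are assumptions made for illustration.

```python
import math

def segment_autocorr(x, start, stop, lags):
    """Partial autocorrelation: sum of x[n] * x[n - k] for n in [start, stop)."""
    return {k: sum(x[n] * x[n - k] for n in range(start, stop)) for k in lags}

def pitch_periods(x, frame_len, min_lag, max_lag):
    """Estimate one pitch period per frame, with frames overlapping by half.

    The partial sums for the half of a frame that overlaps the previous
    frame are taken from a cache instead of being recomputed.
    """
    assert frame_len % 2 == 0, "sketch assumes an even frame length"
    hop = frame_len // 2
    lags = range(min_lag, max_lag + 1)
    periods = []
    cached = None          # partial sums for the overlapping half
    start = max_lag        # leave enough history for x[n - k]
    while start + frame_len <= len(x):
        if cached is None:
            # Very first frame: nothing stored yet, compute the first half.
            cached = segment_autocorr(x, start, start + hop, lags)
        # Part of the frame that does not overlap the previous frame.
        new = segment_autocorr(x, start + hop, start + frame_len, lags)
        combined = {k: cached[k] + new[k] for k in lags}
        periods.append(max(lags, key=lambda k: combined[k]))
        cached = new       # becomes the overlapping part of the next frame
        start += hop
    return periods

# Usage: a sinusoid with a period of 40 samples.
signal = [math.sin(2 * math.pi * n / 40) for n in range(400)]
print(pitch_periods(signal, frame_len=160, min_lag=20, max_lag=60))
```

With half-overlapping frames, each frame after the first computes partial sums for only its new half, which is roughly half the multiply-accumulate work of recomputing the whole frame.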
  • Publication number: 20100241424
    Abstract: There is provided a speech encoder for performing an algorithm that comprises obtaining (205) a plurality of open-loop pitch candidates from a current frame of a speech signal, the plurality of open-loop pitch candidates including a first open-loop pitch candidate and a second open-loop pitch candidate; obtaining (205) voicing information from one or more previous frames; and selecting (280) one of the plurality of open-loop pitch candidates as a final pitch of the current frame using the voicing information from the one or more previous frames. In one aspect, the voicing information from the one or more previous frames includes a previous pitch of the one or more previous frames. In a further aspect, selecting the final pitch of the current frame includes selecting (210) an initial open-loop pitch that has the maximum long-term correlation value.
    Type: Application
    Filed: October 27, 2006
    Publication date: September 23, 2010
    Applicant: MINDSPEED TECHNOLOGIES, INC.
    Inventor: Yang Gao
  • Publication number: 20100241423
    Abstract: Embodiments of a system and method for encoding audio data have been described. In one embodiment, the method includes transforming frequency domain data in a plurality of signal windows of an audio dataset from a cosine/sine format to a magnitude/cosine/sine format. The magnitude/cosine/sine format disproportionately represents a magnitude of the frequency domain data over a phase of the frequency domain data. The above transformation may be a pre-processing stage of vector quantization usable to produce a codebook.
    Type: Application
    Filed: March 18, 2009
    Publication date: September 23, 2010
    Inventors: Stanley Wayne Jackson, Jay T. Dresser
  • Publication number: 20100235166
    Abstract: A method of audio processing comprises composing one or more transformation profiles for transforming audio characteristics of an audio recording and then generating for the or each transformation profile, a metadata set comprising transformation profile data and location data indicative of where in the recording the transformation profile data is to be applied; the or each metadata set is then stored in association with the corresponding recording. A corresponding method of audio reproduction comprises reading a recording and a meta-data set associated with that recording from storage, applying transformations to the recording data in accordance with the metadata set transformation profile; and then outputting the transformed recording.
    Type: Application
    Filed: October 17, 2007
    Publication date: September 16, 2010
    Applicant: SONY COMPUTER ENTERTAINMENT EUROPE LIMITED
    Inventors: Daniele Giuseppe Bardino, Richard James Griffiths
  • Publication number: 20100211384
    Abstract: A pitch detection method and apparatus are disclosed. The method includes: performing pitch detection on an input signal in a signal domain, and obtaining a candidate pitch; performing linear prediction (LP) on the input signal, and obtaining an LP residual signal; setting a candidate pitch range that includes the candidate pitch; searching the candidate pitch range for the LP residual signal, and obtaining a selected pitch.
    Type: Application
    Filed: April 9, 2010
    Publication date: August 19, 2010
    Inventors: Fengyan Qi, Dejun Zhang, Lei Miao, Jianfeng Xu, Herve Marcel Taddei, Qing Zhang, Yang Gao
  • Publication number: 20100191524
    Abstract: A non-speech section detecting device generating a plurality of frames having a given time length on the basis of sound data obtained by sampling sound, and detecting a non-speech section having a frame not containing voice data based on speech uttered by a person, the device including: a calculating part calculating a bias of a spectrum obtained by converting sound data of each frame into components on a frequency axis; a judging part judging whether the bias is greater than or equal to a given threshold or alternatively smaller than or equal to a given threshold; a counting part counting the number of consecutive frames judged as having a bias greater than or equal to the threshold or alternatively smaller than or equal to the threshold; a count judging part judging whether the obtained number of consecutive frames is greater than or equal to a given value.
    Type: Application
    Filed: April 5, 2010
    Publication date: July 29, 2010
    Applicant: FUJITSU LIMITED
    Inventors: Nobuyuki Washio, Shoji Hayakawa
  • Publication number: 20100185440
    Abstract: The embodiments of a transcoding method, a transcoding device, and a communication apparatus are provided. The embodiment of a method includes: receiving a bit stream input from a sending end; determining an attribute of discontinuous transmission (DTX) used by a receiving end and a frame type of the input bit stream; and transcoding the input bit stream in a corresponding processing manner according to a determination result. Thereby, a corresponding transcoding operation is performed on the input bit stream according to the attribute of DTX used by the receiving end and the frame type of the input bit stream. In such a manner, input bit streams of various types can be processed, and the input bit streams can be correspondingly transcoded according to the requirements of the receiving end. Therefore, the average computational complexity and peak computational complexity can be effectively decreased without decreasing the quality of the synthesized speech.
    Type: Application
    Filed: January 21, 2010
    Publication date: July 22, 2010
    Inventors: Changchun Bao, Hao Xu, Fanrong Tang, Xiangyu Hu
  • Publication number: 20100185441
    Abstract: A method of updating a state of a decoder that decodes successive portions of a data stream representing an encoded voice signal in dependence on its state, the method comprising: at the decoder, decoding portions of the data stream to form decoded portions; storing the decoded portions; storing respective decoder states held by the decoder after forming each decoded portion; identifying that a portion of the data stream is degraded; estimating a pitch period of a stored decoded portion formed by decoding a portion of the data stream that precedes the degraded portion of the data stream; selecting a stored decoder state held by the decoder after decoding a portion of the data stream that precedes the degraded portion by a multiple of the estimated pitch period; and updating the state of the decoder with the selected decoder state.
    Type: Application
    Filed: January 23, 2009
    Publication date: July 22, 2010
    Applicant: CAMBRIDGE SILICON RADIO LIMITED
    Inventors: Xuejing Sun, Kuan-Chieh Yen
  • Publication number: 20100174534
    Abstract: A method of encoding speech, the method comprising: receiving a signal representative of speech to be encoded; at each of a plurality of intervals during the encoding, determining a pitch lag between portions of the signal having a degree of repetition; selecting for a set of said intervals a pitch lag vector from a pitch lag codebook of such vectors, each pitch lag vector comprising a set of offsets corresponding to the offset between the pitch lag determined for each said interval and an average pitch lag for said set of intervals, and transmitting an indication of the selected vector and said average over a transmission medium as part of the encoded signal representative of said speech.
    Type: Application
    Filed: June 5, 2009
    Publication date: July 8, 2010
    Inventor: Koen Bernard Vos
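
The pitch lag vector selection in publication 20100174534 can be sketched as follows. This is a hedged illustration only: the four-entry `CODEBOOK`, the squared-error distance, and the function names are hypothetical and are not taken from the patent.

```python
# Hypothetical codebook of offset vectors, one offset per interval.
CODEBOOK = [
    (0, 0, 0, 0),
    (-2, -1, 1, 2),
    (2, 1, -1, -2),
    (-1, 1, -1, 1),
]

def encode_pitch_lags(lags, codebook=CODEBOOK):
    """Return (average lag, index of the closest offset vector)."""
    avg = round(sum(lags) / len(lags))
    offsets = [lag - avg for lag in lags]

    def distance(vec):
        # Squared error between the actual offsets and a codebook vector.
        return sum((o - v) ** 2 for o, v in zip(offsets, vec))

    index = min(range(len(codebook)), key=lambda i: distance(codebook[i]))
    return avg, index

def decode_pitch_lags(avg, index, codebook=CODEBOOK):
    """Rebuild the per-interval lags from the average and the vector index."""
    return [avg + v for v in codebook[index]]

# Usage: four per-interval pitch lags that rise steadily around 80.
avg, index = encode_pitch_lags([78, 79, 81, 82])
print(avg, index, decode_pitch_lags(avg, index))
```

Only the average and the codebook index need to be transmitted, rather than one lag per interval.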
  • Publication number: 20100174536
    Abstract: A pitch search method and device for digitally encoding a wideband signal, in particular but not exclusively a speech signal, in view of transmitting, or storing, and synthesizing this wideband sound signal. The new method and device, which achieve efficient modeling of the harmonic structure of the speech spectrum, use several forms of low pass filters applied to a pitch codevector; the one yielding the higher prediction gain (i.e. the lowest pitch prediction error) is selected and the associated pitch codebook parameters are forwarded.
    Type: Application
    Filed: November 17, 2009
    Publication date: July 8, 2010
    Inventors: Bruno Bessette, Redwan Salami, Roch Lefebvre
  • Publication number: 20100169084
    Abstract: The present invention relates to a method and apparatus for pitch search. One method includes: obtaining a characteristic function value of a residual signal, where the residual signal is a result of removing a Long-Term Prediction (LTP) contribution signal from input speech signals; and obtaining a pitch according to the characteristic function value of the residual signal.
    Type: Application
    Filed: December 23, 2009
    Publication date: July 1, 2010
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Dejun Zhang, Jianfeng Xu, Lei Miao, Fengyan Qi, Qing Zhang, Herve Marcel Taddei, Lixiong Li, Fuwei Ma, Yang Gao
  • Publication number: 20100161323
    Abstract: Provided is an audio encoding device capable of preventing audio quality degradation of a decoded signal. In the audio encoding device, a noise analysis unit (118) analyzes a noise characteristic of a higher range of an input spectrum. A filter coefficient decision unit (119) decides a filter coefficient in accordance with the noise characteristic information from the noise characteristic analysis unit (118). A filtering unit (113) includes a multi-tap pitch filter for filtering a first-layer decoded spectrum according to a filter state set by a filter state setting unit (112), a pitch coefficient outputted from a pitch coefficient setting unit (115), and a filter coefficient outputted from the filter coefficient decision unit (119), and calculates an estimated spectrum of the input spectrum. An optimal pitch coefficient can be decided by the process of a closed loop formed by the filter unit (113), a search unit (114), and the pitch coefficient setting unit (115).
    Type: Application
    Filed: April 26, 2007
    Publication date: June 24, 2010
    Applicant: PANASONIC CORPORATION
    Inventor: Masahiro Oshikiri
  • Publication number: 20100145688
    Abstract: An apparatus and a method to encode and decode a speech signal using an encoding mode are provided. An encoding apparatus may select an encoding mode of a frame included in an input speech signal, and encode a frame having an unvoiced mode for an unvoiced speech as the selected encoding mode.
    Type: Application
    Filed: December 4, 2009
    Publication date: June 10, 2010
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ho Sang Sung, Ki Hyun Choo, Jung Hoe Kim, Eun Mi Oh
  • Publication number: 20100125452
    Abstract: A method of refining a pitch period estimation of a signal, the method comprising: for each of a plurality of portions of the signal, scanning over a predefined range of time offsets to find an estimate of the pitch period of the portion within the predefined range of time offsets; identifying the average pitch period of the estimated pitch periods of the portions; determining a refined range of time offsets in dependence on the average pitch period, the refined range of time offsets being narrower than the predefined range of time offsets; and for a subsequent portion of the signal, scanning over the refined range of time offsets to find an estimate of the pitch period of the subsequent portion.
    Type: Application
    Filed: November 19, 2008
    Publication date: May 20, 2010
    Applicant: Cambridge Silicon Radio Limited
    Inventor: Xuejing Sun
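
The range refinement described in publication 20100125452 can be sketched as below. The sketch assumes an unnormalized autocorrelation as the per-frame estimator, and the `warmup` and `margin` parameters and both function names are hypothetical choices made for illustration.

```python
import math

def estimate_period(frame, lo, hi):
    """Scan time offsets lo..hi and return the one with the largest autocorrelation."""
    def corr(k):
        return sum(frame[n] * frame[n + k] for n in range(len(frame) - k))
    return max(range(lo, hi + 1), key=corr)

def track_pitch(frames, full_range=(20, 120), warmup=3, margin=5):
    """Scan the full range for the first `warmup` frames, then a narrower range.

    The narrower range is centred on the average of the warm-up estimates.
    Returns the per-frame estimates and the range scanned for each frame.
    """
    lo, hi = full_range
    estimates, ranges = [], []
    for i, frame in enumerate(frames):
        if i >= warmup:
            avg = round(sum(estimates[:warmup]) / warmup)
            lo = max(full_range[0], avg - margin)
            hi = min(full_range[1], avg + margin)
        ranges.append((lo, hi))
        estimates.append(estimate_period(frame, lo, hi))
    return estimates, ranges

# Usage: five 200-sample frames of a sinusoid with a period of 50 samples.
signal = [math.sin(2 * math.pi * n / 50) for n in range(1000)]
frames = [signal[i:i + 200] for i in range(0, 1000, 200)]
estimates, ranges = track_pitch(frames)
print(estimates)
print(ranges)
```

After the warm-up, each frame scans 11 offsets instead of 101, which is where the saving in this sketch comes from.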
  • Publication number: 20100106489
    Abstract: Method and processing system for establishing the impact of time response distortion of an input signal which is applied to an audio transmission system (10) having an input and an output. A processor (11) is connected to the audio transmission system (10) for receiving the input signal (X(t)) and the output signal (Y(t)), and the processor (11) is arranged for outputting a time response degradation impact quality score. The processor (11) executes preprocessing of the input signal (X(t)) and output signal (Y(t)) to obtain pitch power densities (PPX(f)n, PPY(f)n) comprising pitch power density values for cells in the frequency (f) and time (n) domain, calculating a pitch power ratio function (PPR(f)n) of the pitch power densities for each cell, and determining a time response distortion quality score (MOSTD) indicative of the transmission quality of the system (10) from the pitch power ratio function (PPR(f)n).
    Type: Application
    Filed: March 28, 2008
    Publication date: April 29, 2010
    Applicant: KONINKLIJKE KPN N.V.
    Inventors: John Gerard Bereends, Jeroen Martijn Van Vugt, Menno Bangma, Omar Aziz Niamut, Bartosz Busz
  • Publication number: 20100094620
    Abstract: First encoded voice bits are transcoded into second encoded voice bits by dividing the first encoded voice bits into one or more received frames, with each received frame containing multiple ones of the first encoded voice bits. First parameter bits for at least one of the received frames are generated by applying error control decoding to one or more of the encoded voice bits contained in the received frame, speech parameters are computed from the first parameter bits, and the speech parameters are quantized to produce second parameter bits. Finally, a transmission frame is formed by applying error control encoding to one or more of the second parameter bits, and the transmission frame is included in the second encoded voice bits.
    Type: Application
    Filed: December 14, 2009
    Publication date: April 15, 2010
    Applicant: DIGITAL VOICE SYSTEMS, INC.
    Inventor: John C. Hardwick
  • Publication number: 20100088089
    Abstract: Synthesizing a set of digital speech samples corresponding to a selected voicing state includes dividing speech model parameters into frames, with a frame of speech model parameters including pitch information, voicing information determining the voicing state in one or more frequency regions, and spectral information. First and second digital filters are computed using, respectively, first and second frames of speech model parameters, with the frequency responses of the digital filters corresponding to the spectral information in frequency regions for which the voicing state equals the selected voicing state. A set of pulse locations are determined, and sets of first and second signal samples are produced using the pulse locations and, respectively, the first and second digital filters. Finally, the sets of first and second signal samples are combined to produce a set of digital speech samples corresponding to the selected voicing state.
    Type: Application
    Filed: August 21, 2009
    Publication date: April 8, 2010
    Applicant: DIGITAL VOICE SYSTEMS, INC.
    Inventor: John C. Hardwick
  • Publication number: 20100070270
    Abstract: In one embodiment, a method of receiving a decoded audio signal that has a transmitted pitch lag is disclosed. The method includes estimating pitch correlations of possible short pitch lags that are smaller than a minimum pitch limitation and have an approximated multiple relationship with the transmitted pitch lag, checking if one of the pitch correlations of the possible short pitch lags is large enough compared to a pitch correlation estimated with the transmitted pitch lag, and selecting a short pitch lag as a corrected pitch lag if a corresponding pitch correlation is large enough. The postprocessing is performed using the corrected pitch lag. In another embodiment, when the existence of irregular harmonics or wrong pitch lag is detected, a code-excited linear prediction (CELP) postfilter is made more aggressive.
    Type: Application
    Filed: September 15, 2009
    Publication date: March 18, 2010
    Applicant: GH INNOVATION, INC.
    Inventor: Yang Gao
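
The short pitch lag check in publication 20100070270 can be illustrated with a rough sketch. The normalized correlation measure, the acceptance `ratio` of 0.85, the divisor limit, and the function names are assumptions for illustration and do not come from the patent.

```python
import math

def norm_corr(x, lag):
    """Normalized correlation between the signal and itself delayed by `lag`."""
    a, b = x[lag:], x[:len(x) - lag]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0 else 0.0

def correct_pitch_lag(x, transmitted_lag, min_lag, ratio=0.85, max_divisor=4):
    """Return a shorter lag if it correlates nearly as well as the transmitted lag.

    Candidates are approximate sub-multiples of the transmitted lag that fall
    below `min_lag`, the minimum lag the encoder could transmit.
    """
    ref = norm_corr(x, transmitted_lag)
    candidates = sorted({round(transmitted_lag / m) for m in range(2, max_divisor + 1)})
    candidates = [c for c in candidates if 2 <= c < min_lag]
    if not candidates:
        return transmitted_lag
    best = max(candidates, key=lambda c: norm_corr(x, c))
    # Accept the short lag only if its correlation is large enough
    # relative to the correlation at the transmitted lag.
    if norm_corr(x, best) >= ratio * ref:
        return best
    return transmitted_lag

# Usage: the true period (16) is below the minimum lag (20), so the
# transmitted lag (32) is a multiple of the true period.
decoded = [math.sin(2 * math.pi * n / 16) for n in range(320)]
print(correct_pitch_lag(decoded, transmitted_lag=32, min_lag=20))
```

When the transmitted lag is already the true period, none of the shorter candidates correlates well enough and the transmitted lag is kept.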
  • Publication number: 20100070269
    Abstract: In an embodiment, a method of transmitting an input audio signal is disclosed. A first coding error of the input audio signal with a scalable codec having a first enhancement layer is encoded, and a second coding error is encoded using a second enhancement layer after the first enhancement layer. Encoding the second coding error includes coding fine spectrum coefficients of the second coding error to produce coded fine spectrum coefficients, and coding a spectral envelope of the second coding error to produce a coded spectral envelope. The coded fine spectrum coefficients and the coded spectral envelope are transmitted.
    Type: Application
    Filed: September 15, 2009
    Publication date: March 18, 2010
    Applicant: Huawei Technologies Co., Ltd.
    Inventor: Yang Gao
  • Publication number: 20100063804
    Abstract: Provided is an adaptive sound source vector quantization device which can always perform a pitch cycle search with a resolution appropriate for any section of the pitch cycle search range of a second sub-frame when a pitch cycle search range of the second sub-frame changes in accordance with a pitch cycle of a first sub-frame. The device includes a first pitch cycle instruction unit (111), a search range calculation unit (112), and a second pitch cycle instruction unit (113). The first pitch cycle instruction unit (111) successively instructs pitch cycle search candidates in a predetermined search range having a search resolution which transits over a predetermined pitch cycle candidate for the first sub-frame. The search range calculation unit (112) calculates a predetermined range before and after the pitch cycle of the first sub-frame as the pitch cycle search range for the second sub-frame, if the predetermined range includes the predetermined pitch cycle search candidate.
    Type: Application
    Filed: February 29, 2008
    Publication date: March 11, 2010
    Applicant: PANASONIC CORPORATION
    Inventors: Kaoru Sato, Toshiyuki Morii
  • Publication number: 20100063806
    Abstract: Low bit rate audio coding, such as a bandwidth extension (BWE) algorithm, often faces the conflicting goals of achieving high time resolution and high frequency resolution at the same time. To achieve the best possible quality, the input signal can first be classified as a fast signal or a slow signal. This invention focuses on classifying a signal as fast or slow based on at least one of, or a combination of, the following parameters: spectral sharpness, temporal sharpness, pitch correlation (pitch gain), and spectral envelope variation. This classification information can help to choose different BWE algorithms, coding algorithms, and postprocessing algorithms for fast signals and slow signals, respectively.
    Type: Application
    Filed: September 4, 2009
    Publication date: March 11, 2010
    Inventor: Yang Gao
  • Publication number: 20100063805
    Abstract: A decoder arrangement comprising a receiver input for parameters of frame-based coded signals and a decoder arranged to provide frames of decoded audio signals based on the parameters. The receiver input and/or the decoder is arranged to establish a time difference between the occasion when parameters of a first frame are available at the receiver input and the occasion when a decoded audio signal of the first frame is available at an output of the decoder, which time difference corresponds to at least one frame. A postfilter is connected to the output of the decoder and to the receiver input. The postfilter is arranged to provide a filtering of the frames of decoded audio signals into an output signal in response to parameters of a respective subsequent frame.
    Type: Application
    Filed: December 14, 2007
    Publication date: March 11, 2010
    Inventor: Stefan Bruhn
  • Publication number: 20100049506
    Abstract: A method, device and system for concealing lost packets are provided. The provided method, device and system recover the lost frame according to the data before and after the lost frame and enhance the correlation between the recovered lost frame data and the data after the lost frame. A method and device for estimating pitch period are also provided, which select, as the final estimated pitch period, a pitch period from among the initial pitch period and the pitch periods corresponding to frequencies that are one or more times higher than the frequency corresponding to the initial pitch period; this may improve the handling of frequency multiplication when estimating the pitch period. In addition, by tuning the pitch period through waveform matching, the error in estimating the pitch period may be reduced and the quality of the audio data may be improved.
    Type: Application
    Filed: November 2, 2009
    Publication date: February 25, 2010
    Inventors: Wuzhou Zhan, Dongqi Wang
  • Publication number: 20100049505
    Abstract: A method, device and system for concealing lost packets are provided. The provided method, device and system recover the lost frame according to the data before and after the lost frame and enhance the correlation between the recovered lost frame data and the data after the lost frame. A method and device for estimating pitch period are also provided, which select, as the final estimated pitch period, a pitch period from among the initial pitch period and the pitch periods corresponding to frequencies that are one or more times higher than the frequency corresponding to the initial pitch period; this may improve the handling of frequency multiplication when estimating the pitch period. In addition, by tuning the pitch period through waveform matching, the error in estimating the pitch period may be reduced and the quality of the audio data may be improved.
    Type: Application
    Filed: November 2, 2009
    Publication date: February 25, 2010
    Inventors: Wuzhou Zhan, Dongqi Wang
  • Publication number: 20100036657
    Abstract: The speech estimation system of the present invention includes a transmitter (2) for transmitting a test signal, a receiver (3) for receiving the test signal, and a speech estimation unit (4) for estimating speech from a received signal. Transmitter (2) transmits the test signal toward speech organs, receiver (3) receives the test signal that has been reflected by the speech organs, and speech estimation unit (4) estimates speech or speech waveforms based on the waveform of the reflection wave of the test signal that was received by the receiver (3).
    Type: Application
    Filed: November 20, 2007
    Publication date: February 11, 2010
    Inventors: Mitsunori Morisaki, Kenichi Ishii
  • Publication number: 20100017200
    Abstract: Disclosed is an encoding device which can accurately specify a band having a large error among all the bands by using a small calculation amount. The device includes: a first position identification unit (201) which uses a first layer error conversion coefficient indicating an error of decoding signal for an input signal so as to search for a band having a large error in a relatively wide bandwidth in all the bands of the input signal and generates first position information indicating the identified band; a second position identification unit (202) which searches for a target frequency band having a large error in a relatively narrow bandwidth in the band identified by the first position identification unit (201) and generates second position information indicating the identified target frequency band; and an encoding unit (203) which encodes a first layer decoding error conversion coefficient contained in the target frequency band.
    Type: Application
    Filed: February 29, 2008
    Publication date: January 21, 2010
    Applicant: PANASONIC CORPORATION
    Inventors: Masahiro Oshikiri, Tomofumi Yamanashi, Toshiyuki Morii
  • Publication number: 20100014840
    Abstract: An information processing apparatus includes a viewer information input unit that receives input of information about a viewer viewing reproduced program content as viewer information by watching video displayed on a monitor or listening to a voice output from a speaker, an upsurge degree acquisition unit that acquires a degree of upsurge of the viewer based on the viewer information whose input is received by the viewer information input unit, and a highlight extraction unit that extracts highlights of the program content based on the degree of upsurge acquired by the upsurge degree acquisition unit.
    Type: Application
    Filed: June 30, 2009
    Publication date: January 21, 2010
    Applicant: Sony Corporation
    Inventor: Tsutomu Nagai
  • Publication number: 20100010810
    Abstract: When a decoding audio signal is to be acquired by pitch-filtering a combined signal of a sub-frame length, a decoding audio signal is continuously changed at the boundary between sub-frames. The post filter includes: a first filter coefficient calculation unit (306) which obtains a pitch filter coefficient gP(0) of a current frame so as to asymptotically approach the intensity g of the pitch filter from an initial value 0; a second filter coefficient calculation unit (307) which obtains a pitch filter coefficient gP(-1) of a preceding frame so as to asymptotically approach 0 by setting the initial value to the value of the pitch filter coefficient obtained by the first filter coefficient calculation unit (306); a filter state setting unit (308) which sets a pitch filter state fsi for each of the sub-frames; and a pitch filter (309) which pitch-filters the combined signal xi by using the pitch filter coefficients gP(-1), gP(0), and past demodulation audio signals yi-P(-1), yi-P(0).
    Type: Application
    Filed: December 13, 2007
    Publication date: January 14, 2010
    Applicant: PANASONIC CORPORATION
    Inventor: Toshiyuki Morii
  • Publication number: 20090326951
    Abstract: Ratios of powers at the peaks of respective formants of the spectrum of a pitch-cycle waveform and powers at boundaries between the formants are obtained and, when the ratios are large, bandwidth of window functions are widened and the formant waveforms are generated by multiplying generated sinusoidal waveforms from the formant parameter sets on the basis of pitch-cycle waveform generating data by the window functions of the widened bandwidth, whereby a pitch-cycle waveform is generated by the sum of these formant waveforms.
    Type: Application
    Filed: April 14, 2009
    Publication date: December 31, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Ryo Morinaka, Takehiko Kagoshima
  • Publication number: 20090326930
    Abstract: Provided is an audio decoding device capable of suppressing an information amount for a lost frame compensation process and encoding efficiency. In this device, a decoded sound source generation unit (203) generates a lost frame decoded sound source signal; a pitch pulse information decoding unit (204) decodes the pitch pulse position information and the pitch pulse amplitude information; a pitch pulse waveform learning unit (205) learns a pitch pulse learning waveform in the past frame in advance from the lost frame; a convolution unit (206) amplitude-adjusts the pitch pulse learning waveform according to the pitch pulse amplitude information, and convolutes the pitch pulse waveform into a time axis which has been amplitude-adjusted according to the pitch pulse position information; a sound source signal correction unit (207) adds or replaces the pitch pulse waveform convoluted into the time axis to the lost frame decoded sound source signal.
    Type: Application
    Filed: July 11, 2007
    Publication date: December 31, 2009
    Applicant: PANASONIC CORPORATION
    Inventors: Takuya Kawashima, Hiroyuki Ehara, Koji Yoshida
  • Publication number: 20090299736
    Abstract: To provide a speech coding technology that realizes a low bit rate and can suppress distortion of reproduction speech as compared with a conventional technology. There are provided pitch detecting means 5 that detects a pitch frequency of an input speech signal, residual calculating means 6 that calculates the difference (residual frequency) between the pitch frequency and a reference frequency, a frequency shifter 4 that shifts a frequency of the input speech signal in proportion to the residual frequency in a direction that brings it closer to the reference frequency and equalizes a pitch period, orthogonal transforming means that orthogonally transforms the speech signal (pitch-equalizing speech signal) output by the frequency shifter 4 by a constant number of the pitch intervals and generates transforming coefficient data, and waveform coding means that encodes the transforming coefficient data.
    Type: Application
    Filed: March 24, 2006
    Publication date: December 3, 2009
    Applicant: KYUSHU INSTITUTE OF TECHNOLOGY
    Inventor: Yasushi Sato
  • Publication number: 20090281797
    Abstract: A bit error concealment (BEC) system and method is described herein that detects and conceals the presence of click-like artifacts in an audio signal caused by bit errors introduced during transmission of the audio signal within an audio communications system. A particular embodiment of the present invention utilizes a low-complexity design that introduces no added delay and that is particularly well-suited for applications such as Bluetooth® wireless audio devices which have low cost and low power dissipation requirements.
    Type: Application
    Filed: April 28, 2009
    Publication date: November 12, 2009
    Applicant: BROADCOM CORPORATION
    Inventors: Robert W. Zopf, Vivek Kumar, Juin-Hwey Chen
  • Publication number: 20090240490
    Abstract: A method and apparatus for concealing frame loss and an apparatus for transmitting and receiving a speech signal that are capable of reducing speech quality degradation caused by packet loss are provided. In the method, when loss of a current received frame occurs, a random excitation signal having the highest correlation with a periodic excitation signal (i.e., a pitch excitation signal) decoded from a previous frame received without loss is used as a noise excitation signal to recover an excitation signal of a current lost frame. Furthermore, a third, new attenuation constant (AS) is obtained by summing a first attenuation constant (NS) obtained based on the number of continuously lost frames and a second attenuation constant (PS) predicted in consideration of change in amplitude of previously received frames to adjust the amplitude of the recovered excitation signal for the current lost frame.
    Type: Application
    Filed: January 9, 2009
    Publication date: September 24, 2009
    Applicant: Gwangju Institute of Science and Technology
    Inventors: Hong Kook Kim, Choong Sang Cho
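    The selection and attenuation steps described in the abstract above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names, the fixed candidate list, the unnormalized dot-product correlation, and the way the periodic and noise excitations are summed are all assumptions. NS and PS are taken as given inputs because the abstract does not say how they are computed.

```python
def correlation(a, b):
    """Unnormalized correlation (dot product) of two equal-length signals."""
    return sum(x * y for x, y in zip(a, b))


def pick_noise_excitation(pitch_excitation, candidates):
    """Return the candidate random excitation most correlated with the
    periodic (pitch) excitation decoded from the last good frame."""
    return max(candidates, key=lambda c: correlation(pitch_excitation, c))


def recover_excitation(pitch_excitation, candidates, ns, ps):
    """Rebuild the excitation of a lost frame.

    ns: attenuation constant based on the number of consecutive lost frames
    ps: attenuation constant predicted from recent amplitude change
    Both are inputs here; how they are derived is outside this sketch.
    """
    attenuation = ns + ps  # third constant: AS = NS + PS
    noise = pick_noise_excitation(pitch_excitation, candidates)
    # Assumption: the recovered excitation is the periodic part plus the
    # chosen noise part, scaled by the combined attenuation constant.
    return [attenuation * (p + n) for p, n in zip(pitch_excitation, noise)]


# Hypothetical toy data: three candidate "random" excitations.
pitch = [1.0, -1.0, 1.0, -1.0]
candidates = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, -1.0, 1.0, -1.0],
    [-1.0, 1.0, -1.0, 1.0],
]
recovered = recover_excitation(pitch, candidates, ns=0.25, ps=0.25)
print(recovered)
```

    In this toy case the second candidate has the highest correlation with the pitch excitation, so it is the one used as the noise excitation.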
  • Publication number: 20090222260
    Abstract: A method and system for multi-channel detection of pitch may comprise one or more of the following steps and/or means therefor: (a) sampling an audio input stream including at least a first channel and a second channel; (b) setting a search frequency for each of the first channel and the second channel; and (c) detecting a pitch of the first channel and a pitch of the second channel.
    Type: Application
    Filed: March 2, 2009
    Publication date: September 3, 2009
    Inventor: David W. Petr
  • Publication number: 20090222259
    Abstract: A feature extraction apparatus includes a spectrum calculating unit that calculates, based on an input speech signal, a frequency spectrum having frequency components obtained at regular intervals on a logarithmic frequency scale for each of frames that are defined by regular time intervals, and thereby generates a time series of the frequency spectrum; a cross-correlation coefficients calculating unit that calculates, for each target frame of the frames, cross-correlation coefficients between frequency spectra calculated for two different frames that are in the vicinity of the target frame and a predetermined frame width apart from each other; and a shift amount predicting unit that predicts a shift amount of the frequency spectra on the logarithmic frequency scale with respect to the predetermined frame width by use of the cross-correlation coefficients.
    Type: Application
    Filed: February 5, 2009
    Publication date: September 3, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Yusuke Kida, Takashi Masuko
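    One way to picture the shift prediction in the abstract above is a search for the bin offset that maximizes the cross-correlation between two log-frequency spectra. The sketch below assumes exactly that (an argmax over integer shifts); the abstract does not state how the shift is predicted from the coefficients, and the function names and the `bins_per_octave` parameter are hypothetical.

```python
def cross_correlation(spec_a, spec_b, shift):
    """Cross-correlation coefficient of spec_a against spec_b displaced by
    `shift` bins along the (logarithmic) frequency axis."""
    total = 0.0
    for i, a in enumerate(spec_a):
        j = i + shift
        if 0 <= j < len(spec_b):
            total += a * spec_b[j]
    return total


def predict_shift(spec_a, spec_b, max_shift):
    """Assumed predictor: the integer shift with the largest coefficient."""
    return max(
        range(-max_shift, max_shift + 1),
        key=lambda s: cross_correlation(spec_a, spec_b, s),
    )


def shift_to_frequency_ratio(shift_bins, bins_per_octave):
    """On a log-frequency axis a shift of s bins scales every frequency
    by 2 ** (s / bins_per_octave)."""
    return 2.0 ** (shift_bins / bins_per_octave)


# Toy spectra from two frames: the same peak, moved up by two bins.
earlier = [0, 0, 1, 3, 1, 0, 0, 0]
later = [0, 0, 0, 0, 1, 3, 1, 0]
shift = predict_shift(earlier, later, max_shift=3)
ratio = shift_to_frequency_ratio(shift, bins_per_octave=24)
print(shift, ratio)
```

    Because the spectra are sampled on a logarithmic scale, a pitch change appears as a pure translation of the harmonic pattern, which is why a single shift value can summarize it.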
  • Publication number: 20090210220
    Abstract: A speech analyzer includes a speech acquiring section, a frequency converting section, an autocorrelation section, and a pitch detection section. The frequency converting section converts the speech signal acquired by the speech acquiring section into a frequency spectrum. The autocorrelation section determines an autocorrelation waveform by shifting the frequency spectrum along the frequency axis. The pitch detection section determines the pitch frequency from the distance between two local crests or troughs of the autocorrelation waveform.
    Type: Application
    Filed: June 2, 2006
    Publication date: August 20, 2009
    Inventors: Shunji Mitsuyoshi, Kaoru Ogata, Fumiaki Monma
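    The pipeline in the abstract above (frequency conversion, autocorrelation along the frequency axis, crest spacing) can be sketched in a few lines. This is an illustrative reading only: the naive DFT, the crest threshold of 10% of the zero-shift value, and all function names are assumptions made for the example.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Magnitude of a naive DFT for bins 0..N/2 (fine for short frames)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        spectrum.append(abs(acc))
    return spectrum


def spectral_autocorrelation(spectrum, max_shift):
    """Correlate the spectrum with itself shifted along the frequency axis."""
    return [
        sum(a * b for a, b in zip(spectrum, spectrum[shift:]))
        for shift in range(max_shift + 1)
    ]


def crest_positions(values, floor):
    """Indices of local maxima that rise above `floor`."""
    return [
        k
        for k in range(1, len(values) - 1)
        if values[k] > floor
        and values[k] > values[k - 1]
        and values[k] >= values[k + 1]
    ]


def estimate_pitch_hz(samples, sample_rate):
    """Pitch frequency from the spacing of two adjacent crests."""
    spectrum = magnitude_spectrum(samples)
    acf = spectral_autocorrelation(spectrum, len(spectrum) // 2)
    crests = crest_positions(acf, floor=0.1 * acf[0])  # assumed threshold
    if len(crests) < 2:
        return None
    bin_hz = sample_rate / len(samples)
    return (crests[1] - crests[0]) * bin_hz


# Synthetic voiced frame: five equal harmonics of 200 Hz at 8 kHz sampling.
SAMPLE_RATE = 8000
frame = [
    sum(math.sin(2 * math.pi * 200 * h * t / SAMPLE_RATE) for h in range(1, 6))
    for t in range(400)
]
print(estimate_pitch_hz(frame, SAMPLE_RATE))
```

    Measuring the distance between crests, rather than the position of the first crest, means the estimate depends on the harmonic spacing and not on whether the fundamental itself is present in the spectrum.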
  • Publication number: 20090204396
    Abstract: The present disclosure relates to a decoding method and apparatus. The method includes: receiving data frames from the coder; if any erroneous frame appears, calculating a pitch lag parameter of the erroneous frame; decoding the data frames according to the calculated pitch lag parameter of the erroneous frame, and obtaining decoded data. The process of determining the pitch lag parameter includes: determining the number of continuous erroneous frames and the pitch lag parameter of the previous frame; adjusting the pitch lag parameter of the previous frame according to the number of the continuous erroneous frames and a preset adjustment policy, and calculating and determining the pitch lag parameter of a current erroneous frame, wherein the preset adjustment policy is adjusting the determined pitch lag parameter of the current erroneous frame within a preset value range according to the number of the continuous erroneous frames.
    Type: Application
    Filed: April 20, 2009
    Publication date: August 13, 2009
    Inventors: Jianfeng Xu, Lijing Xu, Qing Zhang, Wei Li, Shenghu Sang, Zhengzhong Du, Chen Hu
  • Publication number: 20090192789
    Abstract: Provided are a method and apparatus for effectively encoding/decoding remaining difference signals excluding sinusoidal components, from input audio signals. In the method and apparatus for encoding audio signals, sinusoidal analysis is performed on low frequency signals of less than a predetermined critical frequency in order to extract sinusoidal signals and then, an encoding operation is performed on the remaining difference signals excluding the sinusoidal signals, from input audio signals, by using linear prediction coding (LPC) analysis.
    Type: Application
    Filed: January 29, 2009
    Publication date: July 30, 2009
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Geon-hyoung LEE, Chul-woo Lee, Jong-hoon Jeong, Nam-suk Lee, Han-gil Moon
  • Publication number: 20090192788
    Abstract: In a sound processing device, a modulation spectrum specifier specifies a modulation spectrum of an input sound for each of a plurality of unit intervals. An index calculator calculates an index value corresponding to a magnitude of components of modulation frequencies belonging to a predetermined range of the modulation spectrum. A determinator determines whether the input sound of each of the unit intervals is a vocal sound or a non-vocal sound based on the index value.
    Type: Application
    Filed: January 23, 2009
    Publication date: July 30, 2009
    Applicant: Yamaha Corporation
    Inventor: Yasuo YOSHIOKA
  • Publication number: 20090182556
    Abstract: Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, a method of processing a signal representing speech can comprise receiving a frame of the signal representing speech, classifying the frame as a voiced frame, and parsing the voiced frame into one or more regions based on occurrence of one or more events within the voiced frame. For example, the one or more events can comprise one or more glottal pulses. The one or more regions may collectively represent less than all of the voiced frame.
    Type: Application
    Filed: October 23, 2008
    Publication date: July 16, 2009
    Applicant: Red Shift Company, LLC
    Inventors: Erik N. Reckase, John F. Remillard
  • Publication number: 20090177464
    Abstract: A speech encoder analyzes and classifies each frame of speech as being periodic-like speech or non-periodic-like speech, and performs a different gain quantization process depending on whether the speech is periodic. If the speech is periodic, the improved speech encoder obtains the pitch gains from the unquantized weighted speech signal and performs a pre-vector quantization of the adaptive codebook gain GP for each subframe of the frame before subframe processing begins and a closed-loop delayed decision vector quantization of the fixed codebook gain GC. If the frame of speech is non-periodic, the speech encoder may use any known method of gain quantization.
    Type: Application
    Filed: March 6, 2009
    Publication date: July 9, 2009
    Applicants: MINDSPEED TECHNOLOGIES, INC.
    Inventors: Yang Gao, Adil Benyassine
  • Publication number: 20090171656
    Abstract: The invention concerns a method and apparatus for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder that does not have a built-in or standard FEC process. A receiver with a decoder receives encoded frames of compressed speech information transmitted from an encoder. A lost frame detector at the receiver determines if an encoded frame has been lost or corrupted in transmission, or erased. If the encoded frame is not erased, the encoded frame is decoded by a decoder and a temporary memory is updated with the decoder's output. A predetermined delay period is applied and the audio frame is then output. If the lost frame detector determines that the encoded frame is erased, a FEC module applies a frame concealment process to the signal. The FEC processing produces natural sounding synthetic speech for the erased frames.
    Type: Application
    Filed: February 20, 2009
    Publication date: July 2, 2009
    Inventor: David A. Kapilow
  • Publication number: 20090157395
    Abstract: In accordance with one aspect of the invention, a selector supports the selection of a first encoding scheme or the second encoding scheme based upon the detection or absence of the triggering characteristic in the interval of the input speech signal. The first encoding scheme has a pitch pre-processing procedure for processing the input speech signal to form a revised speech signal biased toward an ideal voiced and stationary characteristic. The pre-processing procedure allows the encoder to fully capture the benefits of a bandwidth-efficient, long-term predictive procedure for a greater amount of speech components of an input speech signal than would otherwise be possible. In accordance with another aspect of the invention, the second encoding scheme entails a long-term prediction mode for encoding the pitch on a sub-frame by sub-frame basis.
    Type: Application
    Filed: January 26, 2009
    Publication date: June 18, 2009
    Applicants: MINDSPEED TECHNOLOGIES, INC.
    Inventors: Huan-Yu Su, Yang Gao
  • Publication number: 20090125298
    Abstract: The technology disclosed relates to audio signal processing. It includes a series of modules that individually are useful to solve audio signal processing problems. Among the problems addressed are buzz removal, selecting a pitch candidate among pitch candidates based on local continuity of pitch and regional octave consistency, making small adjustments in pitch, ensuring that a selected pitch is consistent with harmonic peaks, determining whether a given frame or region of frames includes harmonic, voiced signal, extracting harmonics from voice signals and detecting vibrato. One environment in which these modules are useful is transcribing singing or humming into a symbolic melody. Another environment that would usefully employ some of these modules is speech processing. Some of the modules, such as buzz removal, are useful in many other environments as well.
    Type: Application
    Filed: November 3, 2008
    Publication date: May 14, 2009
    Applicant: Melodis Inc.
    Inventors: Aaron Master, Seyed Majid Emami
  • Publication number: 20090125300
    Abstract: A scalable encoding apparatus capable of reducing the bit rates of encoded parameters and also capable of efficiently encoding even audio signals in which a plurality of harmonic structures are coexistent. In the apparatus, an MDCT analyzing part (111) MDCT analyzes an audio signal (S15) for converting/encoding processes. A pitch frequency converting part (112) determines the inverse of a pitch period to calculate a pitch frequency. A selecting part (113) selects spectra located at frequencies that are integral multiples of the pitch frequency. A second layer encoding part (106) encodes the selected spectra.
    Type: Application
    Filed: October 26, 2005
    Publication date: May 14, 2009
    Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.
    Inventor: Masahiro Oshikiri
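    The pitch frequency conversion and spectrum selection steps in the abstract above reduce to simple arithmetic. The sketch below assumes uniformly spaced bins of width (sample_rate / 2) / num_bins and picks the bin that contains each harmonic; the function name and this bin mapping are illustrative assumptions, not the applicant's selection rule.

```python
def harmonic_bin_indices(pitch_period_samples, sample_rate, num_bins):
    """Indices of spectral bins located at integer multiples of the pitch
    frequency, where the pitch frequency is the inverse of the pitch period."""
    pitch_hz = sample_rate / pitch_period_samples  # inverse of the period
    nyquist_hz = sample_rate / 2
    bin_width_hz = nyquist_hz / num_bins  # assumed uniform bin spacing
    indices = []
    harmonic = 1
    while harmonic * pitch_hz < nyquist_hz:
        index = int(harmonic * pitch_hz / bin_width_hz)
        if index < num_bins:
            indices.append(index)
        harmonic += 1
    return indices


# Pitch period of 80 samples at 16 kHz is a 200 Hz pitch; 320 bins of 25 Hz.
selected = harmonic_bin_indices(80, 16000, 320)
print(selected[:5], len(selected))
```

    Only the selected bins would then be passed to the second-layer encoder, which is how the scheme reduces the number of parameters to be coded.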
  • Publication number: 20090119096
    Abstract: A system enhances the quality of a digital speech signal that may include noise. The system identifies vocal expressions that correspond to the digital speech signal. A signal-to-noise ratio of the digital speech signal is measured before a portion of the digital speech signal is synthesized. The selected portion of the digital speech signal may have a signal-to-noise ratio below a predetermined level and the synthesis of the digital speech signal may be based on speaker identification.
    Type: Application
    Filed: October 20, 2008
    Publication date: May 7, 2009
    Inventors: Franz Gerl, Tobias Herbig, Mohamed Krini, Gerhard Uwe Schmidt
  • Publication number: 20090119097
    Abstract: The technology disclosed relates to audio signal processing. It includes a series of modules that individually are useful to solve audio signal processing problems. Among the problems addressed are buzz removal, selecting a pitch candidate among pitch candidates based on local continuity of pitch and regional octave consistency, making small adjustments in pitch, ensuring that a selected pitch is consistent with harmonic peaks, determining whether a given frame or region of frames includes harmonic, voiced signal, extracting harmonics from voice signals and detecting vibrato. One environment in which these modules are useful is transcribing singing or humming into a symbolic melody. Another environment that would usefully employ some of these modules is speech processing. Some of the modules, such as buzz removal, are useful in many other environments as well.
    Type: Application
    Filed: November 3, 2008
    Publication date: May 7, 2009
    Applicant: Melodis Inc.
    Inventors: Aaron Master, Seyed Majid Emami
  • Publication number: 20090076805
    Abstract: The present invention discloses a method for performing a frame erasure concealment to a higher-band signal, including: calculating a periodic intensity of a higher-band signal with respect to a lower-band signal; judging whether the periodic intensity of the higher-band signal is higher than or equal to a preconfigured threshold; if the periodic intensity of the higher-band signal is higher than or equal to the preconfigured threshold, using a pitch period repetition method to perform the frame erasure concealment to the higher-band signal of a current lost frame; and if the periodic intensity of the higher-band signal is lower than the preconfigured threshold, using a previous frame data repetition method to perform the frame erasure concealment to the higher-band signal of the current lost frame. The present invention further discloses a device for performing a frame erasure concealment to a higher-band signal and a speech decoder. The problem that the quality of the voice signal is lowered is avoided.
    Type: Application
    Filed: May 29, 2008
    Publication date: March 19, 2009
    Applicant: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Jianfeng Xu, Lei Miao, Chen Hu, Qing Zhang, Lijing Xu, Wei Li, Zhengzhong Du, Yi Yang, Fengyan Qi, Wuzhou Zhan, Dongqi Wang
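    The threshold decision in the abstract above can be sketched as below. The periodic-intensity measure used here (a normalized correlation of the higher-band history at a pitch lag assumed to come from the lower band), the default threshold of 0.5, and the function names are all assumptions for illustration; the abstract does not define them.

```python
import math


def periodic_intensity(history, lag):
    """Normalized correlation of the signal with itself delayed by `lag`."""
    pairs = list(zip(history[lag:], history[:-lag]))
    num = sum(a * b for a, b in pairs)
    den = math.sqrt(sum(a * a for a, _ in pairs) * sum(b * b for _, b in pairs))
    return num / den if den > 0 else 0.0


def conceal_higher_band(history, lag, frame_len, threshold=0.5):
    """Synthesize the higher-band signal of a lost frame.

    history: past higher-band samples (at least frame_len long)
    lag:     pitch period in samples (assumed to come from the lower band)
    """
    if periodic_intensity(history, lag) >= threshold:
        # Pitch period repetition: tile the most recent pitch cycle.
        period = history[-lag:]
        return [period[i % lag] for i in range(frame_len)]
    # Previous frame data repetition: reuse the last frame as is.
    return history[-frame_len:]


periodic = [1.0, 2.0, 3.0, 4.0] * 5
print(conceal_higher_band(periodic, lag=4, frame_len=6))

alternating = [1.0, -1.0] * 10
print(conceal_higher_band(alternating, lag=1, frame_len=4))
```

    The first call takes the pitch-repetition branch because the history repeats exactly every four samples; the second falls back to previous-frame repetition because the correlation at the given lag is negative.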