Abstract: The present invention provides pitch conversion processing technology capable of minimizing the loss of naturalness in speech sound. A speech waveform in a pitch unit is considered to be divided into two segments: 1) the segment ?, which starts at the minus peak and in which the waveform that depends on the shape of the vocal tract appears, and 2) the segment ?, in which the waveform that depends on the vocal tract shape attenuates and converges toward the next minus peak. In addition, ? is the point where a minus peak appears along with the glottal closure. Based on these characteristics of speech waveforms, the present invention processes the waveform for pitch conversion in the segment ? just before the next minus peak, which is least affected by the minus peak associated with the glottal closure. As such, waveform processing can be performed while keeping the complete contour of the waveform around the peak, thereby reducing the effects of pitch conversion.
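The idea in this abstract can be illustrated with a minimal sketch. It assumes one pitch period is already available as a list of samples that begins at a minus peak, and it changes the period length by touching only the last few samples before the next minus peak. The function names, the `tail_len` parameter, and the use of linear interpolation are assumptions made for illustration; the abstract does not state how the segment is actually processed.

```python
def resample_linear(xs, m):
    """Linearly interpolate the samples in xs onto m evenly spaced points."""
    if m == 1:
        return [xs[0]]
    out = []
    for i in range(m):
        pos = i * (len(xs) - 1) / (m - 1)
        lo = int(pos)
        hi = min(lo + 1, len(xs) - 1)
        frac = pos - lo
        out.append(xs[lo] * (1 - frac) + xs[hi] * frac)
    return out


def convert_pitch_period(period, target_len, tail_len):
    """Change the length of one pitch period to target_len samples.

    period   -- samples of one pitch period, starting at a minus peak
    tail_len -- hypothetical number of samples, just before the next
                minus peak, that may be modified (an assumption)
    Only the tail is stretched or compressed; the samples around the
    minus peak at the start of the period are copied unchanged.
    """
    if not 1 <= tail_len <= len(period):
        raise ValueError("tail_len must be between 1 and len(period)")
    head = period[:len(period) - tail_len]
    tail = period[len(period) - tail_len:]
    new_tail_len = target_len - len(head)
    if new_tail_len < 1:
        raise ValueError("target_len is too short to keep the head intact")
    return head + resample_linear(tail, new_tail_len)


period = [-1.0, 0.5, 0.3, 0.2, 0.1, 0.05, 0.0, 0.0, 0.0, 0.0]
shorter = convert_pitch_period(period, target_len=8, tail_len=3)   # raises pitch
longer = convert_pitch_period(period, target_len=12, tail_len=3)   # lowers pitch
```

In both calls the first seven samples, including the minus peak, are identical to the input, which mirrors the stated goal of keeping the waveform contour around the peak intact.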
Abstract: Given phonetic information is divided into speech units of extended CV, each of which is a contiguous sequence of phonemes without clear distinction that contains one or more vowels. The contour of the vocal tract transmission function of each phoneme of the extended CV speech unit is obtained from the phoneme directory, which contains a contour of the vocal tract transmission function of each phoneme associated with phonetic information in units of extended CV. Speech waveform data is generated based on the contour of the vocal tract transmission function of each phoneme of the extended CV speech unit. The speech waveform data is converted into an analog voice signal.
Abstract: Sound generating parameters are used for outputting fundamental frequency and a command regarding prosody, and a sound source generator. The sound generation device further includes use of an accent command and a descent command for calculating fundamental frequency and incorporates a rhythm command, which is representable by a sine wave. The device also uses character string analysis for analyzing a character string and generating a command concerning phoneme and prosody, a calculating element for outputting fundamental frequency as sound generation parameters, which depends on prosody, a sound source generator, and an articulator that depends on a phoneme command.
Abstract: This invention relates to converting the characteristics of sounds such as voices, musical tones, natural sounds, and so on, and more specifically to facilitating the conversion operation, and also to sound-label association suitable for the characteristic conversion. Various embodiments of the invention comprise several of the following elements to provide useful results: sound-label data holding means, display control means, conversion means, sound-label dividing means, label-data dividing means, association forming means, data inputting means, and communication means. Other embodiments of the invention may be practiced as processes or articles of manufacture.
Type:
Grant
Filed:
March 11, 1997
Date of Patent:
September 21, 1999
Assignees:
Arcadia, Inc., ATR Human Information Processing Research Laboratories, Co., Inc.