Patents by Inventor Toshiaki Fukada

Toshiaki Fukada has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7054814
    Abstract: A speech segment search unit searches a speech database for speech segments that satisfy a phonetic environment, and an HMM learning unit computes the HMMs of phonemes on the basis of the search result. A segment recognition unit performs segment recognition of the speech segments on the basis of the computed phoneme HMMs, and when the phoneme of the segment recognition result matches the phoneme of the source speech segment, that speech segment is registered in a segment dictionary.
    Type: Grant
    Filed: March 29, 2001
    Date of Patent: May 30, 2006
    Assignee: Canon Kabushiki Kaisha
    Inventors: Yasuo Okutani, Yasuhiro Komori, Toshiaki Fukada
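The registration criterion in the abstract above, keep a segment only if a model trained on the whole set recognizes it as its own phoneme, can be sketched as follows. This is a minimal illustration: a nearest-centroid classifier over one-dimensional features stands in for the phoneme HMMs, and all phoneme labels and feature values are hypothetical.

```python
# Sketch of the segment-filtering idea: train a per-phoneme model, then keep
# only segments whose recognized phoneme matches their source label.
# A nearest-centroid classifier stands in for the phoneme HMMs here.

def train_centroids(segments):
    """Compute one centroid feature value per phoneme label."""
    sums, counts = {}, {}
    for phoneme, feature in segments:
        sums[phoneme] = sums.get(phoneme, 0.0) + feature
        counts[phoneme] = counts.get(phoneme, 0) + 1
    return {p: sums[p] / counts[p] for p in sums}

def build_segment_dictionary(segments):
    """Register a segment only if it is recognized as its own phoneme."""
    centroids = train_centroids(segments)
    dictionary = []
    for phoneme, feature in segments:
        recognized = min(centroids, key=lambda p: abs(centroids[p] - feature))
        if recognized == phoneme:        # recognition agrees with the label
            dictionary.append((phoneme, feature))
    return dictionary

segments = [("a", 1.0), ("a", 1.2), ("i", 3.0), ("i", 3.1), ("a", 2.9)]
# The last "a" segment sits nearer the "i" centroid, so it is rejected.
print(build_segment_dictionary(segments))
```

An outlier segment (here the "a" at 2.9) is the kind of mislabeled or atypical unit the recognition check is meant to keep out of the dictionary.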
  • Publication number: 20060069566
    Abstract: A segment set is read before updating, and clustering that takes the phoneme environment into account is performed on it. For each cluster obtained by the clustering, a representative segment of the segments belonging to the cluster is generated. The segment set is then updated by replacing each segment belonging to a cluster with that cluster's representative segment.
    Type: Application
    Filed: September 14, 2005
    Publication date: March 30, 2006
    Applicant: Canon Kabushiki Kaisha
    Inventors: Toshiaki Fukada, Masayuki Yamada, Yasuhiro Komori
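The cluster-and-replace update in the abstract above amounts to the following sketch, under simplifying assumptions: the phoneme environment is a string key, features are one-dimensional, and the cluster mean serves as the representative segment. Keys and values are illustrative.

```python
# Sketch of segment-set compaction: cluster segments by phoneme environment,
# generate one representative per cluster (here the mean feature), and
# replace every cluster member with that representative.

def compact_segment_set(segments):
    """segments: list of (environment, feature). Returns the updated set."""
    clusters = {}
    for env, feature in segments:
        clusters.setdefault(env, []).append(feature)
    # representative segment = mean feature of the cluster
    reps = {env: sum(feats) / len(feats) for env, feats in clusters.items()}
    return [(env, reps[env]) for env, _ in segments]

segments = [("a-k", 1.0), ("a-k", 3.0), ("i-t", 5.0)]
print(compact_segment_set(segments))  # both "a-k" members collapse to 2.0
```

Replacing members with representatives shrinks the effective inventory while preserving coverage of each phoneme environment.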
  • Publication number: 20050288929
    Abstract: A speech recognition apparatus includes a word dictionary having recognition target words, a first acoustic model which expresses a reference pattern of a speech unit by one or more states, a second acoustic model which is lower in precision than said first acoustic model, selection means for selecting one of said first acoustic model and said second acoustic model on the basis of a parameter associated with a state of interest, and likelihood calculation means for calculating a likelihood of an acoustic feature parameter with respect to said acoustic model selected by said selection means.
    Type: Application
    Filed: June 24, 2005
    Publication date: December 29, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Hideo Kuboyama, Toshiaki Fukada, Yasuhiro Komori
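The per-state model selection described above can be sketched as follows. The selection parameter, which the abstract leaves abstract, is assumed here to be a state visit count; the "precise" and "coarse" models are single Gaussians differing only in variance. All names and thresholds are hypothetical.

```python
# Sketch of state-dependent acoustic-model selection: for each state, choose
# the high-precision or the low-precision model based on a per-state
# parameter (here an assumed visit count), then compute the likelihood of
# the feature with the chosen model.
import math

def gaussian_ll(x, mean, var):
    """Log likelihood of x under a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def score_state(feature, state):
    """Select a model for this state, then score the feature with it."""
    if state["visit_count"] >= 100:
        mean, var = state["precise"]    # first (high-precision) model
    else:
        mean, var = state["coarse"]     # second (low-precision) model
    return gaussian_ll(feature, mean, var)

hot = {"visit_count": 500, "precise": (0.0, 1.0), "coarse": (0.0, 4.0)}
cold = {"visit_count": 10, "precise": (0.0, 1.0), "coarse": (0.0, 4.0)}
# A feature near the mean scores higher under the tighter, precise model.
print(score_state(0.1, hot) > score_state(0.1, cold))
```

The point of the scheme is that cheap models can serve rarely visited states while precise models are reserved for states where accuracy matters.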
  • Publication number: 20050267747
    Abstract: In a system implementing image retrieval by performing speech recognition on voice information added to an image, the speech recognition is triggered by an event, such as an image upload event, that is not an explicit speech-recognition order event. The system obtains voice information added to an image, detects an event, and performs speech recognition on the obtained voice information in response to a specific event, even if the detected event is not an explicit speech-recognition order event.
    Type: Application
    Filed: May 23, 2005
    Publication date: December 1, 2005
    Applicant: Canon Kabushiki Kaisha
    Inventors: Kenichiro Nakagawa, Makoto Hirota, Hiromi Ikeda, Tsuyoshi Yagisawa, Hiroki Yamamoto, Toshiaki Fukada, Yasuhiro Komori
  • Publication number: 20050216261
    Abstract: A signal processing apparatus and method for performing robust endpoint detection of a signal are provided. An input signal sequence is divided into frames, each of which has a predetermined time length, and the presence of the signal in each frame is detected. A filter process that smooths the detection result for the current frame by using the detection results for past frames is then applied. The filter output is compared with a predetermined threshold value, and the state of the signal sequence in the current frame is determined on the basis of the comparison result.
    Type: Application
    Filed: March 18, 2005
    Publication date: September 29, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Philip Garner, Toshiaki Fukada, Yasuhiro Komori
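The smoothed endpoint decision above can be illustrated with a simple exponential filter over raw per-frame detections; the filter constant and threshold are illustrative, not taken from the patent.

```python
# Sketch of the smoothed endpoint decision: a raw per-frame speech/non-speech
# detection is filtered with the past frames' results before thresholding,
# so an isolated glitch frame does not flip the state.

def detect_states(frame_scores, alpha=0.5, threshold=0.5):
    """Exponentially smooth raw detections, then threshold each frame."""
    smoothed, states = 0.0, []
    for score in frame_scores:
        raw = 1.0 if score > 0 else 0.0                  # presence in this frame
        smoothed = alpha * smoothed + (1 - alpha) * raw  # filter with past results
        states.append("speech" if smoothed > threshold else "silence")
    return states

# A single spurious positive frame inside silence does not trigger "speech".
print(detect_states([-1, -1, 2, -1, -1]))
```

Sustained positive frames, by contrast, drive the smoothed value over the threshold within a couple of frames, which is the robustness the abstract claims.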
  • Publication number: 20050209855
    Abstract: A speech segment search unit searches a speech database for speech segments that satisfy a phonetic environment, and an HMM learning unit computes the HMMs of phonemes on the basis of the search result. A segment recognition unit performs segment recognition of the speech segments on the basis of the computed phoneme HMMs, and when the phoneme of the segment recognition result matches the phoneme of the source speech segment, that speech segment is registered in a segment dictionary.
    Type: Application
    Filed: May 11, 2005
    Publication date: September 22, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Yasuo Okutani, Yasuhiro Komori, Toshiaki Fukada
  • Publication number: 20050131699
    Abstract: A speech recognition apparatus and method of this invention manage the previously input frequencies of occurrence of the geographical names to be recognized (202). Using a table (114) that describes the correspondence between the geographical names and their positions, the probability of occurrence of each geographical name of interest is updated on the basis of its own frequency of occurrence and the frequencies of geographical names located within a predetermined region including its position. This update process is performed for each geographical name to be recognized (203).
    Type: Application
    Filed: December 8, 2004
    Publication date: June 16, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventor: Toshiaki Fukada
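The neighborhood-weighted update above can be sketched as follows. The table of names and positions, the radius, and the scoring rule (summing the frequencies of all names within the radius, then normalizing) are illustrative assumptions, not the patent's exact formula.

```python
# Sketch of the neighborhood-weighted update: each place name's probability
# of occurrence reflects not only its own past frequency but also the
# frequencies of names whose positions fall within a fixed radius of it.

def update_probabilities(freq, positions, radius=1.5):
    """freq: name -> count; positions: name -> (x, y). Returns name -> prob."""
    def near(a, b):
        (ax, ay), (bx, by) = positions[a], positions[b]
        return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5 <= radius
    # score = own frequency plus frequencies of names in the surrounding region
    scores = {n: sum(freq[m] for m in freq if near(n, m)) for n in freq}
    total = sum(scores.values())
    return {n: s / total for n, s in scores.items()}

freq = {"Shinjuku": 4, "Shibuya": 2, "Sapporo": 1}
positions = {"Shinjuku": (0, 0), "Shibuya": (1, 0), "Sapporo": (10, 10)}
probs = update_probabilities(freq, positions)
# Shibuya benefits from neighboring Shinjuku's high frequency.
print(probs["Shibuya"] > probs["Sapporo"])
```

The effect is that a rarely spoken name near a frequently spoken one still gets a boosted prior, which matches how users tend to query nearby places.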
  • Publication number: 20050131689
    Abstract: Robust signal detection against various types of background noise is implemented. According to a signal detection apparatus and method of this invention, the feature amount of an input signal sequence and the feature amount of a noise component contained in the sequence are extracted. A first likelihood, indicating the probability that the signal sequence is detected, and a second likelihood, indicating the probability that the noise component is detected, are then calculated on the basis of a predetermined signal-to-noise ratio and the extracted feature amount of the signal sequence. A likelihood ratio between the first and second likelihoods is calculated, and detection of the signal sequence is determined on the basis of this ratio.
    Type: Application
    Filed: December 9, 2004
    Publication date: June 16, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Philip Garner, Toshiaki Fukada, Yasuhiro Komori
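The likelihood-ratio test above can be sketched with single Gaussians: the noise model comes from the observed noise statistics, the signal model places its mean at the level implied by the assumed signal-to-noise ratio, and a frame is detected when the ratio exceeds one. The models, the SNR value, and the unity threshold are illustrative assumptions.

```python
# Sketch of likelihood-ratio signal detection: compare the likelihood that a
# frame's feature comes from signal (at an assumed SNR above the noise floor)
# with the likelihood that it is noise; detect when the ratio exceeds 1.
import math

def gaussian(x, mean, var):
    """Density of x under a 1-D Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def detect(feature, noise_mean, noise_var, snr_db=10.0):
    """Return True if the frame is more likely signal than noise."""
    signal_mean = noise_mean * 10 ** (snr_db / 20)  # assumed SNR sets signal level
    l_signal = gaussian(feature, signal_mean, noise_var)  # first likelihood
    l_noise = gaussian(feature, noise_mean, noise_var)    # second likelihood
    return l_signal / l_noise > 1.0

print(detect(3.0, noise_mean=1.0, noise_var=1.0))  # loud frame: likely signal
print(detect(1.1, noise_mean=1.0, noise_var=1.0))  # near noise floor: not detected
```

Using a ratio rather than a single fixed energy threshold is what makes the decision track the noise statistics, which is the robustness the abstract claims.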
  • Publication number: 20050102139
    Abstract: In order to associate image data with speech data, a character detection unit detects a text region in the image data, and a character recognition unit recognizes characters in the text region. A speech detection unit detects a speech period in the speech data, and a speech recognition unit recognizes speech in that period. An image-and-speech associating unit associates the characters with the speech by performing at least one of character string matching and phonetic string matching between the recognized characters and the recognized speech. A portion of the image data and a portion of the speech data can therefore be associated with each other.
    Type: Application
    Filed: November 5, 2004
    Publication date: May 12, 2005
    Applicant: Canon Kabushiki Kaisha
    Inventor: Toshiaki Fukada
  • Publication number: 20050097439
    Abstract: In an information processing apparatus or method for presenting multimedia data, a storage unit holds an object contained in an image, such as a picture, characters, or symbols, together with sound data associated with the object. Metadata of the object is referred to, and an output parameter of the associated sound data is determined based on the metadata. A sound output unit then outputs the sound data at a sound volume or the like based on the output parameter.
    Type: Application
    Filed: October 12, 2004
    Publication date: May 5, 2005
    Applicant: Canon Kabushiki Kaisha
    Inventors: Hiromi Ikeda, Tsuyoshi Yagisawa, Toshiaki Fukada
  • Publication number: 20050089017
    Abstract: The present invention aims to improve user-friendliness in generating practical means of data access. To achieve this object, the present invention provides a data processing method of registering a path for data access and link data for the path. The method comprises: a generation step in which the link data candidate generation unit 202 generates a link data candidate based on a file accessed from the path input for data access; a display step in which the link data candidate exhibiting unit 203 displays the generated link data candidate; a recognition step in which the link data selection unit 204 recognizes the link data selected from the displayed candidates; and a registration step in which the link data registration unit 205 registers the recognized link data in association with the path of the accessed file.
    Type: Application
    Filed: October 25, 2004
    Publication date: April 28, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Toshiaki Fukada, Yasuhiro Komori
  • Publication number: 20050065795
    Abstract: In a voice synthesis apparatus, a desired range of the input text to be output is bounded by, e.g., a start tag “<morphing type=“emotion” start=“happy” end=“angry”>” and an end tag “</morphing>”, and a feature of the synthetic voice is changed continuously as it is output, gradually shifting from a happy voice to an angry voice.
    Type: Application
    Filed: August 10, 2004
    Publication date: March 24, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Masahiro Mutsuno, Toshiaki Fukada
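The gradual voice change described above is, at its core, an interpolation of voice features between the start and end emotions across the tagged span. The sketch below shows a linear interpolation of a single hypothetical feature (e.g., a pitch value in Hz); the actual patent may use different features and a different trajectory.

```python
# Sketch of emotion morphing: linearly interpolate a voice feature from its
# "start" emotion value to its "end" emotion value across the tagged span.

def morph(start_feat, end_feat, n_frames):
    """Return n_frames feature values moving linearly from start to end."""
    return [start_feat + (end_feat - start_feat) * i / (n_frames - 1)
            for i in range(n_frames)]

# e.g., a pitch-like feature shifting from a "happy" to an "angry" setting
print(morph(100.0, 200.0, 5))  # [100.0, 125.0, 150.0, 175.0, 200.0]
```

Each interpolated value would drive one stretch of synthesis, so the listener hears a continuous shift rather than an abrupt switch between voices.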
  • Publication number: 20050055207
    Abstract: A speech information processing apparatus synthesizes speech with natural intonation by modeling the time change in the fundamental frequency of a predetermined phoneme unit. When a predetermined unit of phonological series is input, the fundamental frequencies of the respective phonemes constructing the series are generated based on a segment pitch pattern model, and the phonemes are synthesized based on the generated fundamental frequencies.
    Type: Application
    Filed: October 18, 2004
    Publication date: March 10, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventor: Toshiaki Fukada
  • Patent number: 6826531
    Abstract: A speech information processing apparatus synthesizes speech with natural intonation by modeling the time change in the fundamental frequency of a predetermined phoneme unit. When a predetermined unit of phonological series is input, the fundamental frequencies of the respective phonemes constructing the series are generated based on a segment pitch pattern model, and the phonemes are synthesized based on the generated fundamental frequencies.
    Type: Grant
    Filed: March 28, 2001
    Date of Patent: November 30, 2004
    Assignee: Canon Kabushiki Kaisha
    Inventor: Toshiaki Fukada
  • Publication number: 20040215459
    Abstract: A speech information processing apparatus which sets the duration of a phonological series with accuracy, and sets natural phoneme durations in accordance with the phonemic/linguistic environment. For this purpose, the duration of a predetermined unit of phonological series is obtained based on a duration model for the entire segment. The duration of each phoneme constructing the series is then obtained based on a duration model for a partial segment (S303). Finally, the duration of each phoneme is set based on the duration of the phonological series and the duration of each phoneme.
    Type: Application
    Filed: May 25, 2004
    Publication date: October 28, 2004
    Applicant: CANON KABUSHIKI KAISHA
    Inventor: Toshiaki Fukada
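The two-level duration assignment above combines an entire-segment prediction with per-phoneme predictions. A natural way to combine them, shown below as an assumption rather than the patent's exact formula, is to rescale the per-phoneme durations so they sum to the series-level duration; the model outputs are illustrative constants.

```python
# Sketch of two-level duration assignment: one model gives the duration of
# the whole phonological series, another gives a duration per phoneme, and
# the per-phoneme values are rescaled to sum to the series duration.

def set_durations(phoneme_durations, series_duration):
    """Scale per-phoneme durations to match the series-level duration."""
    scale = series_duration / sum(phoneme_durations)
    return [d * scale for d in phoneme_durations]

per_phoneme = [80.0, 120.0, 100.0]   # ms, from the partial-segment model
series = 450.0                       # ms, from the entire-segment model
print(set_durations(per_phoneme, series))  # [120.0, 180.0, 150.0], sums to 450
```

The series-level model anchors the overall tempo while the per-phoneme model preserves the relative durations dictated by the phonemic environment.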
  • Patent number: 6778960
    Abstract: A speech information processing apparatus which sets the duration of a phonological series with accuracy, and sets natural phoneme durations in accordance with the phonemic/linguistic environment. For this purpose, the duration of a predetermined unit of phonological series is obtained based on a duration model for the entire segment. The duration of each phoneme constructing the series is then obtained based on a duration model for a partial segment. Finally, the duration of each phoneme is set based on the duration of the phonological series and the duration of each phoneme.
    Type: Grant
    Filed: March 28, 2001
    Date of Patent: August 17, 2004
    Assignee: Canon Kabushiki Kaisha
    Inventor: Toshiaki Fukada
  • Publication number: 20030229496
    Abstract: In a speech synthesis process, micro-segments are cut from acquired waveform data using a window function. The obtained micro-segments are re-arranged to implement a desired prosody, and superposed data is generated by superposing the re-arranged micro-segments to obtain synthetic speech waveform data. A spectrum correction filter is formed based on the acquired waveform data, and at least one of the waveform data, the micro-segments, and the superposed data is corrected using this filter. In this way, the “blur” of the speech spectrum caused by the window function applied to obtain the micro-segments is reduced, and speech synthesis with high sound quality is realized.
    Type: Application
    Filed: June 2, 2003
    Publication date: December 11, 2003
    Applicant: Canon Kabushiki Kaisha
    Inventors: Masayuki Yamada, Yasuhiro Komori, Toshiaki Fukada
  • Publication number: 20030158735
    Abstract: With this invention, an information processing apparatus which has an audio data playback function and a text-to-speech synthesis function allows the user to input an instruction with fewer operations and provides a fast-forward/fast-reverse function optimized for speech synthesis. During speech synthesis, an instruction input by a button operation is supplied to a speech synthesis unit; when playback of audio data is underway but speech synthesis is inactive, the instruction is supplied to an audio data playback unit. In the fast-forward mode, an abstract or the head parts of sentences are read aloud; in the fast-reverse mode, the head parts of sentences are read aloud. Given tones are also generated in correspondence with the skipped parts.
    Type: Application
    Filed: February 11, 2003
    Publication date: August 21, 2003
    Applicant: Canon Kabushiki Kaisha
    Inventors: Masayuki Yamada, Katsuhiko Kawasaki, Toshiaki Fukada, Yasuo Okutani
  • Publication number: 20020051955
    Abstract: A speech segment search unit searches a speech database for speech segments that satisfy a phonetic environment, and an HMM learning unit computes the HMMs of phonemes on the basis of the search result. A segment recognition unit performs segment recognition of the speech segments on the basis of the computed phoneme HMMs, and when the phoneme of the segment recognition result matches the phoneme of the source speech segment, that speech segment is registered in a segment dictionary.
    Type: Application
    Filed: March 29, 2001
    Publication date: May 2, 2002
    Inventors: Yasuo Okutani, Yasuhiro Komori, Toshiaki Fukada
  • Publication number: 20020049590
    Abstract: In a speech recording arrangement, a sentence to be recorded for speech recognition learning is presented to a user. Speech input by the user for the presented sentence is recognized to obtain a recognized character string. The speech pattern of the recognized character string is compared with the speech pattern of the presented sentence by DP matching to obtain a matching rate between them. If the matching rate exceeds a predetermined level, the input speech is recorded as learning data; if not, the unmatched portion between the recognized character string and the sentence to be recorded is presented to the user, who is then instructed to input the speech once again. With this arrangement, speech data with very few improperly pronounced words can be efficiently recorded.
    Type: Application
    Filed: October 15, 2001
    Publication date: April 25, 2002
    Inventors: Hiroaki Yoshino, Toshiaki Fukada
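The accept/reject decision in the abstract above can be sketched with a character-level alignment: an edit-distance dynamic program stands in for the DP matching of speech patterns, and the matching rate is one minus the normalized distance. The threshold value is illustrative.

```python
# Sketch of the recording check: align the recognized string against the
# prompted sentence (edit-distance DP stands in for DP matching of speech
# patterns) and accept the take if the match rate clears a threshold.

def match_rate(recognized, prompt):
    """1 - normalized edit distance between the two strings."""
    m, n = len(recognized), len(prompt)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if recognized[i - 1] == prompt[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution / match
    return 1.0 - d[m][n] / max(m, n)

def accept_take(recognized, prompt, threshold=0.8):
    """Record the take only if the matching rate exceeds the level."""
    return match_rate(recognized, prompt) >= threshold

print(accept_take("speech data", "speech data"))  # perfect match: accepted
print(accept_take("noise", "speech data"))        # poor match: re-record
```

A rejected take would then have its unmatched portion highlighted for the speaker, as the abstract describes, before prompting a re-recording.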