Patents by Inventor Toshiaki Fukada

Toshiaki Fukada has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7054814
    Abstract: A speech segment search unit searches a speech database for speech segments that satisfy a phonetic environment, and an HMM learning unit computes the HMMs of phonemes on the basis of the search result. A segment recognition unit performs segment recognition of the speech segments on the basis of the computed phoneme HMMs, and when the phoneme of the segment recognition result matches the phoneme of the source speech segment, that speech segment is registered in a segment dictionary.
    Type: Grant
    Filed: March 29, 2001
    Date of Patent: May 30, 2006
    Assignee: Canon Kabushiki Kaisha
    Inventors: Yasuo Okutani, Yasuhiro Komori, Toshiaki Fukada
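The registration criterion in the abstract above, keep a segment only if a model trained on the whole set recognizes it as its own phoneme, can be sketched as follows. This is a minimal illustration: a nearest-centroid classifier over one-dimensional features stands in for the phoneme HMMs, and all phoneme labels and feature values are hypothetical.

```python
# Sketch of the segment-filtering idea: train a per-phoneme model, then keep
# only segments whose recognized phoneme matches their source label.
# A nearest-centroid classifier stands in for the phoneme HMMs here.

def train_centroids(segments):
    """Compute one centroid feature value per phoneme label."""
    sums, counts = {}, {}
    for phoneme, feature in segments:
        sums[phoneme] = sums.get(phoneme, 0.0) + feature
        counts[phoneme] = counts.get(phoneme, 0) + 1
    return {p: sums[p] / counts[p] for p in sums}

def build_segment_dictionary(segments):
    """Register a segment only if it is recognized as its own phoneme."""
    centroids = train_centroids(segments)
    dictionary = []
    for phoneme, feature in segments:
        recognized = min(centroids, key=lambda p: abs(centroids[p] - feature))
        if recognized == phoneme:        # recognition agrees with the label
            dictionary.append((phoneme, feature))
    return dictionary

segments = [("a", 1.0), ("a", 1.2), ("i", 3.0), ("i", 3.1), ("a", 2.9)]
# The last "a" segment sits nearer the "i" centroid, so it is rejected.
print(build_segment_dictionary(segments))
```

An outlier segment (here the "a" at 2.9) is the kind of mislabeled or atypical unit the recognition check is meant to keep out of the dictionary.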
  • Publication number: 20060069566
    Abstract: A segment set is read before updating, and clustering that takes the phoneme environment into account is performed on it. For each cluster obtained by the clustering, a representative segment of the segments belonging to the cluster is generated. The segment set is then updated by replacing each segment belonging to a cluster with that cluster's representative segment.
    Type: Application
    Filed: September 14, 2005
    Publication date: March 30, 2006
    Applicant: Canon Kabushiki Kaisha
    Inventors: Toshiaki Fukada, Masayuki Yamada, Yasuhiro Komori
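The cluster-and-replace update in the abstract above amounts to the following sketch, under simplifying assumptions: the phoneme environment is a string key, features are one-dimensional, and the cluster mean serves as the representative segment. Keys and values are illustrative.

```python
# Sketch of segment-set compaction: cluster segments by phoneme environment,
# generate one representative per cluster (here the mean feature), and
# replace every cluster member with that representative.

def compact_segment_set(segments):
    """segments: list of (environment, feature). Returns the updated set."""
    clusters = {}
    for env, feature in segments:
        clusters.setdefault(env, []).append(feature)
    # representative segment = mean feature of the cluster
    reps = {env: sum(feats) / len(feats) for env, feats in clusters.items()}
    return [(env, reps[env]) for env, _ in segments]

segments = [("a-k", 1.0), ("a-k", 3.0), ("i-t", 5.0)]
print(compact_segment_set(segments))  # both "a-k" members collapse to 2.0
```

Replacing members with representatives shrinks the effective inventory while preserving coverage of each phoneme environment.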
  • Publication number: 20050288929
    Abstract: A speech recognition apparatus includes a word dictionary having recognition target words, a first acoustic model which expresses a reference pattern of a speech unit by one or more states, a second acoustic model which is lower in precision than said first acoustic model, selection means for selecting one of said first acoustic model and said second acoustic model on the basis of a parameter associated with a state of interest, and likelihood calculation means for calculating a likelihood of an acoustic feature parameter with respect to said acoustic model selected by said selection means.
    Type: Application
    Filed: June 24, 2005
    Publication date: December 29, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Hideo Kuboyama, Toshiaki Fukada, Yasuhiro Komori
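The per-state model selection described above can be sketched as follows. The selection parameter, which the abstract leaves abstract, is assumed here to be a state visit count; the "precise" and "coarse" models are single Gaussians differing only in variance. All names and thresholds are hypothetical.

```python
# Sketch of state-dependent acoustic-model selection: for each state, choose
# the high-precision or the low-precision model based on a per-state
# parameter (here an assumed visit count), then compute the likelihood of
# the feature with the chosen model.
import math

def gaussian_ll(x, mean, var):
    """Log likelihood of x under a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def score_state(feature, state):
    """Select a model for this state, then score the feature with it."""
    if state["visit_count"] >= 100:
        mean, var = state["precise"]    # first (high-precision) model
    else:
        mean, var = state["coarse"]     # second (low-precision) model
    return gaussian_ll(feature, mean, var)

hot = {"visit_count": 500, "precise": (0.0, 1.0), "coarse": (0.0, 4.0)}
cold = {"visit_count": 10, "precise": (0.0, 1.0), "coarse": (0.0, 4.0)}
# A feature near the mean scores higher under the tighter, precise model.
print(score_state(0.1, hot) > score_state(0.1, cold))
```

The point of the scheme is that cheap models can serve rarely visited states while precise models are reserved for states where accuracy matters.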
  • Publication number: 20050267747
    Abstract: In a system implementing image retrieval by performing speech recognition on voice information added to an image, the speech recognition is triggered by an event, such as an image upload event, that is not an explicit speech-recognition order event. The system obtains voice information added to an image, detects an event, and performs speech recognition on the obtained voice information in response to a specific event, even if the detected event is not an explicit speech-recognition order event.
    Type: Application
    Filed: May 23, 2005
    Publication date: December 1, 2005
    Applicant: Canon Kabushiki Kaisha
    Inventors: Kenichiro Nakagawa, Makoto Hirota, Hiromi Ikeda, Tsuyoshi Yagisawa, Hiroki Yamamoto, Toshiaki Fukada, Yasuhiro Komori
  • Publication number: 20050216261
    Abstract: A signal processing apparatus and method for performing robust endpoint detection of a signal are provided. An input signal sequence is divided into frames, each of which has a predetermined time length, and the presence of the signal in each frame is detected. A filter process that smooths the detection result for the current frame by using the detection results for past frames is then applied. The filter output is compared with a predetermined threshold value, and the state of the signal sequence in the current frame is determined on the basis of the comparison result.
    Type: Application
    Filed: March 18, 2005
    Publication date: September 29, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Philip Garner, Toshiaki Fukada, Yasuhiro Komori
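The smoothed endpoint decision above can be illustrated with a simple exponential filter over raw per-frame detections; the filter constant and threshold are illustrative, not taken from the patent.

```python
# Sketch of the smoothed endpoint decision: a raw per-frame speech/non-speech
# detection is filtered with the past frames' results before thresholding,
# so an isolated glitch frame does not flip the state.

def detect_states(frame_scores, alpha=0.5, threshold=0.5):
    """Exponentially smooth raw detections, then threshold each frame."""
    smoothed, states = 0.0, []
    for score in frame_scores:
        raw = 1.0 if score > 0 else 0.0                  # presence in this frame
        smoothed = alpha * smoothed + (1 - alpha) * raw  # filter with past results
        states.append("speech" if smoothed > threshold else "silence")
    return states

# A single spurious positive frame inside silence does not trigger "speech".
print(detect_states([-1, -1, 2, -1, -1]))
```

Sustained positive frames, by contrast, drive the smoothed value over the threshold within a couple of frames, which is the robustness the abstract claims.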
  • Publication number: 20050209855
    Abstract: A speech segment search unit searches a speech database for speech segments that satisfy a phonetic environment, and an HMM learning unit computes the HMMs of phonemes on the basis of the search result. A segment recognition unit performs segment recognition of the speech segments on the basis of the computed phoneme HMMs, and when the phoneme of the segment recognition result matches the phoneme of the source speech segment, that speech segment is registered in a segment dictionary.
    Type: Application
    Filed: May 11, 2005
    Publication date: September 22, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Yasuo Okutani, Yasuhiro Komori, Toshiaki Fukada
  • Publication number: 20050131699
    Abstract: A speech recognition apparatus and method of this invention manage the previously input frequencies of occurrence of the geographical names to be recognized (202). Using a table (114) that describes the correspondence between the geographical names and their positions, the probability of occurrence of each geographical name of interest is updated on the basis of its own frequency of occurrence and the frequencies of geographical names located within a predetermined region including its position. This update process is performed for each geographical name to be recognized (203).
    Type: Application
    Filed: December 8, 2004
    Publication date: June 16, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventor: Toshiaki Fukada
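The neighborhood-weighted update above can be sketched as follows. The table of names and positions, the radius, and the scoring rule (summing the frequencies of all names within the radius, then normalizing) are illustrative assumptions, not the patent's exact formula.

```python
# Sketch of the neighborhood-weighted update: each place name's probability
# of occurrence reflects not only its own past frequency but also the
# frequencies of names whose positions fall within a fixed radius of it.

def update_probabilities(freq, positions, radius=1.5):
    """freq: name -> count; positions: name -> (x, y). Returns name -> prob."""
    def near(a, b):
        (ax, ay), (bx, by) = positions[a], positions[b]
        return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5 <= radius
    # score = own frequency plus frequencies of names in the surrounding region
    scores = {n: sum(freq[m] for m in freq if near(n, m)) for n in freq}
    total = sum(scores.values())
    return {n: s / total for n, s in scores.items()}

freq = {"Shinjuku": 4, "Shibuya": 2, "Sapporo": 1}
positions = {"Shinjuku": (0, 0), "Shibuya": (1, 0), "Sapporo": (10, 10)}
probs = update_probabilities(freq, positions)
# Shibuya benefits from neighboring Shinjuku's high frequency.
print(probs["Shibuya"] > probs["Sapporo"])
```

The effect is that a rarely spoken name near a frequently spoken one still gets a boosted prior, which matches how users tend to query nearby places.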
  • Publication number: 20050131689
    Abstract: Robust signal detection against various types of background noise is implemented. According to a signal detection apparatus and method of this invention, the feature amount of an input signal sequence and the feature amount of a noise component contained in the sequence are extracted. A first likelihood, indicating the probability that the signal sequence is detected, and a second likelihood, indicating the probability that the noise component is detected, are then calculated on the basis of a predetermined signal-to-noise ratio and the extracted feature amount of the signal sequence. A likelihood ratio between the first and second likelihoods is calculated, and detection of the signal sequence is determined on the basis of this ratio.
    Type: Application
    Filed: December 9, 2004
    Publication date: June 16, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Philip Garner, Toshiaki Fukada, Yasuhiro Komori
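The likelihood-ratio test above can be sketched with single Gaussians: the noise model comes from the observed noise statistics, the signal model places its mean at the level implied by the assumed signal-to-noise ratio, and a frame is detected when the ratio exceeds one. The models, the SNR value, and the unity threshold are illustrative assumptions.

```python
# Sketch of likelihood-ratio signal detection: compare the likelihood that a
# frame's feature comes from signal (at an assumed SNR above the noise floor)
# with the likelihood that it is noise; detect when the ratio exceeds 1.
import math

def gaussian(x, mean, var):
    """Density of x under a 1-D Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def detect(feature, noise_mean, noise_var, snr_db=10.0):
    """Return True if the frame is more likely signal than noise."""
    signal_mean = noise_mean * 10 ** (snr_db / 20)  # assumed SNR sets signal level
    l_signal = gaussian(feature, signal_mean, noise_var)  # first likelihood
    l_noise = gaussian(feature, noise_mean, noise_var)    # second likelihood
    return l_signal / l_noise > 1.0

print(detect(3.0, noise_mean=1.0, noise_var=1.0))  # loud frame: likely signal
print(detect(1.1, noise_mean=1.0, noise_var=1.0))  # near noise floor: not detected
```

Using a ratio rather than a single fixed energy threshold is what makes the decision track the noise statistics, which is the robustness the abstract claims.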
  • Publication number: 20050102139
    Abstract: In order to associate image data with speech data, a character detection unit detects a text region in the image data, and a character recognition unit recognizes characters in the text region. A speech detection unit detects a speech period in the speech data, and a speech recognition unit recognizes speech in that period. An image-and-speech associating unit associates the characters with the speech by performing at least one of character string matching and phonetic string matching between the recognized characters and the recognized speech. A portion of the image data and a portion of the speech data can therefore be associated with each other.
    Type: Application
    Filed: November 5, 2004
    Publication date: May 12, 2005
    Applicant: Canon Kabushiki Kaisha
    Inventor: Toshiaki Fukada
  • Publication number: 20050097439
    Abstract: In an information processing apparatus or method for presenting multimedia data, a storage unit holds an object contained in an image, such as a picture, characters, or symbols, together with sound data associated with the object. Metadata of the object is referred to, and an output parameter of the associated sound data is determined based on the metadata. A sound output unit then outputs the sound data at a sound volume or the like based on the output parameter.
    Type: Application
    Filed: October 12, 2004
    Publication date: May 5, 2005
    Applicant: Canon Kabushiki Kaisha
    Inventors: Hiromi Ikeda, Tsuyoshi Yagisawa, Toshiaki Fukada
  • Publication number: 20050089017
    Abstract: The present invention aims to improve user-friendliness in generating practical means of data access. To achieve this object, the present invention provides a data processing method of registering a path for data access and link data for the path. The method comprises: a generation step in which the link data candidate generation unit 202 generates a link data candidate based on a file accessed from the path input for data access; a display step in which the link data candidate exhibiting unit 203 displays the generated link data candidate; a recognition step in which the link data selection unit 204 recognizes the link data selected from the displayed candidates; and a registration step in which the link data registration unit 205 registers the recognized link data in association with the path of the accessed file.
    Type: Application
    Filed: October 25, 2004
    Publication date: April 28, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Toshiaki Fukada, Yasuhiro Komori
  • Publication number: 20050065795
    Abstract: In a voice synthesis apparatus, a desired range of the input text to be output is bounded by, e.g., a start tag “<morphing type=“emotion” start=“happy” end=“angry”>” and an end tag “</morphing>”, and a feature of the synthetic voice is changed continuously as it is output, gradually shifting from a happy voice to an angry voice.
    Type: Application
    Filed: August 10, 2004
    Publication date: March 24, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Masahiro Mutsuno, Toshiaki Fukada
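The gradual voice change described above is, at its core, an interpolation of voice features between the start and end emotions across the tagged span. The sketch below shows a linear interpolation of a single hypothetical feature (e.g., a pitch value in Hz); the actual patent may use different features and a different trajectory.

```python
# Sketch of emotion morphing: linearly interpolate a voice feature from its
# "start" emotion value to its "end" emotion value across the tagged span.

def morph(start_feat, end_feat, n_frames):
    """Return n_frames feature values moving linearly from start to end."""
    return [start_feat + (end_feat - start_feat) * i / (n_frames - 1)
            for i in range(n_frames)]

# e.g., a pitch-like feature shifting from a "happy" to an "angry" setting
print(morph(100.0, 200.0, 5))  # [100.0, 125.0, 150.0, 175.0, 200.0]
```

Each interpolated value would drive one stretch of synthesis, so the listener hears a continuous shift rather than an abrupt switch between voices.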
  • Publication number: 20050055207
    Abstract: A speech information processing apparatus synthesizes speech with natural intonation by modeling the time change in the fundamental frequency of a predetermined phoneme unit. When a predetermined unit of phonological series is input, the fundamental frequencies of the respective phonemes constructing the series are generated based on a segment pitch pattern model, and the phonemes are synthesized based on the generated fundamental frequencies.
    Type: Application
    Filed: October 18, 2004
    Publication date: March 10, 2005
    Applicant: CANON KABUSHIKI KAISHA
    Inventor: Toshiaki Fukada
  • Patent number: 6826531
    Abstract: A speech information processing apparatus synthesizes speech with natural intonation by modeling the time change in the fundamental frequency of a predetermined phoneme unit. When a predetermined unit of phonological series is input, the fundamental frequencies of the respective phonemes constructing the series are generated based on a segment pitch pattern model, and the phonemes are synthesized based on the generated fundamental frequencies.
    Type: Grant
    Filed: March 28, 2001
    Date of Patent: November 30, 2004
    Assignee: Canon Kabushiki Kaisha
    Inventor: Toshiaki Fukada
  • Publication number: 20040215459
    Abstract: A speech information processing apparatus which sets the duration of a phonological series with accuracy, and sets natural phoneme durations in accordance with the phonemic/linguistic environment. For this purpose, the duration of a predetermined unit of phonological series is obtained based on a duration model for the entire segment. The duration of each phoneme constructing the series is then obtained based on a duration model for a partial segment (S303). Finally, the duration of each phoneme is set based on the duration of the phonological series and the duration of each phoneme.
    Type: Application
    Filed: May 25, 2004
    Publication date: October 28, 2004
    Applicant: CANON KABUSHIKI KAISHA
    Inventor: Toshiaki Fukada
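The two-level duration assignment above combines an entire-segment prediction with per-phoneme predictions. A natural way to combine them, shown below as an assumption rather than the patent's exact formula, is to rescale the per-phoneme durations so they sum to the series-level duration; the model outputs are illustrative constants.

```python
# Sketch of two-level duration assignment: one model gives the duration of
# the whole phonological series, another gives a duration per phoneme, and
# the per-phoneme values are rescaled to sum to the series duration.

def set_durations(phoneme_durations, series_duration):
    """Scale per-phoneme durations to match the series-level duration."""
    scale = series_duration / sum(phoneme_durations)
    return [d * scale for d in phoneme_durations]

per_phoneme = [80.0, 120.0, 100.0]   # ms, from the partial-segment model
series = 450.0                       # ms, from the entire-segment model
print(set_durations(per_phoneme, series))  # [120.0, 180.0, 150.0], sums to 450
```

The series-level model anchors the overall tempo while the per-phoneme model preserves the relative durations dictated by the phonemic environment.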
  • Patent number: 6778960
    Abstract: A speech information processing apparatus which sets the duration of a phonological series with accuracy, and sets natural phoneme durations in accordance with the phonemic/linguistic environment. For this purpose, the duration of a predetermined unit of phonological series is obtained based on a duration model for the entire segment. The duration of each phoneme constructing the series is then obtained based on a duration model for a partial segment. Finally, the duration of each phoneme is set based on the duration of the phonological series and the duration of each phoneme.
    Type: Grant
    Filed: March 28, 2001
    Date of Patent: August 17, 2004
    Assignee: Canon Kabushiki Kaisha
    Inventor: Toshiaki Fukada
  • Publication number: 20030229496
    Abstract: In a speech synthesis process, micro-segments are cut from acquired waveform data using a window function. The obtained micro-segments are re-arranged to implement a desired prosody, and superposed data is generated by superposing the re-arranged micro-segments to obtain synthetic speech waveform data. A spectrum correction filter is formed based on the acquired waveform data, and at least one of the waveform data, the micro-segments, and the superposed data is corrected using this filter. In this way, the “blur” of the speech spectrum caused by the window function applied to obtain the micro-segments is reduced, and speech synthesis with high sound quality is realized.
    Type: Application
    Filed: June 2, 2003
    Publication date: December 11, 2003
    Applicant: Canon Kabushiki Kaisha
    Inventors: Masayuki Yamada, Yasuhiro Komori, Toshiaki Fukada
  • Publication number: 20030158735
    Abstract: With this invention, an information processing apparatus which has an audio data playback function and a text-to-speech synthesis function allows the user to input an instruction with fewer operations and provides a fast-forward/fast-reverse function optimized for speech synthesis. During speech synthesis, an instruction input by a button operation is supplied to a speech synthesis unit; when playback of audio data is underway but speech synthesis is inactive, the instruction is supplied to an audio data playback unit. In the fast-forward mode, an abstract or the head parts of sentences are read aloud; in the fast-reverse mode, the head parts of sentences are read aloud. Given tones are also generated in correspondence with the skipped parts.
    Type: Application
    Filed: February 11, 2003
    Publication date: August 21, 2003
    Applicant: Canon Kabushiki Kaisha
    Inventors: Masayuki Yamada, Katsuhiko Kawasaki, Toshiaki Fukada, Yasuo Okutani
  • Publication number: 20020051955
    Abstract: A speech segment search unit searches a speech database for speech segments that satisfy a phonetic environment, and an HMM learning unit computes the HMMs of phonemes on the basis of the search result. A segment recognition unit performs segment recognition of the speech segments on the basis of the computed phoneme HMMs, and when the phoneme of the segment recognition result matches the phoneme of the source speech segment, that speech segment is registered in a segment dictionary.
    Type: Application
    Filed: March 29, 2001
    Publication date: May 2, 2002
    Inventors: Yasuo Okutani, Yasuhiro Komori, Toshiaki Fukada
  • Publication number: 20020049590
    Abstract: In a speech recording arrangement, a sentence to be recorded for speech recognition learning is presented to a user. Speech input by the user for the presented sentence is recognized to obtain a recognized character string. The speech pattern of the recognized character string is compared with the speech pattern of the presented sentence by DP matching to obtain a matching rate between them. If the matching rate exceeds a predetermined level, the input speech is recorded as learning data; if not, the unmatched portion between the recognized character string and the sentence to be recorded is presented to the user, who is then instructed to input the speech once again. With this arrangement, speech data with very few improperly pronounced words can be efficiently recorded.
    Type: Application
    Filed: October 15, 2001
    Publication date: April 25, 2002
    Inventors: Hiroaki Yoshino, Toshiaki Fukada
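The accept/reject decision in the abstract above can be sketched with a character-level alignment: an edit-distance dynamic program stands in for the DP matching of speech patterns, and the matching rate is one minus the normalized distance. The threshold value is illustrative.

```python
# Sketch of the recording check: align the recognized string against the
# prompted sentence (edit-distance DP stands in for DP matching of speech
# patterns) and accept the take if the match rate clears a threshold.

def match_rate(recognized, prompt):
    """1 - normalized edit distance between the two strings."""
    m, n = len(recognized), len(prompt)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if recognized[i - 1] == prompt[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution / match
    return 1.0 - d[m][n] / max(m, n)

def accept_take(recognized, prompt, threshold=0.8):
    """Record the take only if the matching rate exceeds the level."""
    return match_rate(recognized, prompt) >= threshold

print(accept_take("speech data", "speech data"))  # perfect match: accepted
print(accept_take("noise", "speech data"))        # poor match: re-record
```

A rejected take would then have its unmatched portion highlighted for the speaker, as the abstract describes, before prompting a re-recording.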