Patents by Inventor Yasuo Okutani

Yasuo Okutani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7039588
    Abstract: Input text data undergoes language analysis to generate prosody, and a speech database is searched for a synthesis unit on the basis of the prosody. A modification distortion of the found synthesis unit, and the concatenation distortions incurred upon connecting that synthesis unit to candidate units of the preceding phoneme, are computed, and a distortion determination unit weights the modification and concatenation distortions to determine the total distortion. An N-best determination unit obtains the N best paths that minimize the distortion using the A* search algorithm, and a registration unit determination unit selects, on the basis of the N best paths and in order of frequency of occurrence, a synthesis unit to be registered in a synthesis unit inventory, and registers it in the synthesis unit inventory.
    Type: Grant
    Filed: August 30, 2004
    Date of Patent: May 2, 2006
    Assignee: Canon Kabushiki Kaisha
    Inventors: Yasuo Okutani, Yasuhiro Komori
  • Publication number: 20060085194
    Abstract: Input text data undergoes language analysis to generate prosody, and a speech database is searched for a synthesis unit on the basis of the prosody. A modification distortion of the found synthesis unit, and the concatenation distortions incurred upon connecting that synthesis unit to candidate units of the preceding phoneme, are computed, and a distortion determination unit weights the modification and concatenation distortions to determine the total distortion. An N-best determination unit obtains the N best paths that minimize the distortion using the A* search algorithm, and a registration unit determination unit selects, on the basis of the N best paths and in order of frequency of occurrence, a synthesis unit to be registered in a synthesis unit inventory, and registers it in the synthesis unit inventory.
    Type: Application
    Filed: December 7, 2005
    Publication date: April 20, 2006
    Applicant: Canon Kabushiki Kaisha
    Inventors: Yasuo Okutani, Yasuhiro Komori
  • Patent number: 7031919
    Abstract: A speech synthesizing apparatus for synthesizing a speech waveform stores speech data, which is obtained by adding attribute information onto phoneme data, in a database. In accordance with prescribed retrieval conditions, a phoneme retrieval unit retrieves phoneme data from the speech data that has been stored in the database and retains the retrieved results in a retrieved-result storage area. A processing unit for assigning a power penalty and a processing unit for assigning a phoneme-duration penalty assign the penalties, on the basis of power and phoneme duration constituting the attribute information, to a set of phoneme data stored in the retrieved-result storage area. A processing unit for determining typical phoneme data performs sorting on the basis of the assigned penalties and, based upon the stored results, selects phoneme data to be employed in the synthesis of a speech waveform.
    Type: Grant
    Filed: August 30, 1999
    Date of Patent: April 18, 2006
    Assignee: Canon Kabushiki Kaisha
    Inventors: Yasuo Okutani, Masayuki Yamada
  • Publication number: 20060047508
    Abstract: A speech processing apparatus and method for restricting use of generated speech data for purposes other than a particular purpose. An adder adds predetermined audio data, which lies within the audio-frequency band but outside a predetermined frequency band, to input speech data. A band limiting filter restricts the speech data to which the predetermined audio data has been added to the predetermined frequency band. A communication device transmits the speech data that has been limited to the predetermined frequency band by the band limiting filter.
    Type: Application
    Filed: August 24, 2005
    Publication date: March 2, 2006
    Inventor: Yasuo Okutani
  • Publication number: 20060031072
    Abstract: An electronic dictionary apparatus and its control method are provided. A database contains entry words and advanced phonetic information corresponding to each entry word. A dictionary search section searches the database using an entry word specified by a user as a search key and acquires the advanced phonetic information corresponding to the entry word. A display section displays simple phonetic information generated based on the acquired advanced phonetic information. A speech output section performs speech synthesis based on the acquired advanced phonetic information and outputs the synthesized speech.
    Type: Application
    Filed: August 4, 2005
    Publication date: February 9, 2006
    Inventors: Yasuo Okutani, Michio Aizawa
  • Patent number: 6980955
    Abstract: Input text data undergoes language analysis to generate prosody, and a speech database is searched for a synthesis unit on the basis of the prosody. A modification distortion of the found synthesis unit, and the concatenation distortions incurred upon connecting that synthesis unit to candidate units of the preceding phoneme, are computed, and a distortion determination unit weights the modification and concatenation distortions to determine the total distortion. An N-best determination unit obtains the N best paths that minimize the distortion using the A* search algorithm, and a registration unit determination unit selects, on the basis of the N best paths and in order of frequency of occurrence, a synthesis unit to be registered in a synthesis unit inventory, and registers it in the synthesis unit inventory.
    Type: Grant
    Filed: March 28, 2001
    Date of Patent: December 27, 2005
    Assignee: Canon Kabushiki Kaisha
    Inventors: Yasuo Okutani, Yasuhiro Komori
  • Publication number: 20050209855
    Abstract: A speech segment search unit searches a speech database for speech segments that satisfy a phonetic environment, and an HMM learning unit computes the HMMs of phonemes on the basis of the search result. A segment recognition unit performs segment recognition of speech segments on the basis of the computed HMMs of the phonemes, and when the phoneme of the segment recognition result matches the phoneme of the source speech segment, that speech segment is registered in a segment dictionary.
    Type: Application
    Filed: May 11, 2005
    Publication date: September 22, 2005
    Applicant: Canon Kabushiki Kaisha
    Inventors: Yasuo Okutani, Yasuhiro Komori, Toshiaki Fukada
  • Publication number: 20050191036
    Abstract: In cases where at least one item of sound information has been associated with at least one image, at least one desired item of sound information is selected and the sound information is played back in a prescribed order. Accordingly, in an information processing apparatus, a playback sequence decision unit (103) reads in image data, as well as sound data that has been assigned within the image data, from an image/sound data storage unit (107), generates a still image in which the positions at which sound data has been recorded are denoted on the image, and displays the generated still image on an image display unit (106). A sound data specifying unit (102) searches the image/sound data storage unit (107) for sound data that has been associated with the interior of an image area specified by an input from a user. When applicable sound data is found to exist, the playback sequence decision unit (103) decides the order in which the applicable sound data is to be played back.
    Type: Application
    Filed: February 7, 2005
    Publication date: September 1, 2005
    Applicant: Canon Kabushiki Kaisha
    Inventors: Yasuo Okutani, Yasuhiro Komori
  • Publication number: 20050027532
    Abstract: Input text data undergoes language analysis to generate prosody, and a speech database is searched for a synthesis unit on the basis of the prosody. A modification distortion of the found synthesis unit, and the concatenation distortions incurred upon connecting that synthesis unit to candidate units of the preceding phoneme, are computed, and a distortion determination unit weights the modification and concatenation distortions to determine the total distortion. An N-best determination unit obtains the N best paths that minimize the distortion using the A* search algorithm, and a registration unit determination unit selects, on the basis of the N best paths and in order of frequency of occurrence, a synthesis unit to be registered in a synthesis unit inventory, and registers it in the synthesis unit inventory.
    Type: Application
    Filed: August 30, 2004
    Publication date: February 3, 2005
    Applicant: Canon Kabushiki Kaisha
    Inventors: Yasuo Okutani, Yasuhiro Komori
  • Publication number: 20040093209
    Abstract: A data input device for inputting numeric data by voice includes a range prediction part, a history holding part, a speech recognition part, a recognition result holding part, a comparison part, a presentation part, and a result storing part. The range prediction part estimates a range of a value expected to be input on the basis of meter-reading history data held in the history holding part. The speech recognition part recognizes speech representing a meter reading and stores the recognition result in the recognition result holding part. The comparison part determines whether or not the meter reading for this month represented by the data stored in the recognition result holding part is within the prediction range. If the meter reading for this month is within the prediction range, the presentation part presents the recognition result to a user, and the speech recognition result is stored in the result storing part.
    Type: Application
    Filed: October 20, 2003
    Publication date: May 13, 2004
    Applicant: Canon Kabushiki Kaisha
    Inventor: Yasuo Okutani
  • Publication number: 20040088165
    Abstract: Even with a copying machine that has a voice guidance function, visually impaired persons still risk wastefully copying the wrong documents or documents with missing pages. To address this, a document image is read, character strings on the read document image are recognized, a character string indicating the contents of the document is chosen from the recognized character strings, the chosen character string is converted into speech, and the synthetic speech is output.
    Type: Application
    Filed: July 28, 2003
    Publication date: May 6, 2004
    Applicant: Canon Kabushiki Kaisha
    Inventors: Yasuo Okutani, Tetsuo Kosaka
  • Publication number: 20030158735
    Abstract: With this invention, an information processing apparatus which has an audio data playback function and a text-to-speech synthesis function allows the user to input an instruction with fewer operations and provides a fast-forward/fast-reverse function suited to speech synthesis. During speech synthesis, an instruction input by a button operation is supplied to a speech synthesis unit. When playback of audio data is underway but speech synthesis is inactive, an instruction input by a button operation is supplied to an audio data playback unit. In fast-forward mode, an abstract is read aloud or the head parts of sentences are read aloud. In fast-reverse mode, the head parts of sentences are read aloud. Also, given tones are generated in correspondence with skipped parts.
    Type: Application
    Filed: February 11, 2003
    Publication date: August 21, 2003
    Applicant: Canon Kabushiki Kaisha
    Inventors: Masayuki Yamada, Katsuhiko Kawasaki, Toshiaki Fukada, Yasuo Okutani
  • Publication number: 20030125949
    Abstract: A speech synthesizing apparatus for synthesizing a speech waveform stores speech data, which is obtained by adding attribute information onto phoneme data, in a database. In accordance with prescribed retrieval conditions, a phoneme retrieval unit retrieves phoneme data from the speech data that has been stored in the database and retains the retrieved results in a retrieved-result storage area. A processing unit for assigning a power penalty and a processing unit for assigning a phoneme-duration penalty assign the penalties, on the basis of power and phoneme duration constituting the attribute information, to a set of phoneme data stored in the retrieved-result storage area. A processing unit for determining typical phoneme data performs sorting on the basis of the assigned penalties and, based upon the stored results, selects phoneme data to be employed in the synthesis of a speech waveform.
    Type: Application
    Filed: August 30, 1999
    Publication date: July 3, 2003
    Inventors: Yasuo Okutani, Masayuki Yamada
  • Publication number: 20020051955
    Abstract: A speech segment search unit searches a speech database for speech segments that satisfy a phonetic environment, and an HMM learning unit computes the HMMs of phonemes on the basis of the search result. A segment recognition unit performs segment recognition of speech segments on the basis of the computed HMMs of the phonemes, and when the phoneme of the segment recognition result matches the phoneme of the source speech segment, that speech segment is registered in a segment dictionary.
    Type: Application
    Filed: March 29, 2001
    Publication date: May 2, 2002
    Inventors: Yasuo Okutani, Yasuhiro Komori, Toshiaki Fukada
  • Publication number: 20010047259
    Abstract: Input text data undergoes language analysis to generate prosody, and a speech database is searched for a synthesis unit on the basis of the prosody. A modification distortion of the found synthesis unit, and the concatenation distortions incurred upon connecting that synthesis unit to candidate units of the preceding phoneme, are computed, and a distortion determination unit weights the modification and concatenation distortions to determine the total distortion. An N-best determination unit obtains the N best paths that minimize the distortion using the A* search algorithm, and a registration unit determination unit selects, on the basis of the N best paths and in order of frequency of occurrence, a synthesis unit to be registered in a synthesis unit inventory, and registers it in the synthesis unit inventory.
    Type: Application
    Filed: March 28, 2001
    Publication date: November 29, 2001
    Inventors: Yasuo Okutani, Yasuhiro Komori
  • Publication number: 20010032079
    Abstract: An object of the present invention is to suppress degradation of quality in speech synthesis by selecting synthesis units so as to minimize the distortion caused by concatenation distortions and modification distortions. For that purpose, speech synthesis is performed by extracting a plurality of synthesis units corresponding to a phoneme environment from a synthesis-unit holding unit that holds a plurality of synthesis units corresponding to a predetermined prosody environment, calculating a distortion for each of the extracted synthesis units, obtaining the minimum distortion within a predetermined interval determined based on the prosody environment, selecting a series of synthesis units providing a minimum-distortion path, and modifying and concatenating the synthesis units.
    Type: Application
    Filed: March 28, 2001
    Publication date: October 18, 2001
    Inventors: Yasuo Okutani, Yasuhiro Komori
  • Patent number: 6021388
    Abstract: A speech synthesis apparatus for outputting synthesized speech on the basis of a parameter sequence of a speech waveform includes a parameter generation unit which generates a parameter sequence for speech synthesis on the basis of a character sequence input by a character sequence input unit, and stores the generated parameter sequence in a parameter storage unit. A waveform generation unit is also provided that generates pitch waveforms each for one pitch period on the basis of synthesis parameters and pitch scales included in the parameter sequence, and generates a speech waveform by connecting the generated pitch waveforms in accordance with frame lengths set by a frame length setting unit.
    Type: Grant
    Filed: December 19, 1997
    Date of Patent: February 1, 2000
    Assignee: Canon Kabushiki Kaisha
    Inventors: Mitsuru Otsuka, Yasunori Ohora, Takashi Aso, Yasuo Okutani
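The unit-selection abstracts above (patents 7039588 and 6980955) describe weighting modification and concatenation distortions and finding minimum-distortion paths through candidate synthesis units. A minimal sketch of that idea follows; the weights, cost functions, and a plain dynamic-programming pass are all invented for illustration (the patents use an A* N-best search, not this simplified single-best search):

```python
# Sketch of weighted-distortion unit selection (hypothetical weights).
W_MOD, W_CONCAT = 1.0, 0.5

def best_path(candidates, mod_cost, concat_cost):
    """candidates: one list of candidate units per target phoneme.
    Returns the sequence of units minimizing the weighted sum of
    modification costs (per unit) and concatenation costs (per join)."""
    # cost[i][j] = minimum total distortion ending at candidate j of phoneme i
    cost = [[W_MOD * mod_cost(u) for u in candidates[0]]]
    back = []
    for i in range(1, len(candidates)):
        row, brow = [], []
        for u in candidates[i]:
            prev = min(
                range(len(candidates[i - 1])),
                key=lambda j: cost[-1][j]
                + W_CONCAT * concat_cost(candidates[i - 1][j], u),
            )
            row.append(cost[-1][prev]
                       + W_CONCAT * concat_cost(candidates[i - 1][prev], u)
                       + W_MOD * mod_cost(u))
            brow.append(prev)
        cost.append(row)
        back.append(brow)
    # backtrack from the cheapest final candidate
    j = min(range(len(cost[-1])), key=lambda k: cost[-1][k])
    path = [j]
    for brow in reversed(back):
        j = brow[j]
        path.append(j)
    path.reverse()
    return [candidates[i][j] for i, j in enumerate(path)]
```

The patented method additionally keeps the N best paths and registers units in the inventory in order of their frequency of occurrence across those paths; that bookkeeping is omitted here.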
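Patent 7031919 describes assigning power and phoneme-duration penalties to retrieved phoneme data and sorting by penalty to pick typical instances. A rough sketch, where the penalty form (absolute deviation from the set's means) and the weights are assumptions, not the patent's actual formulas:

```python
def typical_phoneme(cands, w_pow=1.0, w_dur=1.0):
    """cands: (power, duration) pairs for one phoneme's retrieved instances.
    Penalizes deviation from the set's mean power and mean duration,
    sorts by total penalty, and returns the lowest-penalty instance."""
    mean_p = sum(p for p, _ in cands) / len(cands)
    mean_d = sum(d for _, d in cands) / len(cands)
    ranked = sorted(
        cands,
        key=lambda c: w_pow * abs(c[0] - mean_p) + w_dur * abs(c[1] - mean_d),
    )
    return ranked[0]
```

An outlier instance (e.g. one recorded unusually loud or long) lands at the bottom of the ranking, so the "typical" selection is robust to a few bad recordings.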
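Publication 20060047508 describes adding audio data outside a predetermined frequency band and then band-limiting before transmission, so the added marker survives only in the unrestricted copy. An illustrative sketch, in which the sample rate, marker frequency, and a moving-average stand-in for the band limiting filter are all invented:

```python
import math

RATE = 16000       # sample rate in Hz (assumed)
MARK_FREQ = 7000   # marker tone near the top of the band (assumed)

def add_marker(samples, amp=0.05):
    """Mix a low-amplitude marker tone into the speech samples."""
    return [s + amp * math.sin(2 * math.pi * MARK_FREQ * n / RATE)
            for n, s in enumerate(samples)]

def band_limit(samples, taps=8):
    """Crude low-pass (moving average) standing in for the patent's
    band limiting filter; it strongly attenuates the high marker tone."""
    out = []
    for n in range(len(samples)):
        win = samples[max(0, n - taps + 1): n + 1]
        out.append(sum(win) / len(win))
    return out
```

Running silence through `add_marker` and then `band_limit` shows the marker largely removed from the band-limited copy, matching the abstract's intent that the transmitted, band-limited speech no longer carries the full-band data.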
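Publication 20040093209 describes predicting an expected range from meter-reading history and accepting a recognized value only if it falls inside that range. A minimal sketch of the range check, with an invented slack factor (the patent does not specify how the prediction range is computed):

```python
def predicted_range(history, slack=0.5):
    """history: past monthly consumption deltas.
    Returns a (low, high) interval around their average; the 50% slack
    is a placeholder, not the patent's actual prediction rule."""
    avg = sum(history) / len(history)
    return (avg * (1 - slack), avg * (1 + slack))

def accept_reading(last_reading, recognized, history):
    """Accept the speech-recognized meter value only if this month's
    consumption (recognized - last_reading) lies in the predicted range."""
    lo, hi = predicted_range(history)
    return lo <= recognized - last_reading <= hi
```

In the apparatus described above, an accepted result is presented to the user and stored; a rejected one would prompt re-entry, catching gross recognition errors before they reach the billing record.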