Patents by Inventor Nobuhide Yamazaki

Nobuhide Yamazaki has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7984076
    Abstract: The text format of input data is checked, and is converted into a system-manipulated format. It is further determined if the input data is in an HTML or e-mail format using tags, heading information, and the like. The converted data is divided into blocks in a simple manner such that elements in the blocks can be checked based on repetition of predetermined character patterns. Each block section is tagged with a tag indicating a block. The data divided into blocks is parsed based on tags, character patterns, etc., and is structured. A table in text is also parsed, and is segmented into cells. Finally, tree-structured data having a hierarchical structure is generated based on the sentence-structured data. A sentence-extraction template paired with the tree-structured data is used to extract sentences.
    Type: Grant
    Filed: December 28, 2007
    Date of Patent: July 19, 2011
    Assignee: Sony Corporation
    Inventors: Kenichiro Kobayashi, Makoto Akabane, Tomoaki Nitta, Nobuhide Yamazaki, Erika Kobayashi
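
The block division and tree structuring described in this abstract (shared by the related patents and applications below) can be sketched in a few lines. This is a minimal illustration, assuming blank lines and separator runs as block boundaries and numbered headings as section roots; the regular expressions, tags, and dictionary layout are invented here, not taken from the patent.

```python
import re

SEPARATOR = re.compile(r"^\s*([-=*_])\1{3,}\s*$")     # e.g. "-----", "====="
HEADING = re.compile(r"^\s*\d+(\.\d+)*[.)]?\s+\S")    # e.g. "1. ", "2.3) "

def divide_into_blocks(text: str) -> list[dict]:
    """Divide text into tagged blocks on blank lines and separator runs."""
    blocks, current = [], []
    for line in text.splitlines():
        if not line.strip() or SEPARATOR.match(line):
            if current:
                blocks.append({"tag": "block", "lines": current})
                current = []
        else:
            current.append(line)
    if current:
        blocks.append({"tag": "block", "lines": current})
    return blocks

def build_tree(blocks: list[dict]) -> dict:
    """Nest blocks under the most recent numbered heading to form a tree."""
    root = {"tag": "root", "children": []}
    node = root
    for block in blocks:
        if HEADING.match(block["lines"][0]):
            node = {"tag": "section", "title": block["lines"][0], "children": []}
            root["children"].append(node)
        else:
            node["children"].append(block)
    return root

if __name__ == "__main__":
    sample = "1. Overview\n\nFirst paragraph.\n\n-----\n\n2. Details\n\nSecond paragraph."
    print(build_tree(divide_into_blocks(sample)))
```
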
  • Patent number: 7765103
    Abstract: A rule-based speech synthesis apparatus by which concatenation distortion may be kept below a preset value without dependency on the utterance, wherein a parameter correction unit reads out a target parameter for a vowel from a target parameter storage, responsive to the phonemes at the leading and trailing ends of a speech element and to the acoustic feature parameters output from a speech element selector, and accordingly corrects the acoustic feature parameters of the speech element. The parameter correction unit corrects the parameters so that the parameters ahead of and behind the speech element are equal to the target parameter for the vowel of the corresponding phoneme, and outputs the corrected parameters.
    Type: Grant
    Filed: June 9, 2004
    Date of Patent: July 27, 2010
    Assignee: Sony Corporation
    Inventor: Nobuhide Yamazaki
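
The endpoint correction this abstract describes can be illustrated with a toy NumPy sketch: shift each frame of a speech element's feature track so that its first and last frames land on the per-vowel target parameters, spreading the correction linearly across the element. The linear weighting, array shapes, and function names are assumptions made for this sketch, not the patented formulation.

```python
import numpy as np

def correct_element(features: np.ndarray,
                    target_head: np.ndarray,
                    target_tail: np.ndarray) -> np.ndarray:
    """features: (n_frames, n_dims) acoustic feature parameters of one element."""
    n = len(features)
    head_offset = target_head - features[0]     # correction needed at the front
    tail_offset = target_tail - features[-1]    # correction needed at the back
    # Fade the head correction out and the tail correction in across the
    # element, so both endpoints land exactly on the vowel targets and the
    # joins between neighboring elements line up.
    w = np.linspace(0.0, 1.0, n)[:, None]
    return features + (1.0 - w) * head_offset + w * tail_offset

if __name__ == "__main__":
    element = np.random.rand(10, 3)              # dummy feature track
    head, tail = np.zeros(3), np.ones(3)         # dummy vowel targets
    corrected = correct_element(element, head, tail)
    print(corrected[0], corrected[-1])           # equal to head and tail targets
```
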
  • Patent number: 7596497
    Abstract: A speech synthesis apparatus and a speech synthesis method in which a waveform of a desired formant shape may be generated with a small volume of computing operations. A voiced sound generating unit of the speech synthesis apparatus includes n single formant generating units, an adder for summing their outputs to generate a one-pitch waveform, a one-pitch buffer unit, and a waveform overlapping unit for overlapping a number of the one-pitch waveforms, each shifted by one pitch period. Each single formant generating unit is supplied with three parameters, namely the center frequency of a formant representing the formant position, a formant bandwidth, and a formant gain, and reads out the band characteristics waveform, at a readout interval derived from the bandwidth wn, from a band characteristics waveform storage unit to effect expansion along the time axis.
    Type: Grant
    Filed: June 7, 2004
    Date of Patent: September 29, 2009
    Assignee: Sony Corporation
    Inventor: Nobuhide Yamazaki
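
The voiced-sound generator described here lends itself to a compact sketch: model each single formant generating unit as an exponentially damped sinusoid whose decay rate follows the formant bandwidth, sum the outputs into a one-pitch waveform, and overlap-add copies shifted by one pitch period. The patent's table lookup of stored band characteristics waveforms is replaced here by direct computation, and all parameter values are arbitrary demo numbers.

```python
import numpy as np

def one_pitch_waveform(formants, fs=16000, length=400):
    """Sum n damped sinusoids, one per (center_freq_hz, bandwidth_hz, gain)."""
    t = np.arange(length) / fs
    wave = np.zeros(length)
    for freq, bw, gain in formants:
        # The bandwidth sets the decay rate, i.e. the expansion of the
        # band characteristics waveform along the time axis.
        wave += gain * np.exp(-np.pi * bw * t) * np.sin(2 * np.pi * freq * t)
    return wave

def overlap_add(pitch_wave, pitch_period, n_periods):
    """Overlap one-pitch waveforms, shifting by one pitch period each time."""
    out = np.zeros(pitch_period * n_periods + len(pitch_wave))
    for k in range(n_periods):
        start = k * pitch_period
        out[start:start + len(pitch_wave)] += pitch_wave
    return out

if __name__ == "__main__":
    pw = one_pitch_waveform([(500, 80, 1.0), (1500, 120, 0.6), (2500, 160, 0.3)])
    voiced = overlap_add(pw, pitch_period=160, n_periods=50)   # 100 Hz pitch
    print(voiced.shape)
```
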
  • Publication number: 20080256120
    Abstract: The text format of input data is checked, and is converted into a system-manipulated format. It is further determined if the input data is in an HTML or e-mail format using tags, heading information, and the like. The converted data is divided into blocks in a simple manner such that elements in the blocks can be checked based on repetition of predetermined character patterns. Each block section is tagged with a tag indicating a block. The data divided into blocks is parsed based on tags, character patterns, etc., and is structured. A table in text is also parsed, and is segmented into cells. Finally, tree-structured data having a hierarchical structure is generated based on the sentence-structured data. A sentence-extraction template paired with the tree-structured data is used to extract sentences.
    Type: Application
    Filed: December 28, 2007
    Publication date: October 16, 2008
    Applicant: Sony Corporation
    Inventors: Kenichiro Kobayashi, Makoto Akabane, Tomoaki Nitta, Nobuhide Yamazaki, Erika Kobayashi
  • Patent number: 7412390
    Abstract: Emotion is to be added to synthesized speech while the prosodic features of the language are maintained. In a speech synthesis device 200, a language processor 201 generates a string of pronunciation marks from the text, and a prosodic data generating unit 202 creates prosodic data expressing parameters of the phonemes, such as time duration, pitch, and sound volume, based on the string of pronunciation marks. A constraint information generating unit 203 is fed with the prosodic data and with the string of pronunciation marks to generate constraint information that limits the changes in the parameters, and adds the generated constraint information to the prosodic data. An emotion filter 204, fed with the prosodic data to which the constraint information has been added, changes the parameters of the prosodic data, within the constraints, responsive to the emotional state information imparted to it.
    Type: Grant
    Filed: March 13, 2003
    Date of Patent: August 12, 2008
    Assignees: Sony France S.A., Sony Corporation
    Inventors: Erika Kobayashi, Toshiyuki Kumakura, Makoto Akabane, Kenichiro Kobayashi, Nobuhide Yamazaki, Tomoaki Nitta, Pierre Yves Oudeyer
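
The constrained emotion filter can be pictured with a small sketch: emotion-dependent factors scale the prosodic parameters, and a constraint record clamps how far each parameter may move, so the prosodic features of the language survive the emotional coloring. The emotion names, scale factors, and data layout are invented for illustration.

```python
EMOTION_SCALES = {
    "happy": {"pitch": 1.2, "duration": 0.9, "volume": 1.1},
    "sad":   {"pitch": 0.9, "duration": 1.2, "volume": 0.8},
}

def emotion_filter(prosody, emotion, constraints):
    """prosody: per-phoneme parameter dicts; constraints: (min, max) bounds."""
    scales = EMOTION_SCALES[emotion]
    out = []
    for phoneme in prosody:
        adjusted = dict(phoneme)
        for param, scale in scales.items():
            lo, hi = constraints[param]
            # Change each parameter only within its constraint bounds.
            adjusted[param] = min(hi, max(lo, phoneme[param] * scale))
        out.append(adjusted)
    return out

if __name__ == "__main__":
    prosody = [{"phoneme": "a", "pitch": 200.0, "duration": 0.1, "volume": 1.0}]
    limits = {"pitch": (80.0, 220.0), "duration": (0.05, 0.3), "volume": (0.5, 1.5)}
    print(emotion_filter(prosody, "happy", limits))
```
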
  • Patent number: 7379871
    Abstract: Various sensors detect conditions outside a robot and operations applied to the robot, and output the results of detection to a robot-motion-system control section. The robot-motion-system control section determines a behavior state according to a behavior model. A robot-thinking-system control section determines an emotion state according to an emotion model. A speech-synthesizing-control-information selection section determines a field in a speech-synthesizing-control-information table according to the behavior state and the emotion state. A language processing section grammatically analyzes a text for speech synthesis sent from the robot-thinking-system control section, converts a predetermined portion according to the speech-synthesizing control information, and outputs the result to a rule-based speech synthesizing section. The rule-based speech synthesizing section synthesizes a speech signal corresponding to the text for speech synthesis.
    Type: Grant
    Filed: December 27, 2000
    Date of Patent: May 27, 2008
    Assignee: Sony Corporation
    Inventors: Masato Shimakawa, Nobuhide Yamazaki, Erika Kobayashi, Makoto Akabane, Kenichiro Kobayashi, Keiichi Yamada, Tomoaki Nitta
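
The table selection at the heart of this abstract reduces to a two-key lookup: the pair (behavior state, emotion state) picks a field in the speech-synthesizing-control-information table, and the selected field drives the synthesizer. The states, fields, and fallback behavior below are assumptions for the sketch.

```python
CONTROL_TABLE = {
    ("resting", "happy"): {"rate": 1.0, "pitch_shift": +2.0},
    ("resting", "angry"): {"rate": 1.1, "pitch_shift": -1.0},
    ("walking", "happy"): {"rate": 1.2, "pitch_shift": +3.0},
    ("walking", "angry"): {"rate": 1.3, "pitch_shift": -2.0},
}

def select_control_info(behavior_state: str, emotion_state: str) -> dict:
    """Pick the table field for the current behavior and emotion states."""
    # Fall back to neutral settings for state pairs absent from the table.
    return CONTROL_TABLE.get((behavior_state, emotion_state),
                             {"rate": 1.0, "pitch_shift": 0.0})

def synthesize(text: str, control: dict) -> str:
    """Stand-in for the rule-based speech synthesizing section."""
    return f"[rate={control['rate']} pitch={control['pitch_shift']:+.1f}] {text}"

if __name__ == "__main__":
    ctrl = select_control_info("walking", "happy")
    print(synthesize("Hello!", ctrl))
```
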
  • Patent number: 7315867
    Abstract: The text format of input data is checked, and is converted into a system-manipulated format. It is further determined if the input data is in an HTML or e-mail format using tags, heading information, and the like. The converted data is divided into blocks in a simple manner such that elements in the blocks can be checked based on repetition of predetermined character patterns. Each block section is tagged with a tag indicating a block. The data divided into blocks is parsed based on tags, character patterns, etc., and is structured. A table in text is also parsed, and is segmented into cells. Finally, tree-structured data having a hierarchical structure is generated based on the sentence-structured data. A sentence-extraction template paired with the tree-structured data is used to extract sentences.
    Type: Grant
    Filed: July 20, 2005
    Date of Patent: January 1, 2008
    Assignee: Sony Corporation
    Inventors: Kenichiro Kobayashi, Makoto Akabane, Tomoaki Nitta, Nobuhide Yamazaki, Erika Kobayashi
  • Patent number: 7111011
    Abstract: The text format of input data is checked, and is converted into a system-manipulated format. It is further determined if the input data is in an HTML or e-mail format using tags, heading information, and the like. The converted data is divided into blocks in a simple manner such that elements in the blocks can be checked based on repetition of predetermined character patterns. Each block section is tagged with a tag indicating a block. The data divided into blocks is parsed based on tags, character patterns, etc., and is structured. A table in text is also parsed, and is segmented into cells. Finally, tree-structured data having a hierarchical structure is generated based on the sentence-structured data. A sentence-extraction template paired with the tree-structured data is used to extract sentences.
    Type: Grant
    Filed: May 10, 2002
    Date of Patent: September 19, 2006
    Assignee: Sony Corporation
    Inventors: Kenichiro Kobayashi, Makoto Akabane, Tomoaki Nitta, Nobuhide Yamazaki, Erika Kobayashi
  • Patent number: 7080015
    Abstract: In a synchronization control apparatus, a voice-language-information generating section generates the voice language information of a word which a robot utters. A voice synthesizing section calculates phoneme information and a phoneme continuation duration according to the voice language information, and also generates synthesized-voice data according to an adjusted phoneme continuation duration. An articulation-operation generating section calculates an articulation-operation period according to the phoneme information. A voice-operation adjusting section adjusts the phoneme continuation duration and the articulation-operation period. An articulation-operation executing section operates an organ of articulation according to the adjusted articulation-operation period.
    Type: Grant
    Filed: August 26, 2004
    Date of Patent: July 18, 2006
    Assignee: Sony Corporation
    Inventors: Keiichi Yamada, Kenichiro Kobayashi, Tomoaki Nitta, Makoto Akabane, Masato Shimakawa, Nobuhide Yamazaki, Erika Kobayashi
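
The adjustment step can be illustrated in a few lines: for each phoneme, take the longer of the phoneme continuation duration and the corresponding articulation-operation period, and let both the voice and the organ of articulation run on that shared value so they stay in lockstep. Using max() as the reconciliation rule is an assumption of this sketch, not necessarily the patented adjustment.

```python
def adjust_durations(phoneme_durations: list[float],
                     articulation_periods: list[float]) -> list[float]:
    """Return a common per-phoneme duration for the voice and the motion."""
    return [max(voice, motion)
            for voice, motion in zip(phoneme_durations, articulation_periods)]

if __name__ == "__main__":
    voice = [0.08, 0.12, 0.10]    # phoneme continuation durations (seconds)
    mouth = [0.10, 0.10, 0.15]    # articulation-operation periods (seconds)
    print(adjust_durations(voice, mouth))   # [0.1, 0.12, 0.15]
```
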
  • Patent number: 7062438
    Abstract: A sentence or a song is to be synthesized with natural speech close to the human voice. To this end, singing metrical data are formed in a tag processing unit 211 within a singing synthesis unit 212 of a speech synthesis apparatus 200, based on singing data and an analyzed text portion. A language analysis unit 213 performs language processing on text portions other than the singing data. For a text portion registered in a natural metrical dictionary, as determined by this language processing, the corresponding natural metrical data is selected and its parameters are adjusted in a metrical data adjustment unit 222, based on phonemic segment data from a phonemic segment storage unit 223. For a text portion not registered in the natural metrical dictionary, a phonemic symbol string is generated in a natural metrical dictionary storage unit 214, after which metrical data are generated in a metrical generating unit 221.
    Type: Grant
    Filed: March 13, 2003
    Date of Patent: June 13, 2006
    Assignee: Sony Corporation
    Inventors: Kenichiro Kobayashi, Nobuhide Yamazaki, Makoto Akabane
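
The routing between natural and rule-generated prosody can be sketched as a dictionary lookup with a fallback: text registered in the natural metrical dictionary reuses its stored prosody, and unregistered text gets metrical data generated by rule from a phonemic symbol string. The dictionary contents and data shapes here are invented.

```python
NATURAL_METRICAL_DICT = {
    "hello": {"phonemes": ["h", "e", "l", "o"], "pitch": [120, 130, 125, 110]},
}

def rule_based_metrical(text: str) -> dict:
    """Generate metrical data by rule from a stand-in phonemic symbol string."""
    phonemes = list(text)
    return {"phonemes": phonemes, "pitch": [100] * len(phonemes)}

def metrical_data(text: str) -> dict:
    entry = NATURAL_METRICAL_DICT.get(text)
    if entry is not None:
        return entry                    # natural metrical data, as recorded
    return rule_based_metrical(text)    # fallback: generated by rule

if __name__ == "__main__":
    print(metrical_data("hello"))       # from the natural metrical dictionary
    print(metrical_data("world"))       # generated by rule
```
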
  • Publication number: 20050251737
    Abstract: The text format of input data is checked, and is converted into a system-manipulated format. It is further determined if the input data is in an HTML or e-mail format using tags, heading information, and the like. The converted data is divided into blocks in a simple manner such that elements in the blocks can be checked based on repetition of predetermined character patterns. Each block section is tagged with a tag indicating a block. The data divided into blocks is parsed based on tags, character patterns, etc., and is structured. A table in text is also parsed, and is segmented into cells. Finally, tree-structured data having a hierarchical structure is generated based on the sentence-structured data. A sentence-extraction template paired with the tree-structured data is used to extract sentences.
    Type: Application
    Filed: July 20, 2005
    Publication date: November 10, 2005
    Applicant: Sony Corporation
    Inventors: Kenichiro Kobayashi, Makoto Akabane, Tomoaki Nitta, Nobuhide Yamazaki, Erika Kobayashi
  • Publication number: 20050119889
    Abstract: A rule-based speech synthesis apparatus by which concatenation distortion may be kept below a preset value without dependency on the utterance, wherein a parameter correction unit reads out a target parameter for a vowel from a target parameter storage, responsive to the phonemes at the leading and trailing ends of a speech element and to the acoustic feature parameters output from a speech element selector, and accordingly corrects the acoustic feature parameters of the speech element. The parameter correction unit corrects the parameters so that the parameters ahead of and behind the speech element are equal to the target parameter for the vowel of the corresponding phoneme, and outputs the corrected parameters.
    Type: Application
    Filed: June 9, 2004
    Publication date: June 2, 2005
    Inventor: Nobuhide Yamazaki
  • Patent number: 6865535
    Abstract: In a synchronization control apparatus, a voice-language-information generating section generates the voice language information of a word which a robot utters. A voice synthesizing section calculates phoneme information and a phoneme continuation duration according to the voice language information, and also generates synthesized-voice data according to an adjusted phoneme continuation duration. An articulation-operation generating section calculates an articulation-operation period according to the phoneme information. A voice-operation adjusting section adjusts the phoneme continuation duration and the articulation-operation period. An articulation-operation executing section operates an organ of articulation according to the adjusted articulation-operation period.
    Type: Grant
    Filed: December 27, 2000
    Date of Patent: March 8, 2005
    Assignee: Sony Corporation
    Inventors: Keiichi Yamada, Kenichiro Kobayashi, Tomoaki Nitta, Makoto Akabane, Masato Shimakawa, Nobuhide Yamazaki, Erika Kobayashi
  • Publication number: 20050027540
    Abstract: In a synchronization control apparatus, a voice-language-information generating section generates the voice language information of a word which a robot utters. A voice synthesizing section calculates phoneme information and a phoneme continuation duration according to the voice language information, and also generates synthesized-voice data according to an adjusted phoneme continuation duration. An articulation-operation generating section calculates an articulation-operation period according to the phoneme information. A voice-operation adjusting section adjusts the phoneme continuation duration and the articulation-operation period. An articulation-operation executing section operates an organ of articulation according to the adjusted articulation-operation period.
    Type: Application
    Filed: August 26, 2004
    Publication date: February 3, 2005
    Inventors: Keiichi Yamada, Kenichiro Kobayashi, Tomoaki Nitta, Makoto Akabane, Masato Shimakawa, Nobuhide Yamazaki, Erika Kobayashi
  • Publication number: 20050010414
    Abstract: A speech synthesis apparatus and a speech synthesis method in which a waveform of a desired formant shape may be generated with a small volume of computing operations. A voiced sound generating unit of the speech synthesis apparatus includes n single formant generating units, an adder for summing their outputs to generate a one-pitch waveform, a one-pitch buffer unit, and a waveform overlapping unit for overlapping a number of the one-pitch waveforms, each shifted by one pitch period. Each single formant generating unit is supplied with three parameters, namely the center frequency of a formant representing the formant position, a formant bandwidth, and a formant gain, and reads out the band characteristics waveform, at a readout interval derived from the bandwidth wn, from a band characteristics waveform storage unit to effect expansion along the time axis.
    Type: Application
    Filed: June 7, 2004
    Publication date: January 13, 2005
    Inventor: Nobuhide Yamazaki
  • Publication number: 20040019484
    Abstract: Emotion is to be added to synthesized speech while the prosodic features of the language are maintained. In a speech synthesis device 200, a language processor 201 generates a string of pronunciation marks from the text, and a prosodic data generating unit 202 creates prosodic data expressing parameters of the phonemes, such as time duration, pitch, and sound volume, based on the string of pronunciation marks. A constraint information generating unit 203 is fed with the prosodic data and with the string of pronunciation marks to generate constraint information that limits the changes in the parameters, and adds the generated constraint information to the prosodic data. An emotion filter 204, fed with the prosodic data to which the constraint information has been added, changes the parameters of the prosodic data, within the constraints, responsive to the emotional state information imparted to it.
    Type: Application
    Filed: March 13, 2003
    Publication date: January 29, 2004
    Inventors: Erika Kobayashi, Toshiyuki Kumakura, Makoto Akabane, Kenichiro Kobayashi, Nobuhide Yamazaki, Tomoaki Nitta, Pierre Yves Oudeyer
  • Publication number: 20040019485
    Abstract: A sentence or a song is to be synthesized with natural speech close to the human voice. To this end, singing metrical data are formed in a tag processing unit 211 within a singing synthesis unit 212 of a speech synthesis apparatus 200, based on singing data and an analyzed text portion. A language analysis unit 213 performs language processing on text portions other than the singing data. For a text portion registered in a natural metrical dictionary, as determined by this language processing, the corresponding natural metrical data is selected and its parameters are adjusted in a metrical data adjustment unit 222, based on phonemic segment data from a phonemic segment storage unit 223. For a text portion not registered in the natural metrical dictionary, a phonemic symbol string is generated in a natural metrical dictionary storage unit 214, after which metrical data are generated in a metrical generating unit 221.
    Type: Application
    Filed: March 13, 2003
    Publication date: January 29, 2004
    Inventors: Kenichiro Kobayashi, Nobuhide Yamazaki, Makoto Akabane
  • Publication number: 20030163320
    Abstract: The present invention relates to a speech synthesis apparatus for generating an emotionally expressive synthesized voice, produced by changing the tone of the synthesized voice in accordance with an emotional state. A parameter generator 43 generates transform parameters and synthesis control parameters on the basis of state information indicating the emotional state of a pet robot. A data transformer 44 transforms the frequency characteristics of phonemic unit data serving as speech information. A waveform generator 42 obtains the necessary phonemic unit data on the basis of phoneme information included in a text analysis result, processes and connects the phonemic unit data on the basis of prosody data and the synthesis control parameters, and generates synthesized voice data with the corresponding prosody and tone. The present invention is applicable to robots that output synthesized voices.
    Type: Application
    Filed: April 24, 2003
    Publication date: August 28, 2003
    Inventors: Nobuhide Yamazaki, Kenichiro Kobayashi, Yasuharu Asano, Shinichi Kariya, Yaeko Fujita
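
One way to picture "transforming the frequency characteristics of phonemic unit data" is an emotion-dependent spectral tilt applied in the FFT domain; the tilt values per emotion, and the choice of tilt as the transform, are assumptions of this sketch rather than the patent's actual transform parameters.

```python
import numpy as np

TILT_DB_PER_OCTAVE = {"happy": +3.0, "sad": -3.0, "neutral": 0.0}

def transform_unit(unit: np.ndarray, fs: int, emotion: str) -> np.ndarray:
    """Tilt the spectrum of one phonemic unit to change its tone."""
    spec = np.fft.rfft(unit)
    freqs = np.fft.rfftfreq(len(unit), d=1.0 / fs)
    octaves = np.log2(np.maximum(freqs, 1.0) / 1000.0)   # octaves re 1 kHz
    gain_db = TILT_DB_PER_OCTAVE[emotion] * octaves
    spec *= 10.0 ** (gain_db / 20.0)                     # apply the tilt
    return np.fft.irfft(spec, n=len(unit))

if __name__ == "__main__":
    fs = 16000
    t = np.arange(fs // 10) / fs
    unit = np.sin(2 * np.pi * 220 * t)    # dummy phonemic unit (a pure tone)
    print(transform_unit(unit, fs, "happy").shape)
```
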
  • Publication number: 20030007397
    Abstract: The text format of input data is checked, and is converted into a system-manipulated format. It is further determined if the input data is in an HTML or e-mail format using tags, heading information, and the like. The converted data is divided into blocks in a simple manner such that elements in the blocks can be checked based on repetition of predetermined character patterns. Each block section is tagged with a tag indicating a block. The data divided into blocks is parsed based on tags, character patterns, etc., and is structured. A table in text is also parsed, and is segmented into cells. Finally, tree-structured data having a hierarchical structure is generated based on the sentence-structured data. A sentence-extraction template paired with the tree-structured data is used to extract sentences.
    Type: Application
    Filed: May 10, 2002
    Publication date: January 9, 2003
    Inventors: Kenichiro Kobayashi, Makoto Akabane, Tomoaki Nitta, Nobuhide Yamazaki, Erika Kobayashi
  • Publication number: 20010021907
    Abstract: Various sensors detect conditions outside a robot and operations applied to the robot, and output the results of detection to a robot-motion-system control section. The robot-motion-system control section determines a behavior state according to a behavior model. A robot-thinking-system control section determines an emotion state according to an emotion model. A speech-synthesizing-control-information selection section determines a field in a speech-synthesizing-control-information table according to the behavior state and the emotion state. A language processing section grammatically analyzes a text for speech synthesis sent from the robot-thinking-system control section, converts a predetermined portion according to the speech-synthesizing control information, and outputs the result to a rule-based speech synthesizing section. The rule-based speech synthesizing section synthesizes a speech signal corresponding to the text for speech synthesis.
    Type: Application
    Filed: December 27, 2000
    Publication date: September 13, 2001
    Inventors: Masato Shimakawa, Nobuhide Yamazaki, Erika Kobayashi, Makoto Akabane, Kenichiro Kobayashi, Keiichi Yamada, Tomoaki Nitta