Patents by Inventor Kouichirou Mori

Kouichirou Mori has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10978076
    Abstract: A speaker retrieval device includes a first converting unit, a receiving unit, and a searching unit. The first converting unit converts, using an inverse transform model of a first conversion model for converting score vectors representing the features of voice quality into acoustic models, pre-registered acoustic models into score vectors; and registers the score vectors in a corresponding manner to a speaker identifier in score management information. The receiving unit receives input of a score vector. The searching unit searches the score management information for the speaker identifiers whose score vectors are similar to the received score vector.
    Type: Grant
    Filed: September 17, 2019
    Date of Patent: April 13, 2021
    Assignees: Kabushiki Kaisha Toshiba, Toshiba Digital Solutions Corporation
    Inventors: Kouichirou Mori, Masaru Suzuki, Yamato Ohtani, Masahiro Morita
  • Patent number: 10930264
    Abstract: A voice quality preference learning device according to an embodiment includes a storage, a user interface system, and a learning processor. The storage stores a plurality of acoustic models. The user interface system receives an operation input indicating a voice quality preference of a user for voice quality. The learning processor learns a preference model corresponding to the voice quality preference of the user based at least in part on the operation input, the operation input associated with a voice quality space, wherein the voice quality space is obtained by dimensionally reducing the plurality of acoustic models.
    Type: Grant
    Filed: February 8, 2017
    Date of Patent: February 23, 2021
    Assignees: Kabushiki Kaisha Toshiba, Toshiba Digital Solutions Corporation
    Inventor: Kouichirou Mori
  • Patent number: 10872597
    Abstract: A speech synthesis dictionary delivery device that delivers a dictionary for performing speech synthesis to terminals, comprises a storage device for speech synthesis dictionary database that stores a first dictionary which includes an acoustic model of a speaker and is associated with identification information of the speaker, that stores a second dictionary which includes an acoustic model generated using voice data of a plurality of speakers, and that stores parameter sets of the speakers to be used with the second dictionary and which are associated with identification information of the speakers, a processor that determines one of the first dictionary and the second dictionary, which should be used in the terminal for a specified speaker, and an input output interface (I/F) that receives the identification information of a speaker transmitted from the terminal and then delivers at least one of a first dictionary, the second dictionary, and a parameter set of the second dictionary, on the basis of the rec
    Type: Grant
    Filed: August 8, 2018
    Date of Patent: December 22, 2020
    Assignees: Kabushiki Kaisha Toshiba, Toshiba Digital Solutions Corporation
    Inventors: Kouichirou Mori, Gou Hirabayashi, Masahiro Morita, Yamato Ohtani
  • Publication number: 20200066250
    Abstract: A speech synthesis device according to an embodiment includes a speech synthesizing unit, a speaker parameter storing unit, an availability determining unit, and a speaker parameter control unit. Based on a speaker parameter value representing a set of values of parameters related to the speaker individuality, the speech synthesizing unit is capable of controlling the speaker individuality of synthesized speech. The speaker parameter storing unit is used to store already-registered speaker parameter values. Based on the result of comparing an input speaker parameter value with each already-registered speaker parameter value, the availability determining unit determines the availability of the input speaker parameter value. The speaker parameter control unit prohibits or restricts the use of the input speaker parameter value that is determined to be unavailable by the availability determining unit.
    Type: Application
    Filed: September 5, 2019
    Publication date: February 27, 2020
    Applicant: Toshiba Digital Solutions Corporation
    Inventors: Masahiro Morita, Kouichirou Mori, Yamato Ohtani
  • Patent number: 10540956
    Abstract: According to one embodiment, a training apparatus for speech synthesis includes a storage device and a hardware processor in communication with the storage device. The storage device stores an average voice model, training speaker information representing a feature of speech of a training speaker, and perception representation information represented by scores of one or more perception representations related to voice quality of the training speaker, the average voice model constructed by utilizing acoustic data extracted from speech waveforms of a plurality of speakers and language data. The hardware processor, based at least in part on the average voice model, the training speaker information, and the perception representation score, trains one or more perception representation acoustic models corresponding to the one or more perception representations.
    Type: Grant
    Filed: September 6, 2016
    Date of Patent: January 21, 2020
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Yamato Ohtani, Kouichirou Mori
  • Patent number: 10535335
    Abstract: According to one embodiment, a voice synthesizing device includes a first operation receiving unit, a score transforming unit, and a voice synthesizing unit. The first operation receiving unit is configured to receive a first operation specifying voice quality of a desired voice based on one or more upper level expressions indicating the voice quality. The score transforming unit is configured to transform, based on a score transformation model that transforms a score of the upper level expression into a score of a lower level expression which is less abstract than the upper level expression, the score of the upper level expression corresponding to the first operation into a score of one or more lower level expressions. The voice synthesizing unit is configured to generate a synthetic sound corresponding to a certain text based on the score of the lower level expression.
    Type: Grant
    Filed: September 2, 2016
    Date of Patent: January 14, 2020
    Assignees: Kabushiki Kaisha Toshiba, Toshiba Digital Solutions Corporation
    Inventors: Kouichirou Mori, Yamato Ohtani
  • Publication number: 20200013409
    Abstract: A speaker retrieval device includes a first converting unit, a receiving unit, and a searching unit. The first converting unit converts, using an inverse transform model of a first conversion model for converting score vectors representing the features of voice quality into acoustic models, pre-registered acoustic models into score vectors; and registers the score vectors in a corresponding manner to a speaker identifier in score management information. The receiving unit receives input of a score vector. The searching unit searches the score management information for the speaker identifiers whose score vectors are similar to the received score vector.
    Type: Application
    Filed: September 17, 2019
    Publication date: January 9, 2020
    Applicants: Kabushiki Kaisha Toshiba, Toshiba Digital Solutions Corporation
    Inventors: Kouichirou Mori, Masaru Suzuki, Yamato Ohtani, Masahiro Morita
  • Publication number: 20190066656
    Abstract: A speech synthesis dictionary delivery device that delivers a dictionary for performing speech synthesis to terminals, comprises a storage device for speech synthesis dictionary database that stores a first dictionary which includes an acoustic model of a speaker and is associated with identification information of the speaker, that stores a second dictionary which includes an acoustic model generated using voice data of a plurality of speakers, and that stores parameter sets of the speakers to be used with the second dictionary and which are associated with identification information of the speakers, a processor that determines one of the first dictionary and the second dictionary, which should be used in the terminal for a specified speaker, and an input output interface (I/F) that receives the identification information of a speaker transmitted from the terminal and then delivers at least one of a first dictionary, the second dictionary, and a parameter set of the second dictionary, on the basis of the rec
    Type: Application
    Filed: August 8, 2018
    Publication date: February 28, 2019
    Applicants: Kabushiki Kaisha Toshiba, Toshiba Digital Solutions Corporation
    Inventors: Kouichirou Mori, Gou Hirabayashi, Masahiro Morita, Yamato Ohtani
  • Publication number: 20170270907
    Abstract: A voice quality preference learning device according to an embodiment includes a storage, a user interface system, and a learning processor. The storage stores a plurality of acoustic models. The user interface system receives an operation input indicating a voice quality preference of a user for voice quality. The learning processor learns a preference model corresponding to the voice quality preference of the user based at least in part on the operation input, the operation input associated with a voice quality space, wherein the voice quality space is obtained by dimensionally reducing the plurality of acoustic models.
    Type: Application
    Filed: February 8, 2017
    Publication date: September 21, 2017
    Inventor: Kouichirou Mori
  • Patent number: 9626338
    Abstract: According to one embodiment, a markup assistance apparatus includes an acquisition unit, a first calculation unit, a detection unit and a presentation unit. The acquisition unit acquires a feature amount for respective tags, each of the tags being used to control text-to-speech processing of a markup text. The first calculation unit calculates, for respective character strings, a variance of feature amounts of the tags which are assigned to the character string in a markup text. The detection unit detects a first character string assigned a first tag having the variance not less than a first threshold value as a first candidate including the tag to be corrected. The presentation unit presents the first candidate.
    Type: Grant
    Filed: January 15, 2015
    Date of Patent: April 18, 2017
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kouichirou Mori, Masahiro Morita
  • Patent number: 9601106
    Abstract: According to one embodiment, a prosody editing apparatus includes a storage, a first selection unit, a search unit, a normalization unit, a mapping unit, a display, a second selection unit, a restoring unit and a replacing unit. The search unit searches the storage for one or more second prosodic patterns corresponding to attribute information that matches attribute information of the selected phrase. The mapping unit maps each of the normalized second prosodic patterns on a low-dimensional space. The restoring unit restores a restored prosodic pattern according to the selected coordinates. The replacing unit replaces prosody of synthetic speech generated based on the selected phrase by the restored prosodic pattern.
    Type: Grant
    Filed: August 15, 2013
    Date of Patent: March 21, 2017
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kouichirou Mori, Takehiko Kagoshima, Masahiro Morita
  • Publication number: 20170076714
    Abstract: According to one embodiment, a voice synthesizing device includes a first operation receiving unit, a score transforming unit, and a voice synthesizing unit. The first operation receiving unit is configured to receive a first operation specifying voice quality of a desired voice based on one or more upper level expressions indicating the voice quality. The score transforming unit is configured to transform, based on a score transformation model that transforms a score of the upper level expression into a score of a lower level expression which is less abstract than the upper level expression, the score of the upper level expression corresponding to the first operation into a score of one or more lower level expressions. The voice synthesizing unit is configured to generate a synthetic sound corresponding to a certain text based on the score of the lower level expression.
    Type: Application
    Filed: September 2, 2016
    Publication date: March 16, 2017
    Inventors: Kouichirou Mori, Yamato Ohtani
  • Publication number: 20170076715
    Abstract: According to one embodiment, a training apparatus for speech synthesis includes a storage device and a hardware processor in communication with the storage device. The storage device stores an average voice model, training speaker information representing a feature of speech of a training speaker, and perception representation information represented by scores of one or more perception representations related to voice quality of the training speaker, the average voice model constructed by utilizing acoustic data extracted from speech waveforms of a plurality of speakers and language data. The hardware processor, based at least in part on the average voice model, the training speaker information, and the perception representation score, trains one or more perception representation acoustic models corresponding to the one or more perception representations.
    Type: Application
    Filed: September 6, 2016
    Publication date: March 16, 2017
    Inventors: Yamato Ohtani, Kouichirou Mori
  • Patent number: 9466225
    Abstract: According to one embodiment, a speech learning apparatus includes a detection unit, a first calculation unit, a generation unit, an addition unit and a speech synthesis unit. The first calculation unit calculates a score indicating a degree of emphasis of a keyword based on a type of the marker and a manner of selecting the keyword. The generation unit generates a synthesis parameter to determine a degree of reading of the keyword in accordance with the score and the type of the marker. The addition unit adds to the keyword a tag for reading the keyword in accordance with the synthesis parameter. The speech synthesis unit generates synthesized speech obtained by synthesizing speech of the keyword in accordance with the tag.
    Type: Grant
    Filed: August 14, 2013
    Date of Patent: October 11, 2016
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kouichirou Mori, Masahiro Morita
  • Patent number: 9280967
    Abstract: According to one embodiment, an apparatus for supporting reading of a document includes a model storage unit, a document acquisition unit, a feature information extraction unit, and an utterance style estimation unit. The model storage unit is configured to store a model which has trained a correspondence relationship between first feature information and an utterance style. The first feature information is extracted from a plurality of sentences in a training document. The document acquisition unit is configured to acquire a document to be read. The feature information extraction unit is configured to extract second feature information from each sentence in the document to be read. The utterance style estimation unit is configured to compare the second feature information of a plurality of sentences in the document to be read with the model, and to estimate an utterance style of each sentence of the document to be read.
    Type: Grant
    Filed: September 14, 2011
    Date of Patent: March 8, 2016
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kosei Fume, Masaru Suzuki, Masahiro Morita, Kentaro Tachibana, Kouichirou Mori, Yuji Shimizu, Takehiko Kagoshima, Masatsune Tamura, Tomohiro Yamasaki
  • Patent number: 9076069
    Abstract: An adding metadata apparatus includes a first acquisition unit which acquires a first image, first metadata, and a first position within the first image, the first position being for displaying the first metadata; an extraction unit which extracts local features from the first image; a calculation unit which searches for a group of the local features within a predetermined distance from the first position, and calculates a representative point of the group; a search unit which matches the first image with a plurality of images stored in a database by using the local features, and searches for a second image which coincides with the local features; and a registration unit which calculates a second position, within the second image, corresponding to the representative point, and registers the second position and the first metadata as metadata for the second image.
    Type: Grant
    Filed: March 14, 2011
    Date of Patent: July 7, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Kouichirou Mori
  • Publication number: 20150128026
    Abstract: According to one embodiment, a markup assistance apparatus includes an acquisition unit, a first calculation unit, a detection unit and a presentation unit. The acquisition unit acquires a feature amount for respective tags, each of the tags being used to control text-to-speech processing of a markup text. The first calculation unit calculates, for respective character strings, a variance of feature amounts of the tags which are assigned to the character string in a markup text. The detection unit detects a first character string assigned a first tag having the variance not less than a first threshold value as a first candidate including the tag to be corrected. The presentation unit presents the first candidate.
    Type: Application
    Filed: January 15, 2015
    Publication date: May 7, 2015
    Inventors: Kouichirou Mori, Masahiro Morita
  • Publication number: 20150081306
    Abstract: According to an embodiment, a prosody editing device includes an approximate contour generator, a setter, a display controller, an operation receiver, and an updater. The approximate contour generator approximates a contour representing a time series of prosody information with a parametric curve including a control point to generate an approximate contour. The setter sets, on the approximate contour, an operation point corresponding to the control point. The display controller displays, on a display device, an operation screen including the approximate contour on which the operation point is shown. The operation receiver receives an operation to move the operation point optionally selected on the operation screen. The updater calculates a position of the control point from a moving amount of the operation point and updates the approximate contour.
    Type: Application
    Filed: September 2, 2014
    Publication date: March 19, 2015
    Inventors: Kouichirou Mori, Yu Nasu, Masatsune Tamura, Masahiro Morita
  • Patent number: 8965769
    Abstract: According to one embodiment, a markup assistance apparatus includes an acquisition unit, a first calculation unit, a detection unit and a presentation unit. The acquisition unit acquires a feature amount for respective tags, each of the tags being used to control text-to-speech processing of a markup text. The first calculation unit calculates, for respective character strings, a variance of feature amounts of the tags which are assigned to the character string in a markup text. The detection unit detects a first character string assigned a first tag having the variance not less than a first threshold value as a first candidate including the tag to be corrected. The presentation unit presents the first candidate.
    Type: Grant
    Filed: September 24, 2012
    Date of Patent: February 24, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kouichirou Mori, Masahiro Morita
  • Patent number: 8789080
    Abstract: A viewing behavior learning apparatus includes: a viewing history acquiring unit that acquires a viewing record that indicates an attribute of a program and start and end times during which a viewer viewed the program; a viewing history dividing unit that divides the viewing record at each of the time points discretized by a unit time; a viewing history replicating unit that discretizes a viewing time period denoted by the divided viewing records by an interval shorter than the unit time to obtain discretized viewing time periods; a viewing behavior storage unit that builds a model of the viewing behavior by use of a Bayesian network having the viewing time period and the attribute as random variables and stores a conditional probability table of the Bayesian network; and a viewing behavior updating unit that updates the conditional probability table using the discretized viewing time periods.
    Type: Grant
    Filed: September 23, 2008
    Date of Patent: July 22, 2014
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kouichirou Mori, Tomoko Murakami, Ryohei Orihara