Patents by Inventor Ryo MASUMURA

Ryo MASUMURA has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11894017
    Abstract: A voice/non-voice determination device robust with respect to an acoustic signal in a high-noise environment is provided. The voice/non-voice determination device includes an acoustic scene classification unit including a first model which receives input of an acoustic signal and outputs acoustic scene information which is information regarding a scene where the acoustic signal is collected, a speech enhancement unit including a second model which receives input of the acoustic signal and outputs speech enhancement information which is information regarding the acoustic signal after enhancement, and a voice/non-voice determination unit including a third model which receives input of the acoustic signal, the acoustic scene information and the speech enhancement information and outputs a voice/non-voice label which is information regarding a label of either a speech section or a non-speech section.
    Type: Grant
    Filed: July 25, 2019
    Date of Patent: February 6, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo Masumura, Takanobu Oba, Kiyoaki Matsui
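The entry above describes a pipeline of three models: a scene classifier, a speech enhancer, and a voice/non-voice classifier that consumes both of their outputs together with the raw signal. The following is a minimal sketch, assuming PyTorch, of how such a pipeline could be wired; the module types, feature sizes, and names are illustrative assumptions, not the patented implementation.

```python
import torch
import torch.nn as nn

class VoiceActivityDetector(nn.Module):
    """Illustrative three-model pipeline: scene classifier, speech enhancer,
    and a voice/non-voice classifier that consumes all three signals."""

    def __init__(self, feat_dim=40, hidden=128, num_scenes=10):
        super().__init__()
        # First model: acoustic scene information from the input features.
        self.scene_classifier = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, num_scenes))
        # Second model: speech-enhancement information (here, a mask per feature bin).
        self.enhancer = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, feat_dim), nn.Sigmoid())
        # Third model: speech/non-speech label from the signal, scene, and enhancement info.
        self.vad = nn.Sequential(
            nn.Linear(feat_dim + num_scenes + feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, 2))

    def forward(self, acoustic_feats):           # (batch, frames, feat_dim)
        scene = self.scene_classifier(acoustic_feats).softmax(dim=-1)
        enhanced = self.enhancer(acoustic_feats) * acoustic_feats
        combined = torch.cat([acoustic_feats, scene, enhanced], dim=-1)
        return self.vad(combined)                # per-frame speech/non-speech logits

feats = torch.randn(2, 100, 40)                  # dummy batch of feature frames
logits = VoiceActivityDetector()(feats)
print(logits.shape)                              # torch.Size([2, 100, 2])
```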
  • Patent number: 11887620
    Abstract: The present invention improves the accuracy of language prediction. A history speech meta-information understanding unit 11 obtains a history speech meta-information vector from a word string of a preceding speech using a meta-information understanding device. A history speech embedding unit 12 converts the word string of the preceding speech and a speaker label into a history speech embedding vector. A speech unit combination vector construction unit 13 obtains a speech unit combination vector by combining the history speech meta-information vector and the history speech embedding vector. A speech sequence embedding vector calculation unit 14 converts a plurality of speech unit combination vectors obtained for the past speech sequences to a speech sequence embedding vector. A language model score calculation unit 15 calculates a language model score of a current speech from a word string of the current speech, a speaker label, and a speech sequence embedding vector.
    Type: Grant
    Filed: January 27, 2020
    Date of Patent: January 30, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo Masumura, Tomohiro Tanaka, Takanobu Oba
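The entry above combines, for each past speech, a meta-information vector with an utterance embedding, pools the resulting sequence into one context vector, and uses it when scoring the current speech. Below is a minimal sketch under that reading, assuming PyTorch; the recurrent encoders, dimensions, and speaker handling are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class ContextualLM(nn.Module):
    """Illustrative context-aware language model: each past utterance is summarized by
    concatenating a meta-information vector with an utterance embedding, the sequence of
    summaries is pooled into a context vector, and that vector conditions the LM score."""

    def __init__(self, vocab=1000, emb=64, meta_dim=16, hidden=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab, emb)
        self.speaker_emb = nn.Embedding(2, emb)
        self.utt_encoder = nn.GRU(emb, hidden, batch_first=True)
        self.context_encoder = nn.GRU(hidden + meta_dim, hidden, batch_first=True)
        self.lm = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(2 * hidden, vocab)

    def embed_utterance(self, words, speaker):    # words: (1, len), speaker: (1,)
        emb = self.word_emb(words) + self.speaker_emb(speaker).unsqueeze(1)
        _, h = self.utt_encoder(emb)
        return h[-1]                              # history speech embedding vector

    def forward(self, history, current_words, current_speaker):
        # history: list of (words, speaker, meta_vector) triples for the past speeches
        combos = [torch.cat([self.embed_utterance(w, s), m], dim=-1)
                  for w, s, m in history]         # speech unit combination vectors
        _, ctx = self.context_encoder(torch.stack(combos, dim=1))
        cur = self.word_emb(current_words) + self.speaker_emb(current_speaker).unsqueeze(1)
        states, _ = self.lm(cur)
        ctx = ctx[-1].unsqueeze(1).expand(-1, states.size(1), -1)
        return self.out(torch.cat([states, ctx], dim=-1))   # next-word logits
```

A real system would also need tokenization, the meta-information understanding device, and training code; the sketch only shows how the meta-information vector and the history embedding can be combined before pooling.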
  • Publication number: 20230245675
    Abstract: An environment in which an acoustic signal is collected is estimated with high accuracy without inputting auxiliary information. Input circuitry (21) receives a target acoustic signal, which is the estimation target. Estimation circuitry (22) applies a correlation between acoustic signals and explanatory texts describing them to estimate the environment in which the target acoustic signal is collected; the estimated environment is the explanatory text for the target acoustic signal obtained through the correlation. The correlation is trained so as to minimize the difference between the explanatory text assigned to an acoustic signal and the explanatory text obtained from that acoustic signal by the correlation.
    Type: Application
    Filed: May 11, 2020
    Publication date: August 3, 2023
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Yuma KOIZUMI, Ryo MASUMURA, Shoichiro SAITO
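The entry above learns a correlation that maps an acoustic signal to an explanatory text. One common way to realize such a correlation is an encoder-decoder trained with a sequence cross-entropy against the assigned text; the sketch below, assuming PyTorch, shows that generic formulation, not the specific architecture of the application.

```python
import torch
import torch.nn as nn

class AudioCaptioner(nn.Module):
    """Illustrative correlation between an acoustic signal and an explanatory text:
    an audio encoder summarizes the signal and a text decoder is trained so that the
    text it generates matches the explanatory text assigned to the signal."""

    def __init__(self, feat_dim=64, vocab=500, hidden=128):
        super().__init__()
        self.audio_encoder = nn.GRU(feat_dim, hidden, batch_first=True)
        self.word_emb = nn.Embedding(vocab, hidden)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, audio_feats, text_in):
        _, h = self.audio_encoder(audio_feats)     # summarize the acoustic signal
        states, _ = self.decoder(self.word_emb(text_in), h)
        return self.out(states)                    # logits over the next word

model = AudioCaptioner()
audio = torch.randn(4, 200, 64)                    # dummy acoustic features
text_in = torch.randint(0, 500, (4, 12))           # assigned explanatory text (shifted)
text_target = torch.randint(0, 500, (4, 12))
logits = model(audio, text_in)
# Training minimizes the difference between the assigned and the generated text.
loss = nn.functional.cross_entropy(logits.reshape(-1, 500), text_target.reshape(-1))
loss.backward()
```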
  • Publication number: 20230206118
    Abstract: Provided is a model learning technology to learn a model in consideration of a difference in label assignment accuracy between experts and non-experts.
    Type: Application
    Filed: March 19, 2020
    Publication date: June 29, 2023
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Hosana KAMIYAMA, Yuki KITAGISHI, Atsushi ANDO, Ryo MASUMURA, Takeshi MORI, Satoshi KOBASHIKAWA
  • Publication number: 20230202030
    Abstract: Provided is a work system including: an object imaging unit configured to acquire an object image by photographing an object from a work direction; a work position acquisition unit configured to acquire a work position based on an existence region of the object obtained from a machine learning model; and a work unit configured to execute work on the object based on a work position obtained by inputting the object image to the work position acquisition unit.
    Type: Application
    Filed: February 28, 2023
    Publication date: June 29, 2023
    Applicant: Kabushiki Kaisha Yaskawa Denki
    Inventors: Ryo MASUMURA, Wataru WATANABE
  • Publication number: 20230134186
    Abstract: Provided is a machine learning data generation device including: at least one processor; and at least one memory device that stores a plurality of instructions which, when executed by the at least one processor, cause the at least one processor to execute: acquiring, in association with a predetermined label, actual time series information; executing physical simulation to generate a plurality of pieces of virtual time series information; identifying parameter values based on the plurality of pieces of virtual time series information and the actual time series information, and associating the identified parameter values with the label; generating a new parameter value and the label based on the identified parameter values; generating virtual time series information corresponding to a new internal state by executing physical simulation through use of the new parameter value; and generating new machine learning data.
    Type: Application
    Filed: December 26, 2022
    Publication date: May 4, 2023
    Inventors: Ryohei SUZUKI, Tsuyoshi Yokoya, Ryo Masumura, Hiroki Tachikake
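The entry above generates labeled training data by identifying simulation parameters that reproduce observed time series and then simulating new, nearby parameter values. The toy NumPy sketch below illustrates that loop with an assumed damped-oscillation simulator and a brute-force identification step; it is not the device's actual simulation or identification method.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(params, steps=100):
    """Toy physical simulation: a damped oscillation whose damping and frequency
    stand in for the internal-state parameters."""
    damping, freq = params
    t = np.arange(steps)
    return np.exp(-damping * t) * np.sin(freq * t)

def identify_parameters(actual, candidates):
    """Pick the candidate parameters whose virtual time series best matches the
    actual time series (a stand-in for the identification step)."""
    errors = [np.mean((simulate(p) - actual) ** 2) for p in candidates]
    return candidates[int(np.argmin(errors))]

# 1) Actual time series observed in association with a predetermined label.
label = "normal"
actual = simulate((0.03, 0.5)) + rng.normal(0, 0.02, 100)

# 2) Virtual time series for many candidate parameter values.
candidates = [(d, f) for d in np.linspace(0.01, 0.1, 10) for f in np.linspace(0.1, 1.0, 10)]

# 3) Identify parameter values that reproduce the actual data; associate them with the label.
identified = identify_parameters(actual, candidates)

# 4) Generate new parameter values near the identified ones, simulate them,
#    and emit the results as new labeled machine learning data.
new_data = []
for _ in range(5):
    new_params = tuple(np.array(identified) + rng.normal(0, 0.005, 2))
    new_data.append((simulate(new_params), label))
print(identified, len(new_data))
```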
  • Publication number: 20230108419
    Abstract: A learning system includes real environment image acquisition circuitry, virtual environment image generation circuitry, and GAN learning circuitry. The real environment image acquisition circuitry is configured to acquire a real environment image showing a real environment in which real objects and a real background are provided. The virtual environment image generation circuitry is configured to generate a virtual environment image showing a virtual environment in which virtual objects and a virtual background are provided; at least one of the virtual background and the virtual objects has a color or colors different from the colors of the real background and the real objects. The GAN learning circuitry is configured to perform GAN (Generative Adversarial Networks) learning through which the virtual environment image is made more similar to the real environment image, based on the real environment image and the virtual environment image.
    Type: Application
    Filed: October 5, 2022
    Publication date: April 6, 2023
    Applicant: KABUSHIKI KAISHA YASKAWA DENKI
    Inventors: Makoto MORI, Ryo MASUMURA
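The entry above uses GAN learning to pull rendered virtual-environment images toward the appearance of real-environment images. The sketch below shows one generic adversarial training step in PyTorch, with tiny assumed convolutional networks and a fixed 32x32 image size; it illustrates the GAN principle rather than the system's actual networks.

```python
import torch
import torch.nn as nn

# Illustrative GAN step: a generator translates a virtual environment image toward the
# real-environment appearance, and a discriminator learns to tell the two apart.
generator = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1), nn.Tanh())
discriminator = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(16 * 16 * 16, 1))       # assumes 32x32 inputs

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real_img = torch.rand(8, 3, 32, 32)      # acquired real environment images
virtual_img = torch.rand(8, 3, 32, 32)   # generated virtual environment images

# Discriminator step: real images -> 1, translated virtual images -> 0.
fake = generator(virtual_img).detach()
d_loss = bce(discriminator(real_img), torch.ones(8, 1)) + \
         bce(discriminator(fake), torch.zeros(8, 1))
d_opt.zero_grad()
d_loss.backward()
d_opt.step()

# Generator step: make translated virtual images look real to the discriminator.
g_loss = bce(discriminator(generator(virtual_img)), torch.ones(8, 1))
g_opt.zero_grad()
g_loss.backward()
g_opt.step()
```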
  • Publication number: 20230072015
    Abstract: Information corresponding to a t-th word string Yt of a second text, which is a conversion result of a t-th word string Xt of a first text, is estimated on the basis of a model parameter θ, by using, as inputs, the t-th word string Xt of the first text and a sequence Ŷ1, . . . , Ŷt−1 of the first to (t−1)-th word strings of the second text, which is a conversion result of a sequence X1, . . . , Xt−1 of the first to (t−1)-th word strings of the first text. Here, t is an integer of two or greater.
    Type: Application
    Filed: February 20, 2020
    Publication date: March 9, 2023
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Mana IHORI, Ryo MASUMURA
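The entry above converts a document one word string at a time, conditioning the conversion of the t-th string on the already-converted strings Ŷ1, . . . , Ŷt−1. The PyTorch sketch below shows one way such conditioning could be realized, with assumed GRU encoders and a single concatenated context sequence; it is an illustration of the interface, not the claimed model.

```python
import torch
import torch.nn as nn

class IncrementalConverter(nn.Module):
    """Illustrative sketch: the t-th word string of the first text is converted while
    conditioning on the word strings of the second text converted so far."""

    def __init__(self, vocab=1000, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, hidden)
        self.context_enc = nn.GRU(hidden, hidden, batch_first=True)  # encodes Y^_1..Y^_{t-1}
        self.source_enc = nn.GRU(hidden, hidden, batch_first=True)   # encodes X_t
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, prev_outputs, x_t, y_in):
        # prev_outputs: the previously converted word strings as one token sequence
        _, ctx = self.context_enc(self.emb(prev_outputs))
        _, src = self.source_enc(self.emb(x_t), ctx)    # source encoding seeded by context
        states, _ = self.decoder(self.emb(y_in), src)
        return self.out(states)                         # logits for Y_t (t >= 2)

model = IncrementalConverter()
prev = torch.randint(0, 1000, (1, 30))    # Y^_1, ..., Y^_{t-1} concatenated
x_t = torch.randint(0, 1000, (1, 12))     # t-th word string of the first text
y_in = torch.randint(0, 1000, (1, 12))    # shifted t-th word string of the second text
print(model(prev, x_t, y_in).shape)       # torch.Size([1, 12, 1000])
```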
  • Patent number: 11568761
    Abstract: The present invention provides a pronunciation error detection apparatus capable of following a text without the need for a correct sentence even when erroneous recognition such as a reading error occurs.
    Type: Grant
    Filed: September 13, 2018
    Date of Patent: January 31, 2023
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Satoshi Kobashikawa, Ryo Masumura, Hosana Kamiyama, Yusuke Ijima, Yushi Aono
  • Patent number: 11556783
    Abstract: There is provided a technique for transforming a confusion network into a representation that can be used as an input for machine learning. The apparatus includes a confusion network distributed representation sequence generating part that generates a confusion network distributed representation sequence, which is a vector sequence, from an arc word set sequence and an arc weight set sequence constituting the confusion network.
    Type: Grant
    Filed: August 21, 2018
    Date of Patent: January 17, 2023
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo Masumura, Hirokazu Masataki
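The entry above turns a confusion network, given as an arc word set sequence and an arc weight set sequence, into a vector sequence. A simple and common way to do this is a weight-weighted sum of word embeddings per confusion set; the NumPy sketch below uses that assumption for illustration and is not necessarily the generating part claimed here.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"a": 0, "the": 1, "cat": 2, "hat": 3, "<eps>": 4}
emb = rng.normal(size=(len(vocab), 8))           # toy word embedding table

# A confusion network as an arc word set sequence and an arc weight set sequence:
# each position holds competing words with posterior-like weights.
arc_words = [["a", "the"], ["cat", "hat", "<eps>"]]
arc_weights = [[0.7, 0.3], [0.5, 0.4, 0.1]]

def confusion_network_distributed_representation(words_seq, weights_seq):
    """One vector per confusion set: the weight-weighted sum of word embeddings,
    giving a fixed-dimension vector sequence for the whole network."""
    vectors = []
    for words, weights in zip(words_seq, weights_seq):
        v = sum(w * emb[vocab[word]] for word, w in zip(words, weights))
        vectors.append(v)
    return np.stack(vectors)                      # (positions, embedding_dim)

print(confusion_network_distributed_representation(arc_words, arc_weights).shape)  # (2, 8)
```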
  • Publication number: 20230004870
    Abstract: Provided is a machine learning model determination system including: at least one server and at least one client terminal; an evaluation information database which stores evaluation information being information on an evaluation of machine learning; an evaluation information update module which updates the evaluation information based on a specific value of a parameter and an evaluation of the machine learning through use of specific teaching data; a teaching data input module; a verification data input module; a parameter determination module which determines the specific value of the parameter based on the evaluation information; and a machine learning engine which includes a learning module which executes learning for a machine learning model through use of the specific teaching data, and an evaluation module which evaluates a result of the machine learning through use of the specific verification data.
    Type: Application
    Filed: September 9, 2022
    Publication date: January 5, 2023
    Inventors: Masaru ADACHI, Tsuyoshi YOKOYA, Ryo MASUMURA
  • Publication number: 20220406093
    Abstract: A facial expression label is assigned to face image data of a person with high accuracy. A facial expression data set storage unit (110) stores a facial expression data set in which the facial expression label is assigned to face images in which people belonging to various groups show various facial expressions. A facial expression sampling unit (11) acquires a face image in which a person belonging to the desired group shows a desired facial expression. A representative feature quantity calculation unit (12) determines a representative feature quantity for each facial expression label from the face images of the desired group. A target data extraction unit (13) extracts target data from the facial expression data set. A target feature quantity calculation unit (14) calculates a target feature quantity from the target data.
    Type: Application
    Filed: November 19, 2019
    Publication date: December 22, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Akihiko TAKASHIMA, Ryo MASUMURA
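The entry above computes a representative feature quantity per facial expression label for a desired group and a feature quantity for each extracted target item. The NumPy sketch below illustrates one plausible reading: the representative is the per-label mean feature, and the final nearest-representative labeling step is an assumption added only to make the example end-to-end; the abstract does not state how the two quantities are combined.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feature vectors for face images of the desired group, grouped by expression label.
features_by_label = {
    "happy": rng.normal(0.0, 1.0, size=(20, 16)),
    "sad": rng.normal(2.0, 1.0, size=(20, 16)),
}

# Representative feature quantity per facial expression label: here, the mean feature.
representatives = {label: feats.mean(axis=0) for label, feats in features_by_label.items()}

# Target feature quantities calculated from target data extracted from the data set.
target_features = rng.normal(0.0, 1.0, size=(5, 16))

# Assumed final step: assign each target item the label with the closest representative.
for x in target_features:
    distances = {label: np.linalg.norm(x - rep) for label, rep in representatives.items()}
    print(min(distances, key=distances.get))
```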
  • Patent number: 11462212
    Abstract: A document identification device that improves class identification precision of multi-stream documents is provided. The document identification device includes: a primary stream expression generation unit that generates a primary stream expression, which is a fixed-length vector of a word sequence corresponding to each speaker's speech recorded in a setting including a plurality of speakers, for each speaker; a primary multi-stream expression generation unit that generates a primary multi-stream expression obtained by integrating the primary stream expression; a secondary stream expression generation unit that generates a secondary stream expression, which is a fixed-length vector generated based on the word sequence of each speaker and the primary multi-stream expression, for each speaker; and a secondary multi-stream expression generation unit that generates a secondary multi-stream expression obtained by integrating the secondary stream expression.
    Type: Grant
    Filed: May 10, 2018
    Date of Patent: October 4, 2022
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo Masumura, Hirokazu Masataki
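The entry above encodes each speaker's word sequence twice: a primary per-speaker encoding, an integration across speakers, a secondary per-speaker encoding conditioned on that integrated expression, and a final integration used for classification. The PyTorch sketch below follows that two-stage structure with assumed GRU encoders and mean pooling as the integration; those choices are illustrative, not the claimed ones.

```python
import torch
import torch.nn as nn

class MultiStreamDocumentClassifier(nn.Module):
    """Illustrative two-stage multi-stream encoder: per-speaker word sequences are
    encoded independently (primary), merged, re-encoded with the merged information
    (secondary), merged again, and classified."""

    def __init__(self, vocab=1000, hidden=64, num_classes=5):
        super().__init__()
        self.emb = nn.Embedding(vocab, hidden)
        self.primary = nn.GRU(hidden, hidden, batch_first=True)
        self.secondary = nn.GRU(2 * hidden, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_classes)

    def encode(self, rnn, seq):
        _, h = rnn(seq)
        return h[-1]                                      # fixed-length vector

    def forward(self, speaker_word_ids):                  # list of (1, len) tensors
        primary = [self.encode(self.primary, self.emb(w)) for w in speaker_word_ids]
        primary_multi = torch.stack(primary).mean(dim=0)  # integrate primary streams
        secondary = [
            self.encode(self.secondary,
                        torch.cat([self.emb(w),
                                   primary_multi.unsqueeze(1).expand(-1, w.size(1), -1)],
                                  dim=-1))
            for w in speaker_word_ids]
        secondary_multi = torch.stack(secondary).mean(dim=0)  # integrate secondary streams
        return self.classifier(secondary_multi)               # class logits

speakers = [torch.randint(0, 1000, (1, 25)), torch.randint(0, 1000, (1, 30))]
print(MultiStreamDocumentClassifier()(speakers).shape)        # torch.Size([1, 5])
```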
  • Publication number: 20220277767
    Abstract: A voice/non-voice determination device robust with respect to an acoustic signal in a high-noise environment is provided. The voice/non-voice determination device includes an acoustic scene classification unit including a first model which receives input of an acoustic signal and outputs acoustic scene information which is information regarding a scene where the acoustic signal is collected, a speech enhancement unit including a second model which receives input of the acoustic signal and outputs speech enhancement information which is information regarding the acoustic signal after enhancement, and a voice/non-voice determination unit including a third model which receives input of the acoustic signal, the acoustic scene information and the speech enhancement information and outputs a voice/non-voice label which is information regarding a label of either a speech section or a non-speech section.
    Type: Application
    Filed: July 25, 2019
    Publication date: September 1, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Takanobu OBA, Kiyoaki MATSUI
  • Publication number: 20220270637
    Abstract: Provided is an utterance section detection device capable of detecting an utterance section with high accuracy on the basis of whether or not an end of a speech section is an end of utterance.
    Type: Application
    Filed: July 24, 2019
    Publication date: August 25, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Takanobu OBA, Kiyoaki MATSUI
  • Patent number: 11380301
    Abstract: A learning apparatus comprises a learning part that learns an error correction model by a set of a speech recognition result candidate and a correct text of speech recognition for given audio data, wherein the speech recognition result candidate includes a speech recognition result candidate which is different from the correct text, and the error correction model is a model that receives a word sequence of the speech recognition result candidate as input and outputs an error correction score indicating likelihood of the word sequence of the speech recognition result candidate in consideration of a speech recognition error.
    Type: Grant
    Filed: February 18, 2019
    Date of Patent: July 5, 2022
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Tomohiro Tanaka, Ryo Masumura
  • Publication number: 20220139374
    Abstract: Provided is a speech recognition device capable of implementing end-to-end speech recognition considering a context.
    Type: Application
    Filed: January 27, 2020
    Publication date: May 5, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Tomohiro TANAKA, Takanobu OBA
  • Publication number: 20220108217
    Abstract: A model capable of estimating a label with high accuracy is learned even when training data involving a small number of raters per data item is used. Learning processing is performed in which a plurality of data items and label expectation values that are indicators representing degrees of correctness of individual labels on the data items are used in pairs as training data, and a model that estimates a label on an input data item is obtained.
    Type: Application
    Filed: January 29, 2020
    Publication date: April 7, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Hosana KAMIYAMA, Satoshi KOBASHIKAWA, Atsushi ANDO, Ryo MASUMURA
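The entry above trains against label expectation values, i.e. per-label degrees of correctness rather than a single hard label per item. The short PyTorch sketch below shows the standard soft-target cross-entropy that such training reduces to; the toy network and the example expectation values are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Illustrative training step with label expectation values: each data item carries a
# probability-like degree of correctness for every label (e.g. the fraction of raters
# who chose it), and the model is trained against that soft target.
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

features = torch.randn(8, 20)                        # data items
# Label expectation values from few raters, e.g. 2 of 3 raters chose the first label.
label_expectations = torch.tensor([[2 / 3, 1 / 3, 0.0]] * 8)

log_probs = torch.log_softmax(model(features), dim=-1)
loss = -(label_expectations * log_probs).sum(dim=-1).mean()   # soft-target cross-entropy
optimizer.zero_grad()
loss.backward()
optimizer.step()
```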
  • Publication number: 20220093079
    Abstract: Without dividing speech into a unit such as a word or a character, text corresponding to the speech is labeled. A speech distributed representation sequence converting unit 11 converts an acoustic feature sequence into a speech distributed representation. A symbol distributed representation converting unit 12 converts each symbol included in the symbol sequence corresponding to the acoustic feature sequence into a symbol distributed representation. A label estimation unit 13 estimates a label corresponding to the symbol from the fixed-length vector of the symbol generated using the speech distributed representation, the symbol distributed representation, and fixed-length vectors of previous and next symbols.
    Type: Application
    Filed: January 10, 2020
    Publication date: March 24, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Tomohiro TANAKA, Ryo MASUMURA, Takanobu OBA
  • Publication number: 20220036912
    Abstract: A tag estimation device capable of estimating, for an utterance made among several persons, a tag representing a result of analyzing the utterance is provided. The tag estimation device includes an utterance sequence information vector generation unit that adds a t-th utterance word feature vector and a t-th speaker vector to a (t−1)-th utterance sequence information vector ut−1, which includes an utterance word feature vector that precedes the t-th utterance word feature vector and a speaker vector that precedes the t-th speaker vector, to generate a t-th utterance sequence information vector ut, where t is a natural number, and a tagging unit that determines a tag lt that represents a result of analyzing a t-th utterance from a model parameter set in advance and the t-th utterance sequence information vector ut.
    Type: Application
    Filed: September 13, 2019
    Publication date: February 3, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Tomohiro TANAKA
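The last entry updates an utterance sequence information vector ut from ut−1 using the t-th utterance word feature vector and speaker vector, then predicts a tag lt from ut. A minimal PyTorch sketch of that recurrence is shown below; the GRU cell, the speaker embedding, and the linear tagger are assumed stand-ins for the model parameters mentioned in the abstract.

```python
import torch
import torch.nn as nn

class UtteranceTagger(nn.Module):
    """Illustrative sketch: u_t is updated from u_{t-1} with the t-th utterance word
    feature vector and speaker vector, and a tag l_t is predicted from u_t."""

    def __init__(self, word_feat_dim=32, num_speakers=4, speaker_dim=8,
                 state_dim=64, num_tags=5):
        super().__init__()
        self.speaker_emb = nn.Embedding(num_speakers, speaker_dim)
        self.update = nn.GRUCell(word_feat_dim + speaker_dim, state_dim)
        self.tagger = nn.Linear(state_dim, num_tags)

    def forward(self, word_feats, speakers):           # (T, word_feat_dim), (T,)
        u = torch.zeros(1, self.update.hidden_size)    # u_0
        tags = []
        for t in range(word_feats.size(0)):
            x = torch.cat([word_feats[t], self.speaker_emb(speakers[t])]).unsqueeze(0)
            u = self.update(x, u)                      # u_t from u_{t-1} and the t-th inputs
            tags.append(self.tagger(u))                # logits for tag l_t
        return torch.stack(tags, dim=1)                # (1, T, num_tags)

word_feats = torch.randn(6, 32)                        # six utterances in a conversation
speakers = torch.tensor([0, 1, 0, 2, 1, 0])
print(UtteranceTagger()(word_feats, speakers).shape)   # torch.Size([1, 6, 5])
```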