Patents by Inventor Ryo MASUMURA

Ryo MASUMURA has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11894017
    Abstract: A voice/non-voice determination device robust with respect to an acoustic signal in a high-noise environment is provided. The voice/non-voice determination device includes an acoustic scene classification unit including a first model which receives input of an acoustic signal and outputs acoustic scene information which is information regarding a scene where the acoustic signal is collected, a speech enhancement unit including a second model which receives input of the acoustic signal and outputs speech enhancement information which is information regarding the acoustic signal after enhancement, and a voice/non-voice determination unit including a third model which receives input of the acoustic signal, the acoustic scene information and the speech enhancement information and outputs a voice/non-voice label which is information regarding a label of either a speech section or a non-speech section.
    Type: Grant
    Filed: July 25, 2019
    Date of Patent: February 6, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo Masumura, Takanobu Oba, Kiyoaki Matsui
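The entry above describes a pipeline of three models: a scene classifier, a speech enhancer, and a voice/non-voice classifier that consumes both of their outputs together with the raw signal. The following is a minimal sketch, assuming PyTorch, of how such a pipeline could be wired; the module types, feature sizes, and names are illustrative assumptions, not the patented implementation.

```python
import torch
import torch.nn as nn

class VoiceActivityDetector(nn.Module):
    """Illustrative three-model pipeline: scene classifier, speech enhancer,
    and a voice/non-voice classifier that consumes all three signals."""

    def __init__(self, feat_dim=40, hidden=128, num_scenes=10):
        super().__init__()
        # First model: acoustic scene information from the input features.
        self.scene_classifier = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, num_scenes))
        # Second model: speech-enhancement information (here, a mask per feature bin).
        self.enhancer = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, feat_dim), nn.Sigmoid())
        # Third model: speech/non-speech label from the signal, scene, and enhancement info.
        self.vad = nn.Sequential(
            nn.Linear(feat_dim + num_scenes + feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, 2))

    def forward(self, acoustic_feats):           # (batch, frames, feat_dim)
        scene = self.scene_classifier(acoustic_feats).softmax(dim=-1)
        enhanced = self.enhancer(acoustic_feats) * acoustic_feats
        combined = torch.cat([acoustic_feats, scene, enhanced], dim=-1)
        return self.vad(combined)                # per-frame speech/non-speech logits

feats = torch.randn(2, 100, 40)                  # dummy batch of feature frames
logits = VoiceActivityDetector()(feats)
print(logits.shape)                              # torch.Size([2, 100, 2])
```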
  • Patent number: 11887620
    Abstract: The present invention improves the accuracy of language prediction. A history speech meta-information understanding unit 11 obtains a history speech meta-information vector from a word string of a preceding speech using a meta-information understanding device. A history speech embedding unit 12 converts the word string of the preceding speech and a speaker label into a history speech embedding vector. A speech unit combination vector construction unit 13 obtains a speech unit combination vector by combining the history speech meta-information vector and the history speech embedding vector. A speech sequence embedding vector calculation unit 14 converts a plurality of speech unit combination vectors obtained for the past speech sequences to a speech sequence embedding vector. A language model score calculation unit 15 calculates a language model score of a current speech from a word string of the current speech, a speaker label, and a speech sequence embedding vector.
    Type: Grant
    Filed: January 27, 2020
    Date of Patent: January 30, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo Masumura, Tomohiro Tanaka, Takanobu Oba
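The entry above combines, for each past speech, a meta-information vector with an utterance embedding, pools the resulting sequence into one context vector, and uses it when scoring the current speech. Below is a minimal sketch under that reading, assuming PyTorch; the recurrent encoders, dimensions, and speaker handling are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class ContextualLM(nn.Module):
    """Illustrative context-aware language model: each past utterance is summarized by
    concatenating a meta-information vector with an utterance embedding, the sequence of
    summaries is pooled into a context vector, and that vector conditions the LM score."""

    def __init__(self, vocab=1000, emb=64, meta_dim=16, hidden=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab, emb)
        self.speaker_emb = nn.Embedding(2, emb)
        self.utt_encoder = nn.GRU(emb, hidden, batch_first=True)
        self.context_encoder = nn.GRU(hidden + meta_dim, hidden, batch_first=True)
        self.lm = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(2 * hidden, vocab)

    def embed_utterance(self, words, speaker):    # words: (1, len), speaker: (1,)
        emb = self.word_emb(words) + self.speaker_emb(speaker).unsqueeze(1)
        _, h = self.utt_encoder(emb)
        return h[-1]                              # history speech embedding vector

    def forward(self, history, current_words, current_speaker):
        # history: list of (words, speaker, meta_vector) triples for the past speeches
        combos = [torch.cat([self.embed_utterance(w, s), m], dim=-1)
                  for w, s, m in history]         # speech unit combination vectors
        _, ctx = self.context_encoder(torch.stack(combos, dim=1))
        cur = self.word_emb(current_words) + self.speaker_emb(current_speaker).unsqueeze(1)
        states, _ = self.lm(cur)
        ctx = ctx[-1].unsqueeze(1).expand(-1, states.size(1), -1)
        return self.out(torch.cat([states, ctx], dim=-1))   # next-word logits
```

A real system would also need tokenization, the meta-information understanding device, and training code; the sketch only shows how the meta-information vector and the history embedding can be combined before pooling.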
  • Publication number: 20230245675
    Abstract: An environment in which an acoustic signal is collected is estimated with high accuracy without inputting auxiliary information. Input circuitry (21) receives a target acoustic signal, which is the estimation target. Estimation circuitry (22) applies a correlation between acoustic signals and explanatory texts describing them to estimate the environment in which the target acoustic signal is collected; the estimated environment is the explanatory text for the target acoustic signal obtained through the correlation. The correlation is trained so as to minimize the difference between the explanatory text assigned to an acoustic signal and the explanatory text obtained from that acoustic signal by the correlation.
    Type: Application
    Filed: May 11, 2020
    Publication date: August 3, 2023
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Yuma KOIZUMI, Ryo MASUMURA, Shoichiro SAITO
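The entry above learns a correlation that maps an acoustic signal to an explanatory text. One common way to realize such a correlation is an encoder-decoder trained with a sequence cross-entropy against the assigned text; the sketch below, assuming PyTorch, shows that generic formulation, not the specific architecture of the application.

```python
import torch
import torch.nn as nn

class AudioCaptioner(nn.Module):
    """Illustrative correlation between an acoustic signal and an explanatory text:
    an audio encoder summarizes the signal and a text decoder is trained so that the
    text it generates matches the explanatory text assigned to the signal."""

    def __init__(self, feat_dim=64, vocab=500, hidden=128):
        super().__init__()
        self.audio_encoder = nn.GRU(feat_dim, hidden, batch_first=True)
        self.word_emb = nn.Embedding(vocab, hidden)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, audio_feats, text_in):
        _, h = self.audio_encoder(audio_feats)     # summarize the acoustic signal
        states, _ = self.decoder(self.word_emb(text_in), h)
        return self.out(states)                    # logits over the next word

model = AudioCaptioner()
audio = torch.randn(4, 200, 64)                    # dummy acoustic features
text_in = torch.randint(0, 500, (4, 12))           # assigned explanatory text (shifted)
text_target = torch.randint(0, 500, (4, 12))
logits = model(audio, text_in)
# Training minimizes the difference between the assigned and the generated text.
loss = nn.functional.cross_entropy(logits.reshape(-1, 500), text_target.reshape(-1))
loss.backward()
```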
  • Publication number: 20230206118
    Abstract: Provided is a model learning technology to learn a model in consideration of a difference in label assignment accuracy between experts and non-experts.
    Type: Application
    Filed: March 19, 2020
    Publication date: June 29, 2023
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Hosana KAMIYAMA, Yuki KITAGISHI, Atsushi ANDO, Ryo MASUMURA, Takeshi MORI, Satoshi KOBASHIKAWA
  • Publication number: 20230202030
    Abstract: Provided is a work system including: an object imaging unit configured to acquire an object image by photographing an object from a work direction; a work position acquisition unit configured to acquire a work position based on an existence region of the object obtained from a machine learning model; and a work unit configured to execute work on the object based on a work position obtained by inputting the object image to the work position acquisition unit.
    Type: Application
    Filed: February 28, 2023
    Publication date: June 29, 2023
    Applicant: Kabushiki Kaisha Yaskawa Denki
    Inventors: Ryo MASUMURA, Wataru WATANABE
  • Publication number: 20230134186
    Abstract: Provided is a machine learning data generation device including: at least one processor; and at least one memory device that stores a plurality of instructions which, when executed by the at least one processor, cause the at least one processor to execute: acquiring, in association with a predetermined label, actual time series information; executing physical simulation to generate a plurality of pieces of virtual time series information; identifying parameter values based on the plurality of pieces of virtual time series information and the actual time series information, and associating the identified parameter values with the label; generating a new parameter value and the label based on the identified parameter values; generating virtual time series information corresponding to a new internal state by executing physical simulation through use of the new parameter value; and generating new machine learning data.
    Type: Application
    Filed: December 26, 2022
    Publication date: May 4, 2023
    Inventors: Ryohei SUZUKI, Tsuyoshi Yokoya, Ryo Masumura, Hiroki Tachikake
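The entry above generates labeled training data by identifying simulation parameters that reproduce observed time series and then simulating new, nearby parameter values. The toy NumPy sketch below illustrates that loop with an assumed damped-oscillation simulator and a brute-force identification step; it is not the device's actual simulation or identification method.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(params, steps=100):
    """Toy physical simulation: a damped oscillation whose damping and frequency
    stand in for the internal-state parameters."""
    damping, freq = params
    t = np.arange(steps)
    return np.exp(-damping * t) * np.sin(freq * t)

def identify_parameters(actual, candidates):
    """Pick the candidate parameters whose virtual time series best matches the
    actual time series (a stand-in for the identification step)."""
    errors = [np.mean((simulate(p) - actual) ** 2) for p in candidates]
    return candidates[int(np.argmin(errors))]

# 1) Actual time series observed in association with a predetermined label.
label = "normal"
actual = simulate((0.03, 0.5)) + rng.normal(0, 0.02, 100)

# 2) Virtual time series for many candidate parameter values.
candidates = [(d, f) for d in np.linspace(0.01, 0.1, 10) for f in np.linspace(0.1, 1.0, 10)]

# 3) Identify parameter values that reproduce the actual data; associate them with the label.
identified = identify_parameters(actual, candidates)

# 4) Generate new parameter values near the identified ones, simulate them,
#    and emit the results as new labeled machine learning data.
new_data = []
for _ in range(5):
    new_params = tuple(np.array(identified) + rng.normal(0, 0.005, 2))
    new_data.append((simulate(new_params), label))
print(identified, len(new_data))
```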
  • Publication number: 20230108419
    Abstract: A learning system includes real environment image acquisition circuitry, virtual environment image generation circuitry, and GAN learning circuitry. The real environment image acquisition circuitry is configured to acquire a real environment image showing a real environment in which real objects and a real background are provided. The virtual environment image generation circuitry is configured to generate a virtual environment image showing a virtual environment in which virtual objects and a virtual background are provided; at least one of the virtual background and the virtual objects has a color or colors different from the colors of the real background and the real objects. The GAN learning circuitry is configured to perform GAN (Generative Adversarial Networks) learning through which the virtual environment image is made more similar to the real environment image, based on the real environment image and the virtual environment image.
    Type: Application
    Filed: October 5, 2022
    Publication date: April 6, 2023
    Applicant: KABUSHIKI KAISHA YASKAWA DENKI
    Inventors: Makoto MORI, Ryo MASUMURA
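The entry above uses GAN learning to pull rendered virtual-environment images toward the appearance of real-environment images. The sketch below shows one generic adversarial training step in PyTorch, with tiny assumed convolutional networks and a fixed 32x32 image size; it illustrates the GAN principle rather than the system's actual networks.

```python
import torch
import torch.nn as nn

# Illustrative GAN step: a generator translates a virtual environment image toward the
# real-environment appearance, and a discriminator learns to tell the two apart.
generator = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1), nn.Tanh())
discriminator = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(16 * 16 * 16, 1))       # assumes 32x32 inputs

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real_img = torch.rand(8, 3, 32, 32)      # acquired real environment images
virtual_img = torch.rand(8, 3, 32, 32)   # generated virtual environment images

# Discriminator step: real images -> 1, translated virtual images -> 0.
fake = generator(virtual_img).detach()
d_loss = bce(discriminator(real_img), torch.ones(8, 1)) + \
         bce(discriminator(fake), torch.zeros(8, 1))
d_opt.zero_grad()
d_loss.backward()
d_opt.step()

# Generator step: make translated virtual images look real to the discriminator.
g_loss = bce(discriminator(generator(virtual_img)), torch.ones(8, 1))
g_opt.zero_grad()
g_loss.backward()
g_opt.step()
```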
  • Publication number: 20230072015
    Abstract: Information corresponding to a t-th word string Yt of a second text, which is a conversion result of a t-th word string Xt of a first text, is estimated on the basis of a model parameter θ, by using, as inputs, the t-th word string Xt of the first text and a sequence Ŷ1, . . . , Ŷt−1 of the first to (t−1)-th word strings of the second text, which is a conversion result of a sequence X1, . . . , Xt−1 of the first to (t−1)-th word strings of the first text. Here, t is an integer of two or greater.
    Type: Application
    Filed: February 20, 2020
    Publication date: March 9, 2023
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Mana IHORI, Ryo MASUMURA
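The entry above converts a document one word string at a time, conditioning the conversion of the t-th string on the already-converted strings Ŷ1, . . . , Ŷt−1. The PyTorch sketch below shows one way such conditioning could be realized, with assumed GRU encoders and a single concatenated context sequence; it is an illustration of the interface, not the claimed model.

```python
import torch
import torch.nn as nn

class IncrementalConverter(nn.Module):
    """Illustrative sketch: the t-th word string of the first text is converted while
    conditioning on the word strings of the second text converted so far."""

    def __init__(self, vocab=1000, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, hidden)
        self.context_enc = nn.GRU(hidden, hidden, batch_first=True)  # encodes Y^_1..Y^_{t-1}
        self.source_enc = nn.GRU(hidden, hidden, batch_first=True)   # encodes X_t
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, prev_outputs, x_t, y_in):
        # prev_outputs: the previously converted word strings as one token sequence
        _, ctx = self.context_enc(self.emb(prev_outputs))
        _, src = self.source_enc(self.emb(x_t), ctx)    # source encoding seeded by context
        states, _ = self.decoder(self.emb(y_in), src)
        return self.out(states)                         # logits for Y_t (t >= 2)

model = IncrementalConverter()
prev = torch.randint(0, 1000, (1, 30))    # Y^_1, ..., Y^_{t-1} concatenated
x_t = torch.randint(0, 1000, (1, 12))     # t-th word string of the first text
y_in = torch.randint(0, 1000, (1, 12))    # shifted t-th word string of the second text
print(model(prev, x_t, y_in).shape)       # torch.Size([1, 12, 1000])
```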
  • Patent number: 11568761
    Abstract: The present invention provides a pronunciation error detection apparatus capable of following a text without the need for a correct sentence even when erroneous recognition such as a reading error occurs.
    Type: Grant
    Filed: September 13, 2018
    Date of Patent: January 31, 2023
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Satoshi Kobashikawa, Ryo Masumura, Hosana Kamiyama, Yusuke Ijima, Yushi Aono
  • Patent number: 11556783
    Abstract: There is provided a technique for transforming a confusion network into a representation that can be used as an input for machine learning. The apparatus includes a confusion network distributed representation sequence generating part that generates a confusion network distributed representation sequence, which is a vector sequence, from an arc word set sequence and an arc weight set sequence constituting the confusion network.
    Type: Grant
    Filed: August 21, 2018
    Date of Patent: January 17, 2023
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo Masumura, Hirokazu Masataki
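The entry above turns a confusion network, given as an arc word set sequence and an arc weight set sequence, into a vector sequence. A simple and common way to do this is a weight-weighted sum of word embeddings per confusion set; the NumPy sketch below uses that assumption for illustration and is not necessarily the generating part claimed here.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"a": 0, "the": 1, "cat": 2, "hat": 3, "<eps>": 4}
emb = rng.normal(size=(len(vocab), 8))           # toy word embedding table

# A confusion network as an arc word set sequence and an arc weight set sequence:
# each position holds competing words with posterior-like weights.
arc_words = [["a", "the"], ["cat", "hat", "<eps>"]]
arc_weights = [[0.7, 0.3], [0.5, 0.4, 0.1]]

def confusion_network_distributed_representation(words_seq, weights_seq):
    """One vector per confusion set: the weight-weighted sum of word embeddings,
    giving a fixed-dimension vector sequence for the whole network."""
    vectors = []
    for words, weights in zip(words_seq, weights_seq):
        v = sum(w * emb[vocab[word]] for word, w in zip(words, weights))
        vectors.append(v)
    return np.stack(vectors)                      # (positions, embedding_dim)

print(confusion_network_distributed_representation(arc_words, arc_weights).shape)  # (2, 8)
```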
  • Publication number: 20230004870
    Abstract: Provided is a machine learning model determination system including: at least one server and at least one client terminal; an evaluation information database which stores evaluation information being information on an evaluation of machine learning; an evaluation information update module which updates the evaluation information based on a specific value of a parameter and an evaluation of the machine learning through use of specific teaching data; a teaching data input module; a verification data input module; a parameter determination module which determines the specific value of the parameter based on the evaluation information; and a machine learning engine which includes a learning module which executes learning for a machine learning model through use of the specific teaching data, and an evaluation module which evaluates a result of the machine learning through use of the specific verification data.
    Type: Application
    Filed: September 9, 2022
    Publication date: January 5, 2023
    Inventors: Masaru ADACHI, Tsuyoshi YOKOYA, Ryo MASUMURA
  • Publication number: 20220406093
    Abstract: A facial expression label is assigned to face image data of a person with high accuracy. A facial expression data set storage unit (110) stores a facial expression data set in which the facial expression label is assigned to face images in which people belonging to various groups show various facial expressions. A facial expression sampling unit (11) acquires a face image in which a person belonging to the desired group shows a desired facial expression. A representative feature quantity calculation unit (12) determines a representative feature quantity for each facial expression label from the face images of the desired group. A target data extraction unit (13) extracts target data from the facial expression data set. A target feature quantity calculation unit (14) calculates a target feature quantity from the target data.
    Type: Application
    Filed: November 19, 2019
    Publication date: December 22, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Akihiko TAKASHIMA, Ryo MASUMURA
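The entry above computes a representative feature quantity per facial expression label for a desired group and a feature quantity for each extracted target item. The NumPy sketch below illustrates one plausible reading: the representative is the per-label mean feature, and the final nearest-representative labeling step is an assumption added only to make the example end-to-end; the abstract does not state how the two quantities are combined.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feature vectors for face images of the desired group, grouped by expression label.
features_by_label = {
    "happy": rng.normal(0.0, 1.0, size=(20, 16)),
    "sad": rng.normal(2.0, 1.0, size=(20, 16)),
}

# Representative feature quantity per facial expression label: here, the mean feature.
representatives = {label: feats.mean(axis=0) for label, feats in features_by_label.items()}

# Target feature quantities calculated from target data extracted from the data set.
target_features = rng.normal(0.0, 1.0, size=(5, 16))

# Assumed final step: assign each target item the label with the closest representative.
for x in target_features:
    distances = {label: np.linalg.norm(x - rep) for label, rep in representatives.items()}
    print(min(distances, key=distances.get))
```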
  • Patent number: 11462212
    Abstract: A document identification device that improves class identification precision of multi-stream documents is provided. The document identification device includes: a primary stream expression generation unit that generates a primary stream expression, which is a fixed-length vector of a word sequence corresponding to each speaker's speech recorded in a setting including a plurality of speakers, for each speaker; a primary multi-stream expression generation unit that generates a primary multi-stream expression obtained by integrating the primary stream expression; a secondary stream expression generation unit that generates a secondary stream expression, which is a fixed-length vector generated based on the word sequence of each speaker and the primary multi-stream expression, for each speaker; and a secondary multi-stream expression generation unit that generates a secondary multi-stream expression obtained by integrating the secondary stream expression.
    Type: Grant
    Filed: May 10, 2018
    Date of Patent: October 4, 2022
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo Masumura, Hirokazu Masataki
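The entry above encodes each speaker's word sequence twice: a primary per-speaker encoding, an integration across speakers, a secondary per-speaker encoding conditioned on that integrated expression, and a final integration used for classification. The PyTorch sketch below follows that two-stage structure with assumed GRU encoders and mean pooling as the integration; those choices are illustrative, not the claimed ones.

```python
import torch
import torch.nn as nn

class MultiStreamDocumentClassifier(nn.Module):
    """Illustrative two-stage multi-stream encoder: per-speaker word sequences are
    encoded independently (primary), merged, re-encoded with the merged information
    (secondary), merged again, and classified."""

    def __init__(self, vocab=1000, hidden=64, num_classes=5):
        super().__init__()
        self.emb = nn.Embedding(vocab, hidden)
        self.primary = nn.GRU(hidden, hidden, batch_first=True)
        self.secondary = nn.GRU(2 * hidden, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_classes)

    def encode(self, rnn, seq):
        _, h = rnn(seq)
        return h[-1]                                      # fixed-length vector

    def forward(self, speaker_word_ids):                  # list of (1, len) tensors
        primary = [self.encode(self.primary, self.emb(w)) for w in speaker_word_ids]
        primary_multi = torch.stack(primary).mean(dim=0)  # integrate primary streams
        secondary = [
            self.encode(self.secondary,
                        torch.cat([self.emb(w),
                                   primary_multi.unsqueeze(1).expand(-1, w.size(1), -1)],
                                  dim=-1))
            for w in speaker_word_ids]
        secondary_multi = torch.stack(secondary).mean(dim=0)  # integrate secondary streams
        return self.classifier(secondary_multi)               # class logits

speakers = [torch.randint(0, 1000, (1, 25)), torch.randint(0, 1000, (1, 30))]
print(MultiStreamDocumentClassifier()(speakers).shape)        # torch.Size([1, 5])
```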
  • Publication number: 20220277767
    Abstract: A voice/non-voice determination device robust with respect to an acoustic signal in a high-noise environment is provided. The voice/non-voice determination device includes an acoustic scene classification unit including a first model which receives input of an acoustic signal and outputs acoustic scene information which is information regarding a scene where the acoustic signal is collected, a speech enhancement unit including a second model which receives input of the acoustic signal and outputs speech enhancement information which is information regarding the acoustic signal after enhancement, and a voice/non-voice determination unit including a third model which receives input of the acoustic signal, the acoustic scene information and the speech enhancement information and outputs a voice/non-voice label which is information regarding a label of either a speech section or a non-speech section.
    Type: Application
    Filed: July 25, 2019
    Publication date: September 1, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Takanobu OBA, Kiyoaki MATSUI
  • Publication number: 20220270637
    Abstract: Provided is an utterance section detection device capable of detecting an utterance section with high accuracy on the basis of whether or not an end of a speech section is an end of utterance.
    Type: Application
    Filed: July 24, 2019
    Publication date: August 25, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Takanobu OBA, Kiyoaki MATSUI
  • Patent number: 11380301
    Abstract: A learning apparatus comprises a learning part that learns an error correction model by a set of a speech recognition result candidate and a correct text of speech recognition for given audio data, wherein the speech recognition result candidate includes a speech recognition result candidate which is different from the correct text, and the error correction model is a model that receives a word sequence of the speech recognition result candidate as input and outputs an error correction score indicating likelihood of the word sequence of the speech recognition result candidate in consideration of a speech recognition error.
    Type: Grant
    Filed: February 18, 2019
    Date of Patent: July 5, 2022
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Tomohiro Tanaka, Ryo Masumura
  • Publication number: 20220139374
    Abstract: Provided is a speech recognition device capable of implementing end-to-end speech recognition considering a context.
    Type: Application
    Filed: January 27, 2020
    Publication date: May 5, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Tomohiro TANAKA, Takanobu OBA
  • Publication number: 20220108217
    Abstract: A model capable of estimating a label with high accuracy is learned even when training data involving a small number of raters per data item is used. Learning processing is performed in which a plurality of data items and label expectation values that are indicators representing degrees of correctness of individual labels on the data items are used in pairs as training data, and a model that estimates a label on an input data item is obtained.
    Type: Application
    Filed: January 29, 2020
    Publication date: April 7, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Hosana KAMIYAMA, Satoshi KOBASHIKAWA, Atsushi ANDO, Ryo MASUMURA
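The entry above trains against label expectation values, i.e. per-label degrees of correctness rather than a single hard label per item. The short PyTorch sketch below shows the standard soft-target cross-entropy that such training reduces to; the toy network and the example expectation values are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Illustrative training step with label expectation values: each data item carries a
# probability-like degree of correctness for every label (e.g. the fraction of raters
# who chose it), and the model is trained against that soft target.
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

features = torch.randn(8, 20)                        # data items
# Label expectation values from few raters, e.g. 2 of 3 raters chose the first label.
label_expectations = torch.tensor([[2 / 3, 1 / 3, 0.0]] * 8)

log_probs = torch.log_softmax(model(features), dim=-1)
loss = -(label_expectations * log_probs).sum(dim=-1).mean()   # soft-target cross-entropy
optimizer.zero_grad()
loss.backward()
optimizer.step()
```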
  • Publication number: 20220093079
    Abstract: Without dividing speech into a unit such as a word or a character, text corresponding to the speech is labeled. A speech distributed representation sequence converting unit 11 converts an acoustic feature sequence into a speech distributed representation. A symbol distributed representation converting unit 12 converts each symbol included in the symbol sequence corresponding to the acoustic feature sequence into a symbol distributed representation. A label estimation unit 13 estimates a label corresponding to the symbol from the fixed-length vector of the symbol generated using the speech distributed representation, the symbol distributed representation, and fixed-length vectors of previous and next symbols.
    Type: Application
    Filed: January 10, 2020
    Publication date: March 24, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Tomohiro TANAKA, Ryo MASUMURA, Takanobu OBA
  • Publication number: 20220036912
    Abstract: A tag estimation device capable of estimating, for an utterance made among several persons, a tag representing a result of analyzing the utterance is provided. The tag estimation device includes an utterance sequence information vector generation unit that adds a t-th utterance word feature vector and a t-th speaker vector to a (t−1)-th utterance sequence information vector ut−1, which includes an utterance word feature vector that precedes the t-th utterance word feature vector and a speaker vector that precedes the t-th speaker vector, to generate a t-th utterance sequence information vector ut, where t is a natural number, and a tagging unit that determines a tag lt that represents a result of analyzing a t-th utterance from a model parameter set in advance and the t-th utterance sequence information vector ut.
    Type: Application
    Filed: September 13, 2019
    Publication date: February 3, 2022
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Ryo MASUMURA, Tomohiro TANAKA
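The last entry updates an utterance sequence information vector ut from ut−1 using the t-th utterance word feature vector and speaker vector, then predicts a tag lt from ut. A minimal PyTorch sketch of that recurrence is shown below; the GRU cell, the speaker embedding, and the linear tagger are assumed stand-ins for the model parameters mentioned in the abstract.

```python
import torch
import torch.nn as nn

class UtteranceTagger(nn.Module):
    """Illustrative sketch: u_t is updated from u_{t-1} with the t-th utterance word
    feature vector and speaker vector, and a tag l_t is predicted from u_t."""

    def __init__(self, word_feat_dim=32, num_speakers=4, speaker_dim=8,
                 state_dim=64, num_tags=5):
        super().__init__()
        self.speaker_emb = nn.Embedding(num_speakers, speaker_dim)
        self.update = nn.GRUCell(word_feat_dim + speaker_dim, state_dim)
        self.tagger = nn.Linear(state_dim, num_tags)

    def forward(self, word_feats, speakers):           # (T, word_feat_dim), (T,)
        u = torch.zeros(1, self.update.hidden_size)    # u_0
        tags = []
        for t in range(word_feats.size(0)):
            x = torch.cat([word_feats[t], self.speaker_emb(speakers[t])]).unsqueeze(0)
            u = self.update(x, u)                      # u_t from u_{t-1} and the t-th inputs
            tags.append(self.tagger(u))                # logits for tag l_t
        return torch.stack(tags, dim=1)                # (1, T, num_tags)

word_feats = torch.randn(6, 32)                        # six utterances in a conversation
speakers = torch.tensor([0, 1, 0, 2, 1, 0])
print(UtteranceTagger()(word_feats, speakers).shape)   # torch.Size([1, 6, 5])
```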