Patents by Inventor Ryo MASUMURA
Ryo MASUMURA has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240424682
Abstract: A robot control system comprising circuitry configured to acquire a designated area image showing a designated area, and execute image analysis on the designated area image to detect a current state of the designated area as a current area state. The circuitry is further configured to generate, based on the current area state, a state of the designated area which simulates one or more objects including an additional object as having been placed in the designated area, as a predicted area state. The circuitry is further configured to generate object information on the additional object which is simulated in the designated area, based on the predicted area state. The circuitry is further configured to control a robot so as to physically place the additional object in the designated area in accordance with the object information.
Type: Application
Filed: September 3, 2024
Publication date: December 26, 2024
Inventors: Ryo MASUMURA, Hiroki TACHIKAKE, Keita SATSUMA, Keisuke NAKAMURA, Hisashi IDEGUCHI
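The placement pipeline this abstract describes (detect current area state, simulate a predicted state with the additional object, derive object information, then command the robot) can be illustrated with a toy occupancy-grid sketch. This is not the patented implementation; the grid abstraction, raster-order search, and all names here are hypothetical stand-ins.

```python
import numpy as np

def detect_area_state(area_image):
    """Image-analysis stand-in: treat nonzero pixels as occupied cells."""
    return area_image > 0

def predict_area_state(current_state, obj_h, obj_w):
    """Simulate placing one additional obj_h x obj_w object in the first
    free slot (raster order); returns the predicted state and the
    placement position, i.e. the object information for the robot."""
    h, w = current_state.shape
    for r in range(h - obj_h + 1):
        for c in range(w - obj_w + 1):
            if not current_state[r:r + obj_h, c:c + obj_w].any():
                predicted = current_state.copy()
                predicted[r:r + obj_h, c:c + obj_w] = True
                return predicted, (r, c)
    return current_state, None  # no room for the additional object

area = np.zeros((4, 4), dtype=int)
area[0, :2] = 1                      # an object already occupies the top-left
state = detect_area_state(area)
predicted, placement = predict_area_state(state, 2, 2)
print(placement)                     # where the robot would place the object
```

In the actual system the predicted area state would come from richer simulation and the placement command would drive real robot control; here it is just a grid coordinate.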
-
Patent number: 12148418
Abstract: A voice recognition device 10 includes: a phonological awareness feature amount extraction unit 11 that transforms an acoustic feature amount sequence of input voice into a phonological awareness feature amount sequence for language 1 using a first model parameter group; a phonological awareness feature amount extraction unit 12 that transforms the acoustic feature amount sequence of the input voice into a phonological awareness feature amount sequence for language 2 using a second model parameter group; a phonological recognition unit 13 that generates a posterior probability sequence from the acoustic feature amount sequence of the input voice, the phonological awareness feature amount sequence for language 1, and the phonological awareness feature amount sequence for language 2 using a third model parameter group; and a voice text transformation unit 14 that performs voice recognition based on the posterior probability sequence to output text of a voice recognition result.
Type: Grant
Filed: June 21, 2019
Date of Patent: November 19, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo Masumura, Tomohiro Tanaka
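A minimal sketch of the three-parameter-group structure, assuming linear layers as stand-ins for units 11 to 14: two per-language extractors produce phonological awareness features, and a third model combines them with the acoustic features into a posterior probability sequence. Dimensions, tanh/softmax choices, and the greedy decoding step are all hypothetical, not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)
D, H, V = 8, 6, 5   # acoustic dim, awareness dim, phoneme vocabulary size

# First / second model parameter groups: per-language extractors (units 11, 12).
W1 = rng.standard_normal((H, D))
W2 = rng.standard_normal((H, D))
# Third model parameter group (unit 13): maps [acoustic; lang1; lang2] to
# a posterior over phoneme symbols.
W3 = rng.standard_normal((V, D + 2 * H))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def recognize(acoustic_seq):
    f1 = np.tanh(acoustic_seq @ W1.T)          # language-1 awareness features
    f2 = np.tanh(acoustic_seq @ W2.T)          # language-2 awareness features
    joint = np.concatenate([acoustic_seq, f1, f2], axis=-1)
    posterior = softmax(joint @ W3.T)          # posterior probability sequence
    return posterior.argmax(axis=-1)           # toy voice-to-text step (unit 14)

frames = rng.standard_normal((3, D))           # 3 acoustic feature frames
labels = recognize(frames)
print(labels.shape)
```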
-
Patent number: 12148246
Abstract: A facial expression label is assigned to face image data of a person with high accuracy. A facial expression data set storage unit (110) stores a facial expression data set in which the facial expression label is assigned to face images in which people belonging to various groups show various facial expressions. A facial expression sampling unit (11) acquires a face image in which a person belonging to a desired group shows a desired facial expression. A representative feature quantity calculation unit (12) determines a representative feature quantity for each facial expression label from the face images of the desired group. A target data extraction unit (13) extracts target data from the facial expression data set. A target feature quantity calculation unit (14) calculates a target feature quantity from the target data.
Type: Grant
Filed: November 19, 2019
Date of Patent: November 19, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Akihiko Takashima, Ryo Masumura
-
Patent number: 12142258
Abstract: Without dividing speech into a unit such as a word or a character, text corresponding to the speech is labeled. A speech distributed representation sequence converting unit 11 converts an acoustic feature sequence into a speech distributed representation. A symbol distributed representation converting unit 12 converts each symbol included in the symbol sequence corresponding to the acoustic feature sequence into a symbol distributed representation. A label estimation unit 13 estimates a label corresponding to the symbol from the fixed-length vector of the symbol generated using the speech distributed representation, the symbol distributed representation, and fixed-length vectors of previous and next symbols.
Type: Grant
Filed: January 10, 2020
Date of Patent: November 12, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Tomohiro Tanaka, Ryo Masumura, Takanobu Oba
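The per-symbol labeling idea (a fixed-length vector built from the speech representation plus the previous, current, and next symbol representations) can be sketched as follows. The mean-pooled speech vector, the embedding tables, and the linear scorer are illustrative assumptions, not the patented units 11 to 13.

```python
import numpy as np

rng = np.random.default_rng(1)
A, E, L = 10, 4, 3   # acoustic dim, symbol embedding dim, number of labels

symbol_embed = rng.standard_normal((30, E))   # symbol -> distributed representation
W_speech = rng.standard_normal((E, A))        # speech distributed representation
W_out = rng.standard_normal((L, 4 * E))       # scores [speech; prev; cur; next]

def estimate_labels(acoustic_seq, symbols):
    # Toy speech distributed representation: mean-pooled frame projection.
    speech_vec = np.tanh(acoustic_seq.mean(axis=0) @ W_speech.T)
    pad = np.zeros(E)                          # stand-in for sequence boundaries
    out = []
    for i, s in enumerate(symbols):
        prev_v = symbol_embed[symbols[i - 1]] if i > 0 else pad
        next_v = symbol_embed[symbols[i + 1]] if i + 1 < len(symbols) else pad
        fixed = np.concatenate([speech_vec, prev_v, symbol_embed[s], next_v])
        out.append(int((W_out @ fixed).argmax()))  # label for this symbol
    return out

acoustic = rng.standard_normal((20, A))        # 20 acoustic feature frames
labels = estimate_labels(acoustic, [3, 7, 7, 12])
print(labels)
```

Note that no word or character segmentation of the speech is needed: every symbol is labeled directly from the pooled speech vector and its symbol neighborhood.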
-
Patent number: 12136435
Abstract: An utterance section detection device which is capable of detecting an utterance section with high accuracy on the basis of whether or not an end of a speech section is an end of utterance.
Type: Grant
Filed: July 24, 2019
Date of Patent: November 5, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo Masumura, Takanobu Oba, Kiyoaki Matsui
-
Patent number: 12131729
Abstract: A language model score calculation apparatus calculates a prediction probability of a word wi as a language model score of a language model based on a recurrent neural network. The language model score calculation apparatus includes a memory; and a processor configured to execute: converting a word wi-1 that is observed immediately before the word wi into a word vector φ(wi-1); converting a speaker label ri-1 corresponding to the word wi-1 and a speaker label ri corresponding to the word wi into a speaker vector ψ(ri-1) and a speaker vector ψ(ri), respectively; calculating a word history vector si by using the word vector φ(wi-1), the speaker vector ψ(ri-1), and a word history vector si-1 that is obtained when a prediction probability of the word wi-1 is calculated; and calculating the prediction probability of the word wi by using the word history vector si and the speaker vector ψ(ri).
Type: Grant
Filed: June 21, 2019
Date of Patent: October 29, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo Masumura, Tomohiro Tanaka
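A minimal speaker-aware RNN language model step in the spirit of this abstract: the word history vector is updated from the previous word vector, the previous speaker vector, and the old state, and the prediction for the next word conditions on the history state together with the current speaker vector. The tanh recurrence, dimensions, and single-layer output are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
V, S, E, H = 6, 2, 4, 5   # vocab size, speakers, embedding dim, hidden dim

word_embed = rng.standard_normal((V, E))   # plays the role of phi(w)
spk_embed = rng.standard_normal((S, E))    # plays the role of psi(r)
W_h = rng.standard_normal((H, H + 2 * E))  # recurrence over [s; phi(w); psi(r)]
W_o = rng.standard_normal((V, H + E))      # output over [s; psi(r_cur)]

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def step(s_prev, w_prev, r_prev, r_cur):
    """One step: update the word history vector, then score the next word."""
    s_new = np.tanh(W_h @ np.concatenate(
        [s_prev, word_embed[w_prev], spk_embed[r_prev]]))
    prob = softmax(W_o @ np.concatenate([s_new, spk_embed[r_cur]]))
    return prob, s_new

s = np.zeros(H)                               # initial word history vector
prob, s = step(s, w_prev=1, r_prev=0, r_cur=1)
print(round(float(prob.sum()), 6))            # a proper distribution over V words
```

Conditioning on both speaker labels is what distinguishes this from a plain RNN language model: a speaker change can shift the predicted distribution even for the same word history.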
-
Patent number: 12111643
Abstract: An inspection system includes machine learning circuitry configured to determine whether each of objects belongs to a predetermined attribute based on feature data of each of the objects, feature data acquisition circuitry configured to acquire feature data of reevaluated objects which are determined to belong to the predetermined attribute without using the machine learning circuitry among excluded objects which are determined not to belong to the predetermined attribute by the machine learning circuitry, and parameter update circuitry configured to update a learning parameter of the machine learning circuitry based on teaching data including the feature data acquired by the feature data acquisition circuitry.
Type: Grant
Filed: August 23, 2021
Date of Patent: October 8, 2024
Assignee: KABUSHIKI KAISHA YASKAWA DENKI
Inventors: Ryo Masumura, Masaru Adachi
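The feedback loop described here (classifier excludes objects, excluded objects are reevaluated outside the model, confirmed misses become teaching data that updates the learning parameter) can be sketched with a deliberately simple linear scorer and perceptron-style update. The scorer, update rule, and class names are hypothetical, not the patented circuitry.

```python
import numpy as np

class Inspector:
    """Toy inspection model: a linear scorer decides attribute membership;
    its weight vector is the 'learning parameter' that gets updated."""
    def __init__(self, dim):
        self.w = np.zeros(dim)

    def belongs(self, x):
        return float(self.w @ x) > 0.0

    def update(self, teaching):
        # Perceptron-style update from (feature data, label) teaching data.
        for x, y in teaching:
            if self.belongs(x) != y:
                self.w += (1.0 if y else -1.0) * x

insp = Inspector(2)
batch = [np.array([1.0, 0.5]), np.array([-1.0, 0.2])]
excluded = [x for x in batch if not insp.belongs(x)]   # model rejects both
# Reevaluation without the model: the first excluded object actually belongs,
# so its feature data becomes teaching data.
insp.update([(excluded[0], True)])
print(insp.belongs(batch[0]))
```

The point of the loop is that false negatives, which the model would otherwise never see again, are recovered by the reevaluation step and fed back into training.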
-
Publication number: 20240290344
Abstract: A tag estimation device capable of estimating, for an utterance made among several persons, a tag representing a result of analyzing the utterance is provided. The tag estimation device includes an utterance sequence information vector generation unit that adds a t-th utterance word feature vector and a t-th speaker vector to a (t-1)-th utterance sequence information vector ut-1, which includes an utterance word feature vector that precedes the t-th utterance word feature vector and a speaker vector that precedes the t-th speaker vector, to generate a t-th utterance sequence information vector ut, where t is a natural number, and a tagging unit that determines a tag lt that represents a result of analyzing the t-th utterance from a model parameter set in advance and the t-th utterance sequence information vector ut.
Type: Application
Filed: May 7, 2024
Publication date: August 29, 2024
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo MASUMURA, Tomohiro TANAKA
-
Patent number: 12057105
Abstract: Provided is a speech recognition device capable of implementing end-to-end speech recognition considering a context.
Type: Grant
Filed: January 27, 2020
Date of Patent: August 6, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo Masumura, Tomohiro Tanaka, Takanobu Oba
-
Publication number: 20240233744
Abstract: A mixed acoustic signal including sound emitted from a plurality of sound sources and sound source video signals representing at least one video of the plurality of sound sources are received as inputs, and at least a separated signal including a signal representing a target sound emitted from one sound source represented by the video is acquired. At least the separated signal is acquired using properties of the sound source that affect the sound it emits, acquired from the video, and/or features of a structure used by the sound source to emit the sound.
Type: Application
Filed: February 8, 2021
Publication date: July 11, 2024
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Naoki MAKISHIMA, Ryo MASUMURA
-
Patent number: 12002486
Abstract: A tag estimation device capable of estimating, for an utterance made among several persons, a tag representing a result of analyzing the utterance is provided. The tag estimation device includes an utterance sequence information vector generation unit that adds a t-th utterance word feature vector and a t-th speaker vector to a (t-1)-th utterance sequence information vector ut-1, which includes an utterance word feature vector that precedes the t-th utterance word feature vector and a speaker vector that precedes the t-th speaker vector, to generate a t-th utterance sequence information vector ut, where t is a natural number, and a tagging unit that determines a tag lt that represents a result of analyzing the t-th utterance from a model parameter set in advance and the t-th utterance sequence information vector ut.
Type: Grant
Filed: September 13, 2019
Date of Patent: June 4, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo Masumura, Tomohiro Tanaka
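The accumulation described in this abstract, where each utterance sequence information vector u_t is formed by adding the t-th utterance word feature vector and speaker vector into u_{t-1} and is then mapped to a tag l_t by a pre-set model parameter, can be sketched directly. The additive update, feature dimensions, and linear tagger are illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(3)
F, S, T = 5, 3, 4   # utterance feature dim, number of speakers, number of tags

spk_embed = rng.standard_normal((S, S))   # speaker label -> speaker vector
W_tag = rng.standard_normal((T, F + S))   # model parameter set in advance

def tag_dialogue(utter_feats, speakers):
    """u_t = u_{t-1} + [t-th utterance feature; t-th speaker vector];
    each u_t is mapped to a tag l_t for the t-th utterance."""
    u = np.zeros(F + S)                                   # u_0
    tags = []
    for feat, spk in zip(utter_feats, speakers):
        u = u + np.concatenate([feat, spk_embed[spk]])    # generation unit
        tags.append(int((W_tag @ u).argmax()))            # tagging unit
    return tags

feats = rng.standard_normal((3, F))       # three utterances in a dialogue
tags = tag_dialogue(feats, [0, 1, 0])
print(tags)
```

Because u_t carries the whole preceding dialogue, the tag for the t-th utterance can depend on who said what earlier, not only on the current utterance.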
-
Publication number: 20240135950
Abstract: A mixed acoustic signal including sound emitted from a plurality of sound sources and sound source video signals representing at least one video of the plurality of sound sources are received as inputs, and at least a separated signal including a signal representing a target sound emitted from one sound source represented by the video is acquired. At least the separated signal is acquired using properties of the sound source that affect the sound it emits, acquired from the video, and/or features of a structure used by the sound source to emit the sound.
Type: Application
Filed: February 8, 2021
Publication date: April 25, 2024
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Naoki MAKISHIMA, Ryo MASUMURA
-
Patent number: 11894017
Abstract: A voice/non-voice determination device robust with respect to an acoustic signal in a high-noise environment is provided. The voice/non-voice determination device includes an acoustic scene classification unit including a first model which receives input of an acoustic signal and outputs acoustic scene information, which is information regarding a scene where the acoustic signal is collected; a speech enhancement unit including a second model which receives input of the acoustic signal and outputs speech enhancement information, which is information regarding the acoustic signal after enhancement; and a voice/non-voice determination unit including a third model which receives input of the acoustic signal, the acoustic scene information, and the speech enhancement information and outputs a voice/non-voice label, which is information regarding a label of either a speech section or a non-speech section.
Type: Grant
Filed: July 25, 2019
Date of Patent: February 6, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo Masumura, Takanobu Oba, Kiyoaki Matsui
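The three-model layout (scene classifier, speech enhancer, and a determiner that consumes the raw signal plus both auxiliary outputs) can be shown in a few lines. The linear models, the three-scene assumption, and the frame-level decision are all hypothetical simplifications of the patented device.

```python
import numpy as np

rng = np.random.default_rng(4)
D = 6   # acoustic feature dimension per frame

W_scene = rng.standard_normal((3, D))           # first model: 3 toy scenes
W_enh = rng.standard_normal((D, D))             # second model: enhancement
W_vad = rng.standard_normal((2, D + 3 + D))     # third model: speech/non-speech

def voice_nonvoice(frame):
    scene = np.exp(W_scene @ frame)
    scene /= scene.sum()                         # acoustic scene information
    enhanced = np.tanh(W_enh @ frame)            # speech enhancement information
    logits = W_vad @ np.concatenate([frame, scene, enhanced])
    return "speech" if logits[0] > logits[1] else "non-speech"

label = voice_nonvoice(rng.standard_normal(D))
print(label)
```

Feeding the scene and enhancement information into the final determiner is what gives the design its noise robustness: the decision is conditioned on where the signal was collected and on a cleaned-up view of it.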
-
Patent number: 11887620
Abstract: The present invention improves the accuracy of language prediction. A history speech meta-information understanding unit 11 obtains a history speech meta-information vector from a word string of a preceding speech using a meta-information understanding device. A history speech embedding unit 12 converts the word string of the preceding speech and a speaker label into a history speech embedding vector. A speech unit combination vector construction unit 13 obtains a speech unit combination vector by combining the history speech meta-information vector and the history speech embedding vector. A speech sequence embedding vector calculation unit 14 converts a plurality of speech unit combination vectors obtained for the past speech sequences into a speech sequence embedding vector. A language model score calculation unit 15 calculates a language model score of a current speech from a word string of the current speech, a speaker label, and the speech sequence embedding vector.
Type: Grant
Filed: January 27, 2020
Date of Patent: January 30, 2024
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo Masumura, Tomohiro Tanaka, Takanobu Oba
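The five-unit pipeline can be traced end to end with toy linear stand-ins: per-history-speech meta-information and embedding vectors are combined (units 11 to 13), pooled into a single speech sequence embedding (unit 14), and used to score the current speech (unit 15). Bag-of-words inputs, mean pooling, and every dimension here are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(5)
M, E, H, V = 3, 4, 6, 5   # meta-info dim, embedding dim, sequence dim, vocab

W_embed = rng.standard_normal((E, V + 2))   # word string + speaker label -> embedding
W_seq = rng.standard_normal((H, M + E))     # combination vector -> sequence embedding
W_lm = rng.standard_normal((V, V + 2 + H))  # final language model scorer

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def meta_understand(bow):
    return np.tanh(bow[:M])  # stand-in for the meta-information understanding device

def lm_score(history, current_bow, speaker):
    combos = []
    for bow, spk in history:                                    # each past speech
        meta = meta_understand(bow)                             # unit 11
        embed = np.tanh(W_embed @ np.append(bow, [spk, 1.0]))   # unit 12
        combos.append(np.concatenate([meta, embed]))            # unit 13
    seq = np.tanh(W_seq @ np.mean(combos, axis=0))              # unit 14
    return softmax(W_lm @ np.concatenate(                       # unit 15
        [current_bow, [speaker, 1.0], seq]))

hist = [(rng.random(V), 0), (rng.random(V), 1)]   # two preceding speeches
score = lm_score(hist, rng.random(V), 0)
print(round(float(score.sum()), 6))
```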
-
Publication number: 20230245675
Abstract: To highly accurately estimate an environment in which an acoustic signal is collected, without inputting auxiliary information. Input circuitry (21) receives a target acoustic signal, which is the estimation target. Estimation circuitry (22) uses a correlation between acoustic signals and explanatory texts explaining them to estimate the environment in which the target acoustic signal is collected; the estimated environment is the explanatory text for the target acoustic signal obtained by the correlation. The correlation is trained to minimize a difference between an explanatory text assigned to an acoustic signal and the explanatory text obtained from that acoustic signal by the correlation.
Type: Application
Filed: May 11, 2020
Publication date: August 3, 2023
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Yuma KOIZUMI, Ryo MASUMURA, Shoichiro SAITO
-
Publication number: 20230202030
Abstract: Provided is a work system including: an object imaging unit configured to acquire an object image by photographing an object from a work direction; a work position acquisition unit configured to acquire a work position based on an existence region of the object obtained from a machine learning model; and a work unit configured to execute work on the object based on a work position obtained by inputting the object image to the work position acquisition unit.
Type: Application
Filed: February 28, 2023
Publication date: June 29, 2023
Applicant: Kabushiki Kaisha Yaskawa Denki
Inventors: Ryo MASUMURA, Wataru WATANABE
-
Publication number: 20230206118
Abstract: Provided is a model learning technology to learn a model in consideration of a difference in label assignment accuracy between experts and non-experts.
Type: Application
Filed: March 19, 2020
Publication date: June 29, 2023
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Hosana KAMIYAMA, Yuki KITAGISHI, Atsushi ANDO, Ryo MASUMURA, Takeshi MORI, Satoshi KOBASHIKAWA
-
Publication number: 20230134186
Abstract: Provided is a machine learning data generation device including: at least one processor; and at least one memory device that stores a plurality of instructions which, when executed by the at least one processor, cause the at least one processor to execute: acquiring, in association with a predetermined label, actual time series information; executing physical simulation to generate a plurality of pieces of virtual time series information; identifying parameter values based on the plurality of pieces of virtual time series information and the actual time series information, and associating the identified parameter values with the label; generating a new parameter value and label based on the identified parameter values; generating virtual time series information corresponding to a new internal state by executing physical simulation through use of the new parameter value; and generating new machine learning data.
Type: Application
Filed: December 26, 2022
Publication date: May 4, 2023
Inventors: Ryohei SUZUKI, Tsuyoshi Yokoya, Ryo Masumura, Hiroki Tachikake
-
Publication number: 20230108419
Abstract: A learning system includes real environment image acquisition circuitry, virtual environment image generation circuitry, and GAN learning circuitry. The real environment image acquisition circuitry is configured to acquire a real environment image indicating a real environment in which real objects and a real background are provided. The virtual environment image generation circuitry is configured to generate a virtual environment image indicating a virtual environment in which virtual objects and a virtual background are provided. At least one of the virtual background and the virtual objects has a color or colors different from those of the real background and the real objects. The GAN learning circuitry is configured to perform GAN (Generative Adversarial Networks) learning through which the virtual environment image is made more similar to the real environment image, based on the real environment image and the virtual environment image.
Type: Application
Filed: October 5, 2022
Publication date: April 6, 2023
Applicant: KABUSHIKI KAISHA YASKAWA DENKI
Inventors: Makoto MORI, Ryo MASUMURA
-
Publication number: 20230072015
Abstract: Information corresponding to a t-th word string Yt of a second text, which is a conversion result of a t-th word string Xt of a first text, is estimated on the basis of a model parameter θ by using, as inputs, the t-th word string Xt of the first text and a sequence Ŷ1, . . . , Ŷt-1 of the first to (t-1)-th word strings of the second text, which is a conversion result of a sequence X1, . . . , Xt-1 of the first to (t-1)-th word strings of the first text. Here, t is an integer of two or greater.
Type: Application
Filed: February 20, 2020
Publication date: March 9, 2023
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Mana IHORI, Ryo MASUMURA
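The conversion scheme in this abstract is autoregressive over word strings: the estimate for the t-th output conditions on the t-th input and on the already-converted history Ŷ1, . . . , Ŷt-1. The loop below shows that control flow; the rule-based "model" that resolves "it" against the converted history is a toy stand-in for the learned model parameter, invented purely for illustration.

```python
def convert(word_strings, model):
    """Converts X_1..X_T to Y^_1..Y^_T; for t >= 2 the estimate for Y_t
    conditions on X_t and on the converted history Y^_1..Y^_{t-1}."""
    history = []
    for t, x in enumerate(word_strings, start=1):
        context = [] if t == 1 else history   # Y^_1 .. Y^_{t-1}
        history.append(model(x, context))
    return history

# Hypothetical stand-in model: replaces "it" with the last capitalized
# word seen in the converted history (a coreference-flavored toy rule).
def toy_model(x, context):
    nouns = [w for y in context for w in y if w.istitle()]
    return [nouns[-1] if w == "it" and nouns else w for w in x]

out = convert([["Tokyo", "is", "large"], ["it", "is", "crowded"]], toy_model)
print(out[1])
```

Because each output string is appended to the history before the next step, information from earlier conversion results can flow into later ones, which is the point of conditioning on Ŷ1, . . . , Ŷt-1 rather than converting each word string independently.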