Patents by Inventor Tetsuya Nasukawa
Tetsuya Nasukawa has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11797425
Abstract: A computer-implemented method is provided for data augmentation. The method includes receiving a set of different base models already pretrained and a set of different test cases. The method further includes collecting a plurality of prediction results of the set of different test cases from the set of different base models. The method also includes identifying a test case as a candidate for the data augmentation based on a number of models in the set of different base models which fail to solve the test case. The method additionally includes augmenting, by a processor device, the identified test case with additional data to form an augmented training dataset. The method further includes retraining at least some of the different base models with the augmented training dataset.
Type: Grant
Filed: July 9, 2021
Date of Patent: October 24, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Masayasu Muraoka, Issei Yoshida, Tetsuya Nasukawa
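The candidate-selection step in the abstract above (picking test cases by how many base models fail them) can be illustrated with a minimal sketch. This is not the patented implementation: the function name `select_augmentation_candidates`, the `min_failures` threshold, and the toy predictions are all assumptions made for illustration.

```python
from typing import Dict, List


def select_augmentation_candidates(
    predictions: Dict[str, List[str]],
    expected: List[str],
    min_failures: int,
) -> List[int]:
    """Return indices of test cases that at least `min_failures` models get wrong.

    `predictions` maps a (hypothetical) model name to its predicted label
    for each test case, in the same order as `expected`.
    """
    candidates = []
    for i, gold in enumerate(expected):
        # Count how many models fail this test case.
        failures = sum(1 for preds in predictions.values() if preds[i] != gold)
        if failures >= min_failures:
            candidates.append(i)
    return candidates


# Toy data: three models, three test cases (all values are made up).
predictions = {
    "model_a": ["pos", "neg", "pos"],
    "model_b": ["pos", "pos", "neg"],
    "model_c": ["pos", "pos", "pos"],
}
expected = ["pos", "neg", "neg"]
candidates = select_augmentation_candidates(predictions, expected, min_failures=2)
```

The selected cases would then be augmented with additional data and used for retraining; those steps depend on the task and are not sketched here.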
-
Patent number: 11645461
Abstract: A method is provided for dictionary expansion. The method acquires an object from a user and adds the object to a set of objects previously acquired from the user that form an expandable dictionary. The method calculates a centroid based on the set. The method calculates a similarity score of each of a plurality of objects relative to the centroid for each of a plurality of object features to calculate a weighted sum of similarity scores for each of the plurality of objects. The method presents candidate objects selected among the plurality of objects based on the weighted sum. The method acquires, from the user, a preferred candidate object among the candidate objects. The method updates weights of the plurality of features to maximize the weighed sum of similarity scores for the preferred candidate object. The method expands the dictionary by adding the preferred candidate object to the expandable dictionary.
Type: Grant
Filed: February 10, 2020
Date of Patent: May 9, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Ryosuke Kohita, Issei Yoshida, Hiroshi Kanayama, Tetsuya Nasukawa
-
Patent number: 11636338
Abstract: A computer-implemented method is provided for data augmentation. The method includes calculating, by a hardware processor for each of words in a text data, a word replacement probability based on a word occurrence frequency in the text data, wherein the word replacement probability decreases with increasing word occurrence frequency. The method additionally includes selectively replacing at least one of the words in the text data with words predicted therefor by a Bidirectional Neural Network Language Model (BiNNLM) to generate augmented text data, based on the word replacement probability.
Type: Grant
Filed: March 20, 2020
Date of Patent: April 25, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Masayasu Muraoka, Tetsuya Nasukawa
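As a rough illustration of the abstract above, the sketch below assigns each word a replacement probability that falls as its frequency rises, then replaces words at random according to that probability. The abstract does not state a formula, so the `scale / count` rule is an assumption, and the `predict` callback is only a stand-in for the Bidirectional Neural Network Language Model, which is not implemented here.

```python
import random
from collections import Counter
from typing import Callable, Dict, List


def replacement_probabilities(tokens: List[str], scale: float = 0.5) -> Dict[str, float]:
    """Assumed rule: probability = min(1, scale / count), so frequent words
    are replaced less often than rare ones."""
    counts = Counter(tokens)
    return {word: min(1.0, scale / count) for word, count in counts.items()}


def augment(
    tokens: List[str],
    predict: Callable[[List[str], int], str],
    rng: random.Random,
) -> List[str]:
    """Replace each token with `predict(tokens, i)` with its word's probability.

    `predict` is a hypothetical stand-in for a language model that proposes
    a word for position `i` given the surrounding context.
    """
    probs = replacement_probabilities(tokens)
    augmented = []
    for i, token in enumerate(tokens):
        if rng.random() < probs[token]:
            augmented.append(predict(tokens, i))
        else:
            augmented.append(token)
    return augmented


tokens = "the cat saw the dog near the park".split()
probs = replacement_probabilities(tokens)
# A dummy predictor that marks replaced positions instead of calling a model.
augmented = augment(tokens, lambda toks, i: "<predicted>", random.Random(0))
```

Passing a seeded `random.Random` keeps the augmentation reproducible; a real predictor would replace the dummy lambda.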
-
Publication number: 20230027777
Abstract: A computer-implemented method is provided for data augmentation. The method includes receiving a set of different base models already pretrained and a set of different test cases. The method further includes collecting a plurality of prediction results of the set of different test cases from the set of different base models. The method also includes identifying a test case as a candidate for the data augmentation based on a number of models in the set of different base models which fail to solve the test case. The method additionally includes augmenting, by a processor device, the identified test case with additional data to form an augmented training dataset. The method further includes retraining at least some of the different base models with the augmented training dataset.
Type: Application
Filed: July 9, 2021
Publication date: January 26, 2023
Inventors: Masayasu Muraoka, Issei Yoshida, Tetsuya Nasukawa
-
Patent number: 11556570
Abstract: A computer-implemented method for extracting semantic relations is disclosed. In the method, a plurality of hierarchal structures that originates from a corpus of documents is obtained. Each hierarchal structure includes a plurality of elements having respective recitations included in a corresponding document. In the method, for each predetermined relationship between ancestor and descendant elements in the hierarchal structures, a first keyword list is extracted from the ancestor element and a second keyword list is extracted from the descendant element. A statistical index is calculated for each pair of first and second keywords using the first keyword lists and the second keyword lists. The index indicates a strength of association between the first and second keywords. In the method, a candidate list of keyword pairs having semantic relationships is output using the statistical index calculated for each pair.
Type: Grant
Filed: September 20, 2018
Date of Patent: January 17, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Shoko Suzuki, Tetsuya Nasukawa, Hiroshi Kanayama
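The abstract above does not name the statistical index it uses. As one possible reading, the sketch below scores each (ancestor keyword, descendant keyword) pair with pointwise mutual information (PMI), which is a common association measure and purely an assumption here. The function name `pmi_scores` and the toy keyword lists are hypothetical.

```python
import math
from collections import Counter
from itertools import product
from typing import Dict, List, Tuple

KeywordLists = Tuple[List[str], List[str]]


def pmi_scores(relations: List[KeywordLists]) -> Dict[Tuple[str, str], float]:
    """Score keyword pairs by PMI over ancestor/descendant element pairs.

    Each item in `relations` holds the keyword list extracted from an
    ancestor element and the list extracted from one of its descendants.
    """
    total = len(relations)
    first_counts: Counter = Counter()
    second_counts: Counter = Counter()
    joint_counts: Counter = Counter()
    for ancestor_keywords, descendant_keywords in relations:
        first, second = set(ancestor_keywords), set(descendant_keywords)
        first_counts.update(first)
        second_counts.update(second)
        joint_counts.update(product(first, second))
    # PMI = log( P(x, y) / (P(x) * P(y)) ) with probabilities estimated by counts.
    return {
        (x, y): math.log(count * total / (first_counts[x] * second_counts[y]))
        for (x, y), count in joint_counts.items()
    }


# Toy ancestor/descendant keyword lists (made-up data).
relations = [
    (["engine"], ["piston"]),
    (["engine"], ["piston"]),
    (["menu"], ["soup"]),
    (["menu"], ["piston"]),
]
scores = pmi_scores(relations)
# Candidate list of pairs, strongest association first.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Pairs with positive scores co-occur more often than chance would predict and are the natural candidates for a semantic relationship.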
-
Publication number: 20220358287
Abstract: Frequent sequences extracted from a set of documents according to a common rule are obtained. Based on comparing occurrence frequencies of various sequences, confidence of the first frequent sequence being a label expression representing a document part in a target document is evaluated. Keywords are extracted from the target document based on evaluation of the confidence.
Type: Application
Filed: May 10, 2021
Publication date: November 10, 2022
Inventors: Tetsuya Nasukawa, Shoko Suzuki, Daisuke Takuma, Issei Yoshida
-
Patent number: 11308274
Abstract: A computer-implemented method is provided. The method includes acquiring a seed word; calculating a similarity score of each of a plurality of words relative to the seed word for each of a plurality of models to calculate a weighted sum of similarity scores for each of the plurality of words; outputting a plurality of candidate words among the plurality of words; acquiring annotations indicating at least one of preferred words and non-preferred words among the plurality of the candidate words; updating weights of the plurality of models in a manner to cause weighted sums of similarity scores for the preferred words to be relatively larger than the weighted sums of the similarity scores for the non-preferred words, based on the annotations; and grouping the plurality of candidate words output based on the weighted sum of similarity scores calculated with updated weights of the plurality of models.
Type: Grant
Filed: May 17, 2019
Date of Patent: April 19, 2022
Assignee: International Business Machines Corporation
Inventors: Ryosuke Kohita, Issei Yoshida, Tetsuya Nasukawa, Hiroshi Kanayama
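To make the weighted-sum and weight-update steps of the abstract above concrete, here is a minimal sketch. It assumes each model is a word-to-vector mapping, that similarity is cosine similarity, and that weights move toward models that rank preferred words above non-preferred ones. The abstract specifies none of these, so the update rule, the learning rate `lr`, and the toy vectors are all assumptions. The sketch also requires both annotation lists to be non-empty, which is narrower than the abstract.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]
Model = Dict[str, Vector]  # hypothetical: one embedding table per model


def cosine(u: Vector, v: Vector) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def weighted_score(word: str, seed: str, models: List[Model], weights: List[float]) -> float:
    """Weighted sum over models of the word's similarity to the seed word."""
    return sum(w * cosine(m[word], m[seed]) for m, w in zip(models, weights))


def update_weights(
    seed: str,
    preferred: List[str],
    non_preferred: List[str],
    models: List[Model],
    weights: List[float],
    lr: float = 0.5,
) -> List[float]:
    """Assumed rule: raise the weight of models that score preferred words
    above non-preferred ones, lower the others, then renormalize."""
    updated = []
    for model, weight in zip(models, weights):
        pos = sum(cosine(model[w], model[seed]) for w in preferred) / len(preferred)
        neg = sum(cosine(model[w], model[seed]) for w in non_preferred) / len(non_preferred)
        updated.append(max(0.0, weight + lr * (pos - neg)))
    total = sum(updated)
    return [w / total for w in updated] if total else list(weights)


# Two toy models that disagree about which word resembles the seed "coffee".
model_a = {"coffee": [1.0, 0.0], "tea": [1.0, 0.0], "desk": [0.0, 1.0]}
model_b = {"coffee": [1.0, 0.0], "tea": [0.0, 1.0], "desk": [1.0, 0.0]}
models = [model_a, model_b]
weights = [0.5, 0.5]
new_weights = update_weights("coffee", ["tea"], ["desk"], models, weights)
```

With equal starting weights the two candidate words tie; after the update, the model that agrees with the annotation dominates and the preferred word scores higher.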
-
Patent number: 11132393
Abstract: A computer-implemented method for identifying an expression for a target concept, includes: obtaining a set of texts as a target set of texts, with each text being associated with one of images relevant to a target concept. Candidate expressions for the target concept are extracted from the target set of texts. The candidate expressions are characteristic of the target set of texts. Each image relevant to one of the candidate expressions is collected by using an image search engine. A target expression for the target concept is selected from the candidate expressions based on a comparison result of the target concept and the collected images.
Type: Grant
Filed: October 30, 2018
Date of Patent: September 28, 2021
Assignee: International Business Machines Corporation
Inventors: Tetsuya Nasukawa, Masayasu Muraoka, Khan Md. Anwarus Salam
-
Publication number: 20210295149
Abstract: A computer-implemented method is provided for data augmentation. The method includes calculating, by a hardware processor for each of words in a text data, a word replacement probability based on a word occurrence frequency in the text data, wherein the word replacement probability decreases with increasing word occurrence frequency. The method additionally includes selectively replacing at least one of the words in the text data with words predicted therefor by a Bidirectional Neural Network Language Model (BiNNLM) to generate augmented text data, based on the word replacement probability.
Type: Application
Filed: March 20, 2020
Publication date: September 23, 2021
Inventors: Masayasu Muraoka, Tetsuya Nasukawa
-
Publication number: 20210248315
Abstract: A method is provided for dictionary expansion. The method acquires an object from a user and adds the object to a set of objects previously acquired from the user that form an expandable dictionary. The method calculates a centroid based on the set. The method calculates a similarity score of each of a plurality of objects relative to the centroid for each of a plurality of object features to calculate a weighted sum of similarity scores for each of the plurality of objects. The method presents candidate objects selected among the plurality of objects based on the weighted sum. The method acquires, from the user, a preferred candidate object among the candidate objects. The method updates weights of the plurality of features to maximize the weighed sum of similarity scores for the preferred candidate object. The method expands the dictionary by adding the preferred candidate object to the expandable dictionary.
Type: Application
Filed: February 10, 2020
Publication date: August 12, 2021
Inventors: Ryosuke Kohita, Issei Yoshida, Hiroshi Kanayama, Tetsuya Nasukawa
-
Patent number: 10970488
Abstract: A computer-implemented method for finding an asymmetric relation between a plurality of target words is disclosed. The method includes preparing a plurality of image sets, each of which includes one or more images relevant to a corresponding one of the plurality of the target words. The method also includes obtaining a plurality of object labels for each of the plurality of image sets. The method further includes computing a representation for each of the plurality of the target words using the plurality of the object labels obtained for each of the plurality of image sets. The method includes further determining whether there is an asymmetric relation between the plurality of the target words using representations computed for the plurality of the target words.
Type: Grant
Filed: February 27, 2019
Date of Patent: April 6, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Masayasu Muraoka, Tetsuya Nasukawa, Khan Md. Anwarus Salam
-
Patent number: 10929617
Abstract: Text analysis includes determining one or more global analysis parameters based on backtranslation of a first corpus between supported languages. A new text analysis model is determined for an unsupported language based on the one or more global analysis parameters and a text analysis model for a first supported language. An input text is analyzed in the unsupported language with the new text analysis model.
Type: Grant
Filed: July 20, 2018
Date of Patent: February 23, 2021
Assignee: International Business Machines Corporation
Inventors: Kohichi Kamijoh, Tetsuya Nasukawa, Yohei Ikawa, Masaki Ono
-
Publication number: 20200364298
Abstract: A computer-implemented method is provided. The method includes acquiring a seed word; calculating a similarity score of each of a plurality of words relative to the seed word for each of a plurality of models to calculate a weighted sum of similarity scores for each of the plurality of words; outputting a plurality of candidate words among the plurality of words; acquiring annotations indicating at least one of preferred words and non-preferred words among the plurality of the candidate words; updating weights of the plurality of models in a manner to cause weighted sums of similarity scores for the preferred words to be relatively larger than the weighted sums of the similarity scores for the non-preferred words, based on the annotations; and grouping the plurality of candidate words output based on the weighted sum of similarity scores calculated with updated weights of the plurality of models.
Type: Application
Filed: May 17, 2019
Publication date: November 19, 2020
Inventors: Ryosuke Kohita, Issei Yoshida, Tetsuya Nasukawa, Hiroshi Kanayama
-
Publication number: 20200272696
Abstract: A computer-implemented method for finding an asymmetric relation between a plurality of target words is disclosed. The method includes preparing a plurality of image sets, each of which includes one or more images relevant to a corresponding one of the plurality of the target words. The method also includes obtaining a plurality of object labels for each of the plurality of image sets. The method further includes computing a representation for each of the plurality of the target words using the plurality of the object labels obtained for each of the plurality of image sets. The method includes further determining whether there is an asymmetric relation between the plurality of the target words using representations computed for the plurality of the target words.
Type: Application
Filed: February 27, 2019
Publication date: August 27, 2020
Inventors: Masayasu Muraoka, Tetsuya Nasukawa, Khan Md. Anwarus Salam
-
Patent number: 10671882
Abstract: A technique for use in analyzing multidimensional data is disclosed. In the technique, a subset of texts specified by a textual feature is selected from the multidimensional data. Each text of the subset is projected into a target image based on the corresponding spatial information to obtain a spatial distribution map for the textual feature. The similarity between the spatial distribution map for the textual feature and each property distribution map for each predefined property is determined. For the similarity exceeding a threshold, the textual feature is outputted as a notable textual feature.
Type: Grant
Filed: November 14, 2017
Date of Patent: June 2, 2020
Assignee: International Business Machines Corporation
Inventors: Tetsuya Nasukawa, Kazuki Sato
-
Publication number: 20200134055
Abstract: A computer-implemented method for identifying an expression for a target concept, includes: obtaining a set of texts as a target set of texts, with each text being associated with one of images relevant to a target concept. Candidate expressions for the target concept are extracted from the target set of texts. The candidate expressions are characteristic of the target set of texts. Each image relevant to one of the candidate expressions is collected by using an image search engine. A target expression for the target concept is selected from the candidate expressions based on a comparison result of the target concept and the collected images.
Type: Application
Filed: October 30, 2018
Publication date: April 30, 2020
Inventors: Tetsuya Nasukawa, Masayasu Muraoka, Khan Md. Anwarus Salam
-
Publication number: 20200097596
Abstract: A computer-implemented method for extracting semantic relations is disclosed. In the method, a plurality of hierarchal structures that originates from a corpus of documents is obtained. Each hierarchal structure includes a plurality of elements having respective recitations included in a corresponding document. In the method, for each predetermined relationship between ancestor and descendant elements in the hierarchal structures, a first keyword list is extracted from the ancestor element and a second keyword list is extracted from the descendant element. A statistical index is calculated for each pair of first and second keywords using the first keyword lists and the second keyword lists. The index indicates a strength of association between the first and second keywords. In the method, a candidate list of keyword pairs having semantic relationships is output using the statistical index calculated for each pair.
Type: Application
Filed: September 20, 2018
Publication date: March 26, 2020
Inventors: Shoko Suzuki, Tetsuya Nasukawa, Hiroshi Kanayama
-
Publication number: 20200026761
Abstract: Text analysis includes determining one or more global analysis parameters based on backtranslation of a first corpus between supported languages. A new text analysis model is determined for an unsupported language based on the one or more global analysis parameters and a text analysis model for a first supported language. An input text is analyzed in the unsupported language with the new text analysis model.
Type: Application
Filed: July 20, 2018
Publication date: January 23, 2020
Inventors: Kohichi Kamijoh, Tetsuya Nasukawa, Yohei Ikawa, Masaki Ono
-
Publication number: 20190147291
Abstract: A technique for use in analyzing multidimensional data is disclosed. In the technique, a subset of texts specified by a textual feature is selected from the multidimensional data. Each text of the subset is projected into a target image based on the corresponding spatial information to obtain a spatial distribution map for the textual feature. The similarity between the spatial distribution map for the textual feature and each property distribution map for each predefined property is determined. For the similarity exceeding a threshold, the textual feature is outputted as a notable textual feature.
Type: Application
Filed: November 14, 2017
Publication date: May 16, 2019
Inventors: Tetsuya Nasukawa, Kazuki K. Sato
-
Publication number: 20190095525
Abstract: A computer-implemented method, a computer program product, and a computer system for extracting an expression in a text for natural language processing. The computer system reads a text to generate a plurality of substrings in which each substring includes one or more units appearing in the text. The computer system obtains an image set for the each substring, using the one or more units as a query for an image search system; wherein the image set includes one or more images. The computer system calculates a deviation in the image set for the each substring. The computer system selects a respective one of the plurality of the substrings as an expression to be extracted, based on the deviation and a length of each substring.
Type: Application
Filed: September 27, 2017
Publication date: March 28, 2019
Inventors: Masayasu Muraoka, Tetsuya Nasukawa