Patents by Inventor Tetsuya Nasukawa

Tetsuya Nasukawa has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11797425
    Abstract: A computer-implemented method is provided for data augmentation. The method includes receiving a set of different base models already pretrained and a set of different test cases. The method further includes collecting a plurality of prediction results of the set of different test cases from the set of different base models. The method also includes identifying a test case as a candidate for the data augmentation based on a number of models in the set of different base models which fail to solve the test case. The method additionally includes augmenting, by a processor device, the identified test case with additional data to form an augmented training dataset. The method further includes retraining at least some of the different base models with the augmented training dataset.
    Type: Grant
    Filed: July 9, 2021
    Date of Patent: October 24, 2023
    Assignee: International Business Machines Corporation
    Inventors: Masayasu Muraoka, Issei Yoshida, Tetsuya Nasukawa
  • Patent number: 11645461
    Abstract: A method is provided for dictionary expansion. The method acquires an object from a user and adds the object to a set of objects previously acquired from the user that form an expandable dictionary. The method calculates a centroid based on the set. The method calculates a similarity score of each of a plurality of objects relative to the centroid for each of a plurality of object features to calculate a weighted sum of similarity scores for each of the plurality of objects. The method presents candidate objects selected among the plurality of objects based on the weighted sum. The method acquires, from the user, a preferred candidate object among the candidate objects. The method updates weights of the plurality of features to maximize the weighted sum of similarity scores for the preferred candidate object. The method expands the dictionary by adding the preferred candidate object to the expandable dictionary.
    Type: Grant
    Filed: February 10, 2020
    Date of Patent: May 9, 2023
    Assignee: International Business Machines Corporation
    Inventors: Ryosuke Kohita, Issei Yoshida, Hiroshi Kanayama, Tetsuya Nasukawa
  • Patent number: 11636338
    Abstract: A computer-implemented method is provided for data augmentation. The method includes calculating, by a hardware processor for each of words in a text data, a word replacement probability based on a word occurrence frequency in the text data, wherein the word replacement probability decreases with increasing word occurrence frequency. The method additionally includes selectively replacing at least one of the words in the text data with words predicted therefor by a Bidirectional Neural Network Language Model (BiNNLM) to generate augmented text data, based on the word replacement probability.
    Type: Grant
    Filed: March 20, 2020
    Date of Patent: April 25, 2023
    Assignee: International Business Machines Corporation
    Inventors: Masayasu Muraoka, Tetsuya Nasukawa
  • Publication number: 20230027777
    Abstract: A computer-implemented method is provided for data augmentation. The method includes receiving a set of different base models already pretrained and a set of different test cases. The method further includes collecting a plurality of prediction results of the set of different test cases from the set of different base models. The method also includes identifying a test case as a candidate for the data augmentation based on a number of models in the set of different base models which fail to solve the test case. The method additionally includes augmenting, by a processor device, the identified test case with additional data to form an augmented training dataset. The method further includes retraining at least some of the different base models with the augmented training dataset.
    Type: Application
    Filed: July 9, 2021
    Publication date: January 26, 2023
    Inventors: Masayasu Muraoka, Issei Yoshida, Tetsuya Nasukawa
  • Patent number: 11556570
    Abstract: A computer-implemented method for extracting semantic relations is disclosed. In the method, a plurality of hierarchal structures that originates from a corpus of documents is obtained. Each hierarchal structure includes a plurality of elements having respective recitations included in a corresponding document. In the method, for each predetermined relationship between ancestor and descendant elements in the hierarchal structures, a first keyword list is extracted from the ancestor element and a second keyword list is extracted from the descendant element. A statistical index is calculated for each pair of first and second keywords using the first keyword lists and the second keyword lists. The index indicates a strength of association between the first and second keywords. In the method, a candidate list of keyword pairs having semantic relationships is output using the statistical index calculated for each pair.
    Type: Grant
    Filed: September 20, 2018
    Date of Patent: January 17, 2023
    Assignee: International Business Machines Corporation
    Inventors: Shoko Suzuki, Tetsuya Nasukawa, Hiroshi Kanayama
  • Publication number: 20220358287
    Abstract: Frequent sequences extracted from a set of documents according to a common rule are obtained. Based on comparing occurrence frequencies of various sequences, confidence of the first frequent sequence being a label expression representing a document part in a target document is evaluated. Keywords are extracted from the target document based on evaluation of the confidence.
    Type: Application
    Filed: May 10, 2021
    Publication date: November 10, 2022
    Inventors: Tetsuya Nasukawa, Shoko Suzuki, Daisuke Takuma, Issei Yoshida
  • Patent number: 11308274
    Abstract: A computer-implemented method is provided. The method includes acquiring a seed word; calculating a similarity score of each of a plurality of words relative to the seed word for each of a plurality of models to calculate a weighted sum of similarity scores for each of the plurality of words; outputting a plurality of candidate words among the plurality of words; acquiring annotations indicating at least one of preferred words and non-preferred words among the plurality of the candidate words; updating weights of the plurality of models in a manner to cause weighted sums of similarity scores for the preferred words to be relatively larger than the weighted sums of the similarity scores for the non-preferred words, based on the annotations; and grouping the plurality of candidate words output based on the weighted sum of similarity scores calculated with updated weights of the plurality of models.
    Type: Grant
    Filed: May 17, 2019
    Date of Patent: April 19, 2022
    Assignee: International Business Machines Corporation
    Inventors: Ryosuke Kohita, Issei Yoshida, Tetsuya Nasukawa, Hiroshi Kanayama
  • Patent number: 11132393
    Abstract: A computer-implemented method for identifying an expression for a target concept, includes: obtaining a set of texts as a target set of texts, with each text being associated with one of images relevant to a target concept. Candidate expressions for the target concept are extracted from the target set of texts. The candidate expressions are characteristic of the target set of texts. Each image relevant to one of the candidate expressions is collected by using an image search engine. A target expression for the target concept is selected from the candidate expressions based on a comparison result of the target concept and the collected images.
    Type: Grant
    Filed: October 30, 2018
    Date of Patent: September 28, 2021
    Assignee: International Business Machines Corporation
    Inventors: Tetsuya Nasukawa, Masayasu Muraoka, Khan Md. Anwarus Salam
  • Publication number: 20210295149
    Abstract: A computer-implemented method is provided for data augmentation. The method includes calculating, by a hardware processor for each of words in a text data, a word replacement probability based on a word occurrence frequency in the text data, wherein the word replacement probability decreases with increasing word occurrence frequency. The method additionally includes selectively replacing at least one of the words in the text data with words predicted therefor by a Bidirectional Neural Network Language Model (BiNNLM) to generate augmented text data, based on the word replacement probability.
    Type: Application
    Filed: March 20, 2020
    Publication date: September 23, 2021
    Inventors: Masayasu Muraoka, Tetsuya Nasukawa
  • Publication number: 20210248315
    Abstract: A method is provided for dictionary expansion. The method acquires an object from a user and adds the object to a set of objects previously acquired from the user that form an expandable dictionary. The method calculates a centroid based on the set. The method calculates a similarity score of each of a plurality of objects relative to the centroid for each of a plurality of object features to calculate a weighted sum of similarity scores for each of the plurality of objects. The method presents candidate objects selected among the plurality of objects based on the weighted sum. The method acquires, from the user, a preferred candidate object among the candidate objects. The method updates weights of the plurality of features to maximize the weighted sum of similarity scores for the preferred candidate object. The method expands the dictionary by adding the preferred candidate object to the expandable dictionary.
    Type: Application
    Filed: February 10, 2020
    Publication date: August 12, 2021
    Inventors: Ryosuke Kohita, Issei Yoshida, Hiroshi Kanayama, Tetsuya Nasukawa
  • Patent number: 10970488
    Abstract: A computer-implemented method for finding an asymmetric relation between a plurality of target words is disclosed. The method includes preparing a plurality of image sets, each of which includes one or more images relevant to a corresponding one of the plurality of the target words. The method also includes obtaining a plurality of object labels for each of the plurality of image sets. The method further includes computing a representation for each of the plurality of the target words using the plurality of the object labels obtained for each of the plurality of image sets. The method includes further determining whether there is an asymmetric relation between the plurality of the target words using representations computed for the plurality of the target words.
    Type: Grant
    Filed: February 27, 2019
    Date of Patent: April 6, 2021
    Assignee: International Business Machines Corporation
    Inventors: Masayasu Muraoka, Tetsuya Nasukawa, Khan Md. Anwarus Salam
  • Patent number: 10929617
    Abstract: Text analysis includes determining one or more global analysis parameters based on backtranslation of a first corpus between supported languages. A new text analysis model is determined for an unsupported language based on the one or more global analysis parameters and a text analysis model for a first supported language. An input text is analyzed in the unsupported language with the new text analysis model.
    Type: Grant
    Filed: July 20, 2018
    Date of Patent: February 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Kohichi Kamijoh, Tetsuya Nasukawa, Yohei Ikawa, Masaki Ono
  • Publication number: 20200364298
    Abstract: A computer-implemented method is provided. The method includes acquiring a seed word; calculating a similarity score of each of a plurality of words relative to the seed word for each of a plurality of models to calculate a weighted sum of similarity scores for each of the plurality of words; outputting a plurality of candidate words among the plurality of words; acquiring annotations indicating at least one of preferred words and non-preferred words among the plurality of the candidate words; updating weights of the plurality of models in a manner to cause weighted sums of similarity scores for the preferred words to be relatively larger than the weighted sums of the similarity scores for the non-preferred words, based on the annotations; and grouping the plurality of candidate words output based on the weighted sum of similarity scores calculated with updated weights of the plurality of models.
    Type: Application
    Filed: May 17, 2019
    Publication date: November 19, 2020
    Inventors: Ryosuke Kohita, Issei Yoshida, Tetsuya Nasukawa, Hiroshi Kanayama
  • Publication number: 20200272696
    Abstract: A computer-implemented method for finding an asymmetric relation between a plurality of target words is disclosed. The method includes preparing a plurality of image sets, each of which includes one or more images relevant to a corresponding one of the plurality of the target words. The method also includes obtaining a plurality of object labels for each of the plurality of image sets. The method further includes computing a representation for each of the plurality of the target words using the plurality of the object labels obtained for each of the plurality of image sets. The method includes further determining whether there is an asymmetric relation between the plurality of the target words using representations computed for the plurality of the target words.
    Type: Application
    Filed: February 27, 2019
    Publication date: August 27, 2020
    Inventors: Masayasu Muraoka, Tetsuya Nasukawa, Khan Md. Anwarus Salam
  • Patent number: 10671882
    Abstract: A technique for use in analyzing multidimensional data is disclosed. In the technique, a subset of texts specified by a textual feature is selected from the multidimensional data. Each text of the subset is projected into a target image based on the corresponding spatial information to obtain a spatial distribution map for the textual feature. The similarity between the spatial distribution map for the textual feature and each property distribution map for each predefined property is determined. For the similarity exceeding a threshold, the textual feature is outputted as a notable textual feature.
    Type: Grant
    Filed: November 14, 2017
    Date of Patent: June 2, 2020
    Assignee: International Business Machines Corporation
    Inventors: Tetsuya Nasukawa, Kazuki Sato
  • Publication number: 20200134055
    Abstract: A computer-implemented method for identifying an expression for a target concept, includes: obtaining a set of texts as a target set of texts, with each text being associated with one of images relevant to a target concept. Candidate expressions for the target concept are extracted from the target set of texts. The candidate expressions are characteristic of the target set of texts. Each image relevant to one of the candidate expressions is collected by using an image search engine. A target expression for the target concept is selected from the candidate expressions based on a comparison result of the target concept and the collected images.
    Type: Application
    Filed: October 30, 2018
    Publication date: April 30, 2020
    Inventors: Tetsuya Nasukawa, Masayasu Muraoka, Khan Md. Anwarus Salam
  • Publication number: 20200097596
    Abstract: A computer-implemented method for extracting semantic relations is disclosed. In the method, a plurality of hierarchal structures that originates from a corpus of documents is obtained. Each hierarchal structure includes a plurality of elements having respective recitations included in a corresponding document. In the method, for each predetermined relationship between ancestor and descendant elements in the hierarchal structures, a first keyword list is extracted from the ancestor element and a second keyword list is extracted from the descendant element. A statistical index is calculated for each pair of first and second keywords using the first keyword lists and the second keyword lists. The index indicates a strength of association between the first and second keywords. In the method, a candidate list of keyword pairs having semantic relationships is output using the statistical index calculated for each pair.
    Type: Application
    Filed: September 20, 2018
    Publication date: March 26, 2020
    Inventors: Shoko Suzuki, Tetsuya Nasukawa, Hiroshi Kanayama
  • Publication number: 20200026761
    Abstract: Text analysis includes determining one or more global analysis parameters based on backtranslation of a first corpus between supported languages. A new text analysis model is determined for an unsupported language based on the one or more global analysis parameters and a text analysis model for a first supported language. An input text is analyzed in the unsupported language with the new text analysis model.
    Type: Application
    Filed: July 20, 2018
    Publication date: January 23, 2020
    Inventors: Kohichi Kamijoh, Tetsuya Nasukawa, Yohei Ikawa, Masaki Ono
  • Publication number: 20190147291
    Abstract: A technique for use in analyzing multidimensional data is disclosed. In the technique, a subset of texts specified by a textual feature is selected from the multidimensional data. Each text of the subset is projected into a target image based on the corresponding spatial information to obtain a spatial distribution map for the textual feature. The similarity between the spatial distribution map for the textual feature and each property distribution map for each predefined property is determined. For the similarity exceeding a threshold, the textual feature is outputted as a notable textual feature.
    Type: Application
    Filed: November 14, 2017
    Publication date: May 16, 2019
    Inventors: Tetsuya Nasukawa, Kazuki K. Sato
  • Publication number: 20190095525
    Abstract: A computer-implemented method, a computer program product, and a computer system for extracting an expression in a text for natural language processing. The computer system reads a text to generate a plurality of substrings in which each substring includes one or more units appearing in the text. The computer system obtains an image set for each substring, using the one or more units as a query for an image search system, wherein the image set includes one or more images. The computer system calculates a deviation in the image set for each substring. The computer system selects a respective one of the plurality of the substrings as an expression to be extracted, based on the deviation and a length of each substring.
    Type: Application
    Filed: September 27, 2017
    Publication date: March 28, 2019
    Inventors: Masayasu Muraoka, Tetsuya Nasukawa
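
As a rough illustration of the candidate-selection step described in the abstract of patent 11797425 (and its application, 20230027777), the sketch below counts how many pretrained base models fail each test case and flags the cases that enough models get wrong. It is a minimal sketch under stated assumptions, not the patented method: the function name select_augmentation_candidates, the min_failures threshold, and the toy keyword-based "models" are all hypothetical, and the augmentation and retraining steps are not shown.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration: a "model" is any callable that maps
# an input text to a predicted label, and a test case pairs an input with
# its expected label.
Model = Callable[[str], str]
TestCase = Tuple[str, str]


def select_augmentation_candidates(
    models: Sequence[Model],
    test_cases: Sequence[TestCase],
    min_failures: int,
) -> List[Tuple[str, str, int]]:
    """Return (input, expected, failure_count) for each test case that at
    least `min_failures` models fail to solve.

    The threshold rule is an assumption; the abstract only says selection
    is based on the number of failing models.
    """
    candidates = []
    for text, expected in test_cases:
        # Collect one prediction per base model and count the misses.
        failures = sum(1 for model in models if model(text) != expected)
        if failures >= min_failures:
            candidates.append((text, expected, failures))
    return candidates


# Toy stand-ins for pretrained base models (hypothetical keyword rules).
MODELS: List[Model] = [
    lambda s: "pos" if "good" in s else "neg",
    lambda s: "neg" if "bad" in s else "pos",
    lambda s: "neg" if "not" in s else "pos",
]

CASES: List[TestCase] = [
    ("good movie", "pos"),
    ("not good", "neg"),
    ("bad plot", "neg"),
]

if __name__ == "__main__":
    for text, expected, failures in select_augmentation_candidates(MODELS, CASES, 2):
        print(f"{text!r} (expected {expected}): failed by {failures} models")
```

With these toy inputs and a threshold of 2, only "not good" is flagged, since two of the three stand-in models misclassify it. In a fuller pipeline, the flagged cases would then be augmented with additional data and used to retrain some of the base models, as the abstract describes.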