Patents by Inventor Daisuke Takuma

Daisuke Takuma has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7930320
    Abstract: An apparatus, a method, and a program for visualizing a Boolean expression so that what is added to or excluded from the conditions is readily recognized. A Boolean expression to be visualized is input in the form of a binary tree in which a leaf node represents an operand in the Boolean expression and a node other than a leaf node represents an operator in the Boolean expression. The input binary tree is transformed into a two-dimensional nested representation composed of a plurality of regions, and a pictorial representation for visualization is drawn on the basis of the nested representation and is displayed. When the Boolean expression is provided as a string expression, the string expression is transformed into a binary tree.
    Type: Grant
    Filed: October 12, 2007
    Date of Patent: April 19, 2011
    Assignee: International Business Machines Corporation
    Inventors: Kinya Kuriyama, Mariko Nagai, Daisuke Takuma
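The string-to-tree step in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: the grammar (bare-word operands, `AND`, `OR`, parentheses), the `Node` class, and the `to_nested` helper are all assumptions made for the example, and nested tuples merely stand in for the two-dimensional nested regions.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Binary-tree node: leaves hold operands, inner nodes hold operators."""
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def tokenize(expr: str) -> List[str]:
    # Assumed syntax: bare-word operands, AND / OR, and parentheses.
    return re.findall(r"\(|\)|\w+", expr)


def parse(expr: str) -> Node:
    """Recursive-descent parser; AND binds tighter than OR."""
    tokens = tokenize(expr)
    pos = 0

    def parse_or() -> Node:
        nonlocal pos
        node = parse_and()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            node = Node("OR", node, parse_and())
        return node

    def parse_and() -> Node:
        nonlocal pos
        node = parse_atom()
        while pos < len(tokens) and tokens[pos] == "AND":
            pos += 1
            node = Node("AND", node, parse_atom())
        return node

    def parse_atom() -> Node:
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return node
        if tok in (")", "AND", "OR"):
            raise ValueError(f"unexpected token {tok!r}")
        return Node(tok)  # leaf node: an operand

    tree = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return tree


def to_nested(node: Node):
    """Render the tree as nested tuples (a stand-in for nested regions)."""
    if node.left is None and node.right is None:
        return node.value
    return (node.value, to_nested(node.left), to_nested(node.right))
```

For example, `to_nested(parse("a AND (b OR c)"))` yields a tuple whose outer element is the `AND` operator and whose inner tuple is the `OR` subtree, mirroring how one region would enclose another.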
  • Patent number: 7917350
    Abstract: Calculates a word n-gram probability with high accuracy in a situation where a first corpus, which is a relatively small corpus containing manually segmented word information, and a second corpus, which is a relatively large corpus, are given as a training corpus, that is, storage containing vast quantities of sample sentences. Vocabulary including contextual information is expanded from words occurring in the first corpus of relatively small size to words occurring in the second corpus of relatively large size by using a word n-gram probability estimated from an unknown word model and the raw corpus. The first corpus (word-segmented) is used for calculating n-grams and the probability that the boundary between two adjacent characters will be the boundary of two words (segmentation probability). The second corpus (word-unsegmented), in which probabilistic word boundaries are assigned based on information in the first corpus (word-segmented), is used for calculating word n-grams.
    Type: Grant
    Filed: May 26, 2008
    Date of Patent: March 29, 2011
    Assignee: International Business Machines Corporation
    Inventors: Shinsuke Mori, Daisuke Takuma
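The segmentation probability described in the abstract above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the relative-frequency estimate per adjacent character pair, the function names `boundary_probs` and `expected_count`, and the `default` value used for unseen pairs. The abstract does not specify how the probability is estimated, and the unknown word model is left out entirely.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def boundary_probs(segmented: List[List[str]]) -> Dict[Tuple[str, str], float]:
    """Estimate P(word boundary) for each adjacent character pair.

    Assumption: a plain relative-frequency estimate over the segmented corpus.
    """
    boundary = defaultdict(int)
    total = defaultdict(int)
    for sentence in segmented:
        chars, cuts = [], set()
        for word in sentence:
            chars.extend(word)
            cuts.add(len(chars))  # a boundary follows the last char of a word
        for i in range(1, len(chars)):
            pair = (chars[i - 1], chars[i])
            total[pair] += 1
            if i in cuts:
                boundary[pair] += 1
    return {pair: boundary[pair] / total[pair] for pair in total}


def expected_count(word: str, raw: str,
                   probs: Dict[Tuple[str, str], float],
                   default: float = 0.5) -> float:
    """Expected number of times `word` occurs as a word in unsegmented text."""

    def p(i: int) -> float:
        # Probability of a boundary between raw[i - 1] and raw[i];
        # the start and end of the text are always boundaries.
        if i == 0 or i == len(raw):
            return 1.0
        return probs.get((raw[i - 1], raw[i]), default)

    n, count = len(word), 0.0
    for start in range(len(raw) - n + 1):
        if raw[start:start + n] != word:
            continue
        # Boundary on both sides, and no boundary inside the candidate word.
        prob = p(start) * p(start + n)
        for i in range(start + 1, start + n):
            prob *= 1.0 - p(i)
        count += prob
    return count
```

Fractional counts of this kind, accumulated over the large unsegmented corpus, are what a word n-gram model could then be estimated from.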
  • Publication number: 20100174528
    Abstract: A computer system of an embodiment of the disclosure can be used to automatically create or populate a terms dictionary using a set of computing units. A morphological analysis unit can acquire token sequence data by performing morphological analysis on text data. A category distinguishing unit can distinguish tokens of the token sequence data by using a category dictionary to extract uncategorized words. An uncategorized-word comparing unit can compare each of the extracted uncategorized words with an uncategorized-word comparison rule to extract an uncategorized word matching the uncategorized-word comparison rule as a registration candidate word. A token-sequence comparing unit can compare a token sequence of the token sequence data with a token-sequence comparison rule to extract a token sequence matching the token-sequence comparison rule as registration candidate words. A permission unit can permit a user to select whether to register the registration candidate words in the category dictionary.
    Type: Application
    Filed: January 4, 2010
    Publication date: July 8, 2010
    Applicant: International Business Machines Corporation
    Inventors: Hiroki Oya, Daisuke Takuma, Hirobumi Toyoshima
  • Publication number: 20090248628
    Abstract: Techniques for analyzing keyword data for quality management purposes are provided. One or more keywords are selected. Each of the one or more keywords represents a category of quality management. A keyword time series is prepared for each of the one or more selected keywords. A set of fixed form time series is prepared for each of the one or more selected keywords. The set of fixed form time series comprises one or more fixed form time series representing statistical data related to the one or more selected keywords. One or more correction sets comprising one or more correction parameters are obtained. Each of the one or more correction parameters corresponds to one of the one or more fixed form time series within each set of fixed form time series. A set of corrected time series is generated for each of the one or more correction sets.
    Type: Application
    Filed: March 28, 2008
    Publication date: October 1, 2009
    Inventors: Hirobumi Toyoshima, Daisuke Takuma, Hiroki Oya
  • Publication number: 20090222407
    Abstract: A system, method and computer program product for searching at high speed for documents matching a dependency pattern from document data containing a large volume of text documents. The system includes a storage device for storing, index storage means for storing in the storage device occurrence information, receiving means for receiving information, reading means for reading from the index storage means, and searching means for comparing occurrence information. The method and computer program product include the steps of storing in the storage device, receiving information, reading from the storage device, comparing occurrence information, and searching. The computer program product includes instructions to execute the steps of storing each of the plurality of document data in the storage device, storing in the storage device occurrence information.
    Type: Application
    Filed: March 3, 2009
    Publication date: September 3, 2009
    Inventors: Daisuke Takuma, Yuta Tsuboi
  • Patent number: 7584184
    Abstract: A system of the present invention stores: a first index which designates lists of keywords contained in texts from identifications of the respective texts; a second index which designates lists of texts containing keywords from identifications of the respective keywords; and the number of texts containing the respective keywords. Then, upon receiving an input of a text search condition, the system calculates an estimation of search time by the first index and an estimation of search time by the second index, and determines which one of the first and second indexes makes a search faster. Then, by using the index which has been determined to make the search faster, the system searches for keywords which appear in texts satisfying the text search condition with higher frequency.
    Type: Grant
    Filed: November 2, 2006
    Date of Patent: September 1, 2009
    Assignee: International Business Machines Corporation
    Inventors: Daisuke Takuma, Issei Yoshida, Yuta Tsuboi
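The index-selection idea in the abstract above can be sketched roughly as follows. The cost model is an assumption invented for this example (a fixed `SEEK_COST` per forward-index lookup plus one unit per list entry read); the abstract says only that search time is estimated for each index. The names `estimate_costs` and `choose_and_count` are likewise hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

SEEK_COST = 2.0  # hypothetical per-text overhead of a forward-index lookup


def estimate_costs(matching: Set[int],
                   forward: Dict[int, List[str]],
                   doc_freq: Dict[str, int]) -> Tuple[float, float]:
    """Return (forward-index cost, inverted-index cost) under the toy model."""
    # First index (text -> keywords): one lookup per matching text.
    cost_forward = sum(SEEK_COST + len(forward[t]) for t in matching)
    # Second index (keyword -> texts): scan every posting list; its length
    # is the stored number of texts containing the keyword.
    cost_inverted = float(sum(doc_freq.values()))
    return cost_forward, cost_inverted


def choose_and_count(matching: Set[int],
                     forward: Dict[int, List[str]],
                     inverted: Dict[str, List[int]],
                     doc_freq: Dict[str, int]) -> Tuple[str, Dict[str, int]]:
    """Count keywords in the matching texts using the cheaper index."""
    cost_forward, cost_inverted = estimate_costs(matching, forward, doc_freq)
    counts: Counter = Counter()
    if cost_forward <= cost_inverted:
        used = "forward"
        for t in matching:
            counts.update(forward[t])
    else:
        used = "inverted"
        for keyword, postings in inverted.items():
            hits = sum(1 for t in postings if t in matching)
            if hits:
                counts[keyword] = hits
    return used, dict(counts)
```

Under this model a narrow search condition favors the forward index, while a condition matching most of the collection favors the inverted index; both paths return the same keyword counts.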
  • Patent number: 7571383
    Abstract: Enables retrieving document data appropriately reflecting content of a retrieval statement and detecting problems in sequentially added document data.
    Type: Grant
    Filed: July 13, 2005
    Date of Patent: August 4, 2009
    Assignee: International Business Machines Corporation
    Inventors: Hiroshi Nomiyama, Daisuke Takuma
  • Publication number: 20090030892
    Abstract: A system of the present invention stores: a first index which designates lists of keywords contained in texts from identifications of the respective texts; a second index which designates lists of texts containing keywords from identifications of the respective keywords; and the number of texts containing the respective keywords. Then, upon receiving an input of a text search condition, the system calculates an estimation of search time by the first index and an estimation of search time by the second index, and determines which one of the first and second indexes makes a search faster. Then, by using the index which has been determined to make the search faster, the system searches for keywords which appear in texts satisfying the text search condition with higher frequency.
    Type: Application
    Filed: March 26, 2008
    Publication date: January 29, 2009
    Applicant: International Business Machines Corporation
    Inventors: Daisuke Takuma, Issei Yoshida, Yuta Tsuboi
  • Publication number: 20080319987
    Abstract: An entire document set is decomposed into disjoint subsets. Next, the set of keywords appearing in each of the subsets is categorized into groups on the basis of the remainder resulting from dividing a hash value of each keyword by a certain fixed integer value, and index files for the respective groups are created. Among the index files prepared for the respective subsets in this manner, those having the same group number are merged, creating integrated index files corresponding to the individual group numbers. There are, however, as many such index files as there are group numbers, and they do not yet form an index corresponding to the entire document set. These index files are therefore merged into one, creating an index file corresponding to the entire document set.
    Type: Application
    Filed: June 3, 2008
    Publication date: December 25, 2008
    Inventors: Daisuke Takuma, Issei Yoshida
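The two-stage merge described in the abstract above can be sketched in memory as follows. This is an illustration under assumptions: `zlib.crc32` stands in for the unspecified hash function, `NUM_GROUPS = 4` is an arbitrary choice for the fixed integer, dictionaries stand in for index files, and keywords are taken to be whitespace-separated tokens.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List

NUM_GROUPS = 4  # the fixed integer divisor; 4 is an arbitrary choice

Index = Dict[str, List[int]]  # keyword -> list of document ids


def group_of(keyword: str) -> int:
    # crc32 is an assumed stand-in for the unspecified hash function.
    return zlib.crc32(keyword.encode("utf-8")) % NUM_GROUPS


def index_subset(docs: Dict[int, str]) -> List[Index]:
    """Build one small index per hash group for a single document subset."""
    groups: List[Index] = [defaultdict(list) for _ in range(NUM_GROUPS)]
    for doc_id in sorted(docs):
        for keyword in set(docs[doc_id].split()):
            groups[group_of(keyword)][keyword].append(doc_id)
    return groups


def merge(indexes: Iterable[Index]) -> Index:
    """Merge indexes by concatenating and sorting posting lists."""
    merged = defaultdict(list)
    for index in indexes:
        for keyword, postings in index.items():
            merged[keyword].extend(postings)
    return {keyword: sorted(p) for keyword, p in merged.items()}


def build_index(subsets: List[Dict[int, str]]) -> Index:
    per_subset = [index_subset(s) for s in subsets]
    # Stage 1: merge index files that share a group number across subsets.
    per_group = [merge(ps[g] for ps in per_subset) for g in range(NUM_GROUPS)]
    # Stage 2: merge the per-group files into one index for the whole set.
    return merge(per_group)
```

Because a keyword always hashes to the same group, each stage-1 merge touches only a fraction of the vocabulary, and the groups have disjoint keyword sets when they are combined in stage 2.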
  • Publication number: 20080228463
    Abstract: Calculates a word n-gram probability with high accuracy in a situation where a first corpus, which is a relatively small corpus containing manually segmented word information, and a second corpus, which is a relatively large corpus, are given as a training corpus, that is, storage containing vast quantities of sample sentences. Vocabulary including contextual information is expanded from words occurring in the first corpus of relatively small size to words occurring in the second corpus of relatively large size by using a word n-gram probability estimated from an unknown word model and the raw corpus. The first corpus (word-segmented) is used for calculating n-grams and the probability that the boundary between two adjacent characters will be the boundary of two words (segmentation probability). The second corpus (word-unsegmented), in which probabilistic word boundaries are assigned based on information in the first corpus (word-segmented), is used for calculating word n-grams.
    Type: Application
    Filed: May 26, 2008
    Publication date: September 18, 2008
    Inventors: Shinsuke Mori, Daisuke Takuma
  • Publication number: 20080126160
    Abstract: A device for evaluating a trend analysis system comprises: an allowable value input unit for receiving allowable values of false positives and allowable values of false negatives made by the trend analysis system; and an accuracy computation unit for computing an accuracy of the trend analysis system as a function of the allowable values of false positives and the allowable values of false negatives.
    Type: Application
    Filed: November 29, 2007
    Publication date: May 29, 2008
    Inventors: Hironori Takuechi, Daisuke Takuma
  • Publication number: 20080104088
    Abstract: An apparatus, a method, and a program for visualizing a Boolean expression so that what is added to or excluded from the conditions is readily recognized. A Boolean expression to be visualized is input in the form of a binary tree in which a leaf node represents an operand in the Boolean expression and a node other than a leaf node represents an operator in the Boolean expression. The input binary tree is transformed into a two-dimensional nested representation composed of a plurality of regions, and a pictorial representation for visualization is drawn on the basis of the nested representation and is displayed. When the Boolean expression is provided as a string expression, the string expression is transformed into a binary tree.
    Type: Application
    Filed: October 12, 2007
    Publication date: May 1, 2008
    Applicant: International Business Machines Corporation
    Inventors: Kinya Kuriyama, Mariko Nagai, Daisuke Takuma
  • Publication number: 20070157123
    Abstract: Disclosed as a first aspect is a method including the steps of: analyzing a character string in a document into partial character strings; calculating, with respect to each of the partial character strings, a score incorporating the appearance frequency of the partial character string; presenting the partial character strings and the scores to a user; determining which of the partial character strings have been selected by the user; storing the selected partial character strings as a safe partial character string list; and replacing, with predetermined replacement character strings, the partial character strings other than those in the safe partial character string list.
    Type: Application
    Filed: December 8, 2006
    Publication date: July 5, 2007
    Inventors: Yohei Ikawa, Hiroshi Kanayama, Daisuke Takuma
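The scoring and replacement steps in the abstract above can be sketched as follows. The details are assumptions made for illustration: partial character strings are taken to be word-like tokens matched by `\w+`, the score is a raw frequency count, and the user's selection is supplied as a ready-made `safe` set rather than gathered interactively. The function names are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Set


def score_substrings(text: str) -> Dict[str, int]:
    """Split text into word-like partial strings and score each by frequency."""
    return dict(Counter(re.findall(r"\w+", text)))


def mask(text: str, safe: Set[str], replacement: str = "***") -> str:
    """Replace every partial string that is not on the safe list."""
    return re.sub(
        r"\w+",
        lambda m: m.group(0) if m.group(0) in safe else replacement,
        text,
    )
```

In use, the scores would be shown to the user, who picks the strings to keep. Frequent strings are more likely to be generic vocabulary, and rare ones are more likely to be identifiers worth masking.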
  • Publication number: 20070136274
    Abstract: A system of the present invention stores: a first index which designates lists of keywords contained in texts from identifications of the respective texts; a second index which designates lists of texts containing keywords from identifications of the respective keywords; and the number of texts containing the respective keywords. Then, upon receiving an input of a text search condition, the system calculates an estimation of search time by the first index and an estimation of search time by the second index, and determines which one of the first and second indexes makes a search faster. Then, by using the index which has been determined to make the search faster, the system searches for keywords which appear in texts satisfying the text search condition with higher frequency.
    Type: Application
    Filed: November 2, 2006
    Publication date: June 14, 2007
    Inventors: Daisuke Takuma, Issei Yoshida, Yuta Tsuboi
  • Publication number: 20060015486
    Abstract: Enables retrieving document data appropriately reflecting content of a retrieval statement and detecting problems in sequentially added document data.
    Type: Application
    Filed: July 13, 2005
    Publication date: January 19, 2006
    Applicant: International Business Machines Corporation
    Inventors: Hiroshi Nomiyama, Daisuke Takuma
  • Publication number: 20060015326
    Abstract: Calculates a word n-gram probability with high accuracy in a situation where a first corpus, which is a relatively small corpus containing manually segmented word information, and a second corpus, which is a relatively large corpus, are given as a training corpus, that is, storage containing vast quantities of sample sentences. Vocabulary including contextual information is expanded from words occurring in the first corpus of relatively small size to words occurring in the second corpus of relatively large size by using a word n-gram probability estimated from an unknown word model and the raw corpus. The first corpus (word-segmented) is used for calculating n-grams and the probability that the boundary between two adjacent characters will be the boundary of two words (segmentation probability). The second corpus (word-unsegmented), in which probabilistic word boundaries are assigned based on information in the first corpus (word-segmented), is used for calculating word n-grams.
    Type: Application
    Filed: July 13, 2005
    Publication date: January 19, 2006
    Applicant: International Business Machines Corporation
    Inventors: Shinsuke Mori, Daisuke Takuma