Patents by Inventor Daisuke Takuma

Daisuke Takuma has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

PATTERN MATCHING BASED CHARACTER STRING RETRIEVAL

Publication number: 20170053039

Abstract: Embodiments relate to generating a retrieval condition for retrieving a target character string from texts by pattern matching. An aspect includes dividing a first text into words. Another aspect includes generating a converted character string by performing at least one of appending at least one character in at least either one of previous and subsequent positions of the target character string. Another aspect includes replacing at least one character of the target character string. Another aspect includes generating the retrieval condition for retrieval candidates in the words of the first text, the retrieval condition comprising determining that a retrieval candidate matches the target character string and does not match the converted character string based on a ratio of a part of the retrieval candidate which matches the converted character string and corresponds to the target character string is less than or equal to a reference frequency.

Type: Application

Filed: November 9, 2016

Publication date: February 23, 2017

Inventors: Emiko Takeuchi, Daisuke Takuma, Hirobumi Toyoshima

Detecting dangerous expressions based on a theme

Patent number: 9575959

Abstract: Embodiments relate to a dangerous expression based on a particular theme. An aspect includes acquiring, by an electronic apparatus, from text data for learning, a subset of the text data associated with the particular theme and with particular time period information. Another aspect includes extracting text data containing negative information from the acquired subset of the text data. Another aspect includes extracting a word or phrase having a high correlation with the extracted text data or a word or phrase having a high appearance frequency in the extracted text data from the extracted text data. Yet another aspect includes determining that the extracted word or phrase is the dangerous expression based on the particular theme.
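The steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the record layout (`theme`, `date`, `text`), the `NEGATIVE_CUES` word list, and the `min_count` threshold are all hypothetical, and simple word frequency stands in for the "high appearance frequency" criterion.

```python
from collections import Counter
from datetime import date

# Hypothetical cue words used to decide that a text carries negative information.
NEGATIVE_CUES = {"recall", "injury", "complaint"}


def dangerous_expressions(records, theme, start, end, min_count=2):
    """Return words that appear often in negative texts for one theme and period."""
    # Step 1: keep only learning texts for the given theme and time period.
    subset = [r for r in records
              if r["theme"] == theme and start <= r["date"] <= end]
    # Step 2: keep texts that contain at least one negative cue word.
    negative = [r for r in subset
                if NEGATIVE_CUES & set(r["text"].lower().split())]
    # Step 3: count, per text, the non-cue words that occur in the negative texts.
    counts = Counter(
        word
        for r in negative
        for word in set(r["text"].lower().split())
        if word not in NEGATIVE_CUES
    )
    # Step 4: words at or above the threshold are treated as dangerous expressions.
    return sorted(word for word, n in counts.items() if n >= min_count)
```

Called with a handful of dated records for one theme, the function returns the words shared by the negative texts, which is the role the abstract assigns to the "dangerous expression".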

Type: Grant

Filed: August 15, 2014

Date of Patent: February 21, 2017

Assignee: International Business Machines Corporation

Inventors: Emiko Takeuchi, Daisuke Takuma, Hirobumi Toyoshima

TEXT PROCESSING METHOD, SYSTEM AND COMPUTER PROGRAM

Publication number: 20160357852

Abstract: A method includes hierarchically identifying occurrences of some of the words in the set of sentences; creating a first index for each of some of the words based on the upper hierarchy of occurrences identified for each word; receiving input of a queried word; hierarchically identifying occurrences of the queried word in the set of sentences; creating a second index based on the upper hierarchy of occurrences identified for the queried word; comparing the first index and the second index to calculate an estimated value for the number of occurrences of a word in the neighborhood of the queried word; and calculating the actual value of the number of occurrences of a word in the neighborhood of the queried word based on an upper hierarchy and lower hierarchy of the occurrences on condition that the estimated value is equal to or greater than a predetermined number.
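One way to read the estimate-then-verify idea in this abstract is sketched below in Python. The sketch rests on assumptions not stated in the abstract: the upper hierarchy is modelled as fixed-size blocks of sentences, the lower hierarchy as individual sentences, and "neighborhood" as "same sentence". The names `build_occurrences`, `neighbor_counts`, and `block_size` are hypothetical.

```python
from collections import defaultdict


def build_occurrences(sentences, block_size):
    """Record where each word occurs at two levels: blocks (upper), sentences (lower)."""
    upper = defaultdict(set)  # word -> ids of blocks containing it
    lower = defaultdict(set)  # word -> ids of sentences containing it
    for sid, sentence in enumerate(sentences):
        for word in set(sentence.lower().split()):
            lower[word].add(sid)
            upper[word].add(sid // block_size)
    return upper, lower


def neighbor_counts(sentences, query, min_count, block_size=2):
    """Exact co-occurrence counts, computed only for words whose estimate is high enough."""
    upper, lower = build_occurrences(sentences, block_size)
    q_upper = upper.get(query, set())
    q_lower = lower.get(query, set())
    results = {}
    for word, blocks in upper.items():
        if word == query:
            continue
        # Cheap estimate from the upper level only: each shared block can hold
        # at most block_size shared sentences, so this is an upper bound.
        estimate = len(blocks & q_upper) * block_size
        if estimate >= min_count:
            # Exact count from the lower level, done only for surviving candidates.
            results[word] = len(lower[word] & q_lower)
    return results
```

The design point is that the coarse comparison filters out most words before any sentence-level intersection is computed; a word can pass the estimate and still have a low exact count.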

Type: Application

Filed: August 22, 2016

Publication date: December 8, 2016

Inventors: Daisuke Takuma, Hiroki Yanagisawa

Text processing method, system and computer program

Patent number: 9471548

Abstract: A method includes hierarchically identifying occurrences of some of the words in the set of sentences; creating a first index for each of some of the words based on the upper hierarchy of occurrences identified for each word; receiving input of a queried word; hierarchically identifying occurrences of the queried word in the set of sentences; creating a second index based on the upper hierarchy of occurrences identified for the queried word; comparing the first index and the second index to calculate an estimated value for the number of occurrences of a word in the neighborhood of the queried word; and calculating the actual value of the number of occurrences of a word in the neighborhood of the queried word based on an upper hierarchy and lower hierarchy of the occurrences on condition that the estimated value is equal to or greater than a predetermined number.

Type: Grant

Filed: August 8, 2013

Date of Patent: October 18, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Daisuke Takuma, Hiroki Yanagisawa

GENERATION APPARATUS, GENERATION METHOD, AND PROGRAM

Publication number: 20160124933

Abstract: Aspects of the present invention disclose a method, computer program product, and system for generating target text based on target data. The method includes one or more processors decomposing one or more portions of text into at least one corresponding keyword and at least one corresponding template. The method further includes learning a classification model associated with selecting a template based on a category of a keyword. The method further includes identifying a target keyword that is represented by target data. The method further includes selecting a target template that is used to represent the target data based on a category associated with the identified target keyword utilizing the classification model. The method further includes generating target text that represents the target data based on the selected text template based on the selected target template and the identified target keyword.

Type: Application

Filed: September 29, 2015

Publication date: May 5, 2016

Inventors: Emiko Takeuchi, Daisuke Takuma, Hirobumi Toyoshima

CALCULATING CORRELATIONS BETWEEN ANNOTATIONS

Publication number: 20150293907

Abstract: An apparatus for calculating a correlation between annotations includes a first obtaining unit configured to provide an annotator with a first data group capable of being evaluated to determine whether or not to attach annotations thereto, and obtaining a plurality of first confidence levels indicating certainty of the annotations in the first data group, the annotator outputting confidence levels indicating certainty of annotations to be attached to data when the data is given; a second obtaining unit configured to provide the annotator with a second data group used to calculate a correlation between the plurality of annotations, and thereby obtaining a plurality of second confidence levels indicating the certainty of the annotations in the second data group; and a computing unit configured to compute an estimated value of the correlation between the plurality of annotations based on the plurality of first and second confidence levels.

Type: Application

Filed: June 24, 2015

Publication date: October 15, 2015

Inventors: Yuki Makino, Takuma Murakami, Daisuke Takuma

CALCULATING CORRELATIONS BETWEEN ANNOTATIONS

Publication number: 20150278312

Abstract: An apparatus for calculating a correlation between annotations includes a first obtaining unit configured to provide an annotator with a first data group capable of being evaluated to determine whether or not to attach annotations thereto, and obtaining a plurality of first confidence levels indicating certainty of the annotations in the first data group, the annotator outputting confidence levels indicating certainty of annotations to be attached to data when the data is given; a second obtaining unit configured to provide the annotator with a second data group used to calculate a correlation between the plurality of annotations, and thereby obtaining a plurality of second confidence levels indicating the certainty of the annotations in the second data group; and a computing unit configured to compute an estimated value of the correlation between the plurality of annotations based on the plurality of first and second confidence levels.

Type: Application

Filed: March 16, 2015

Publication date: October 1, 2015

Inventors: Yuki Makino, Takuma Murakami, Daisuke Takuma

PATTERN MATCHING BASED CHARACTER STRING RETRIEVAL

Publication number: 20150242537

Abstract: Embodiments relate to generating a retrieval condition for retrieving a target character string from texts by pattern matching. An aspect includes dividing a first text into words. Another aspect includes generating a converted character string by performing at least one of appending at least one character in at least either one of previous and subsequent positions of the target character string. Another aspect includes replacing at least one character of the target character string. Another aspect includes generating the retrieval condition for retrieval candidates in the words of the first text, the retrieval condition comprising determining that a retrieval candidate matches the target character string and does not match the converted character string based on a ratio of a part of the retrieval candidate which matches the converted character string and corresponds to the target character string is less than or equal to a reference frequency.

Type: Application

Filed: February 24, 2015

Publication date: August 27, 2015

Inventors: Emiko Takeuchi, Daisuke Takuma, Hirobumi Toyoshima

CATEGORIZING KEYWORDS

Publication number: 20150227620

Abstract: A keyword to be categorized is received. A category dictionary including categories having associated registered keywords, and a text corpus are received. Registered keywords are identified in the category dictionary having a degree of similarity to the keyword to be categorized that is equal to or greater than a predetermined value, and the categories associated with the identified registered keywords are extracted. Registered keywords are identified that are co-occurring in the text corpus with the keyword to be categorized, and the categories associated with the identified co-occurring registered keywords are extracted. A degree of importance is determined for each extracted category based on a function of the identified registered keywords in the category dictionary and/or a function of the identified co-occurring registered keywords.
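The two evidence sources named in this abstract (similar registered keywords and co-occurring registered keywords) can be combined as in the Python sketch below. It is illustrative only: `difflib.SequenceMatcher` is swapped in as the similarity measure, the importance function is a plain count of hits, and `categorize` and `min_similarity` are hypothetical names; the abstract does not specify any of these choices.

```python
from collections import Counter
from difflib import SequenceMatcher


def categorize(keyword, category_dict, corpus, min_similarity=0.6):
    """Rank categories for `keyword` by similarity hits plus co-occurrence hits."""
    scores = Counter()
    # Evidence 1: registered keywords whose string similarity meets the threshold.
    for category, registered in category_dict.items():
        for reg in registered:
            if SequenceMatcher(None, keyword, reg).ratio() >= min_similarity:
                scores[category] += 1
    # Evidence 2: registered keywords that share a text with the keyword.
    for text in corpus:
        tokens = set(text.lower().split())
        if keyword not in tokens:
            continue
        for category, registered in category_dict.items():
            hits = len(tokens & set(registered))
            if hits:
                scores[category] += hits
    # Importance here is simply the total number of hits per category.
    return scores.most_common()
```

For example, with a dictionary mapping `"fruit"` to `["apple", "banana"]`, the keyword `"pineapple"` gains one hit from its similarity to `"apple"` and another from any text in which it appears alongside `"banana"`.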

Type: Application

Filed: January 30, 2015

Publication date: August 13, 2015

Inventors: Emiko Takeuchi, Daisuke Takuma, Hirobumi Toyoshima

DETECTING DANGEROUS EXPRESSIONS BASED ON A THEME

Publication number: 20150100306

Abstract: Embodiments relate to a dangerous expression based on a particular theme. An aspect includes acquiring, by an electronic apparatus, from text data for learning, a subset of the text data associated with the particular theme and with particular time period information. Another aspect includes extracting text data containing negative information from the acquired subset of the text data. Another aspect includes extracting a word or phrase having a high correlation with the extracted text data or a word or phrase having a high appearance frequency in the extracted text data from the extracted text data. Yet another aspect includes determining that the extracted word or phrase is the dangerous expression based on the particular theme.

Type: Application

Filed: August 15, 2014

Publication date: April 9, 2015

Inventors: Emiko Takeuchi, Daisuke Takuma, Hirobumi Toyoshima

GRASPING A BIAS OF INFORMATION FROM AN INTERNET MEDIUM FOR SUPPORTING A SURVEY

Publication number: 20150039535

Abstract: In one embodiment of the present invention, an apparatus may be used for supporting a survey based on information in an Internet medium. The apparatus comprises: a first acquisition hardware unit, wherein the first acquisition hardware unit acquires first evaluation information representing a degree of evaluation acquired by a survey of a real society pertaining to a prescribed target; a second acquisition hardware unit, wherein the second acquisition hardware unit acquires second evaluation information representing a degree of evaluation in the Internet medium pertaining to the prescribed target; and an estimator hardware unit, wherein the estimator hardware unit estimates a bias in information in the Internet medium based on a deviation of the second evaluation information from the first evaluation information.

Type: Application

Filed: July 21, 2014

Publication date: February 5, 2015

Inventors: Emiko Takeuchi, Daisuke Takuma, Hirobumi Toyoshima

ANALYZING A DOCUMENT THAT INCLUDES A TEXT-BASED VISUAL REPRESENTATION

Publication number: 20150026553

Abstract: A hardware device analyzes a document that includes a text-based visual representation. A correspondence information hardware storage device holds known representations of graphical images as text-based visual representations. The graphical images depict portraits of physical objects. The text-based visual representations are associated with information that each describe one of the physical objects. An identification hardware device identifies a text-based visual representation within a document. The identification hardware device matches the text-based visual representation within the document to one or more of the text-based visual representations stored in the correspondence information hardware storage device. An editing hardware device retrieves information from the correspondence information hardware storage device that is identified, by the identification hardware device, as describing a text-based visual representation component within the document.

Type: Application

Filed: July 14, 2014

Publication date: January 22, 2015

Inventors: Emiko Takeuchi, Daisuke Takuma, Hirobumi Toyoshima

TEXT PROCESSING METHOD, SYSTEM AND COMPUTER PROGRAM

Publication number: 20140046654

Abstract: A method includes hierarchically identifying occurrences of some of the words in the set of sentences; creating a first index for each of some of the words based on the upper hierarchy of occurrences identified for each word; receiving input of a queried word; hierarchically identifying occurrences of the queried word in the set of sentences; creating a second index based on the upper hierarchy of occurrences identified for the queried word; comparing the first index and the second index to calculate an estimated value for the number of occurrences of a word in the neighborhood of the queried word; and calculating the actual value of the number of occurrences of a word in the neighborhood of the queried word based on an upper hierarchy and lower hierarchy of the occurrences on condition that the estimated value is equal to or greater than a predetermined number.

Type: Application

Filed: September 9, 2013

Publication date: February 13, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Daisuke Takuma, Hiroki Yanagisawa

TEXT PROCESSING METHOD, SYSTEM AND COMPUTER PROGRAM

Publication number: 20140046953

Abstract: A method includes hierarchically identifying occurrences of some of the words in the set of sentences; creating a first index for each of some of the words based on the upper hierarchy of occurrences identified for each word; receiving input of a queried word; hierarchically identifying occurrences of the queried word in the set of sentences; creating a second index based on the upper hierarchy of occurrences identified for the queried word; comparing the first index and the second index to calculate an estimated value for the number of occurrences of a word in the neighborhood of the queried word; and calculating the actual value of the number of occurrences of a word in the neighborhood of the queried word based on an upper hierarchy and lower hierarchy of the occurrences on condition that the estimated value is equal to or greater than a predetermined number.

Type: Application

Filed: August 8, 2013

Publication date: February 13, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Daisuke Takuma, Hiroki Yanagisawa

METHOD FOR CLASSIFYING PIECES OF TEXT ON BASIS OF EVALUATION POLARITY, COMPUTER PROGRAM PRODUCT, AND COMPUTER

Publication number: 20130289978

Abstract: A computer-implemented method, program product, and system, for extracting pieces of text from a plurality of pieces of text. The method includes: primarily evaluating a measure of positive expressions and a measure of negative expressions included in each of pieces of text; secondarily evaluating each of the pieces of text on the basis of a plurality of evaluation functions, where certain evaluation functions among the plurality of evaluation functions include, as variables, the measure of positive expressions and the measure of negative expressions; and extracting a piece of text having an evaluation result with a higher rating in preference to a piece of text having an evaluation result with a lower rating, where the individual evaluation results are based on the same evaluation function among the plurality of evaluation functions.

Type: Application

Filed: April 12, 2013

Publication date: October 31, 2013

Applicant: International Business Machines Corporation

Inventors: Hiroshi Kanayama, Takuma Murakami, Daisuke Takuma

Creating a terms dictionary with named entities or terminologies included in text data

Patent number: 8538745

Abstract: A computer system of an embodiment of the disclosure can be used to automatically create or populate a terms dictionary using a set of computing units. A morphological analysis unit can acquire token sequence data by performing morphological analysis for the text data. A category distinguishing unit can distinguish tokens of the token sequence data by using a category dictionary to extract uncategorized words. An uncategorized-word comparing unit can compare each of the extracted uncategorized words with an uncategorized-word comparison rule to extract an uncategorized word matching the uncategorized-word comparison rule as a registration candidate word. A token-sequence comparing unit can compare a token sequence of the token sequence data with a token-sequence comparison rule to extract a token sequence matching the token-sequence comparison rule as registration candidate words. A permission unit can permit a user to select whether to register the registration candidate words in the category dictionary.

Type: Grant

Filed: January 4, 2010

Date of Patent: September 17, 2013

Assignee: International Business Machines Corporation

Inventors: Hiroki Oya, Daisuke Takuma, Hirobumi Toyoshima

System, method and program for creating index for database

Patent number: 8190613

Abstract: A computer implemented method for creating indices for a database having a plurality of documents each being associated with one or more keywords. The method includes the steps of: dividing the database into a plurality of subsets; separating the keywords into a plurality of keyword groups based upon modulo G of the hash value of the keyword for each subset; reading each document of each subset to create a first sub-index and writing same to a storage device of the computer; reading the first sub-indices to merge the first sub-indices into a second sub-index for each keyword group to write same to the storage device; and reading the second sub-indices from the storage device to merge the second sub-indices into an index for the database and write same on the storage device. A program and a system for creating indices are also provided.
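The three-pass structure described in this abstract can be sketched in Python as follows. This is a simplified in-memory illustration, not the patented method: plain dictionaries stand in for the sub-indices written to the storage device, `zlib.crc32` is an assumed hash function, and `build_index`, `subset_size`, and `num_groups` (the abstract's G) are hypothetical names.

```python
import zlib
from collections import defaultdict


def keyword_group(keyword, num_groups):
    """Assign a keyword to a group by its hash value modulo G."""
    return zlib.crc32(keyword.encode("utf-8")) % num_groups


def build_index(documents, subset_size, num_groups):
    """documents: dict mapping doc_id -> iterable of keywords."""
    doc_ids = sorted(documents)
    subsets = [doc_ids[i:i + subset_size]
               for i in range(0, len(doc_ids), subset_size)]

    # Pass 1: for each subset, build one first-level sub-index per keyword group.
    first = []
    for subset in subsets:
        per_group = [defaultdict(list) for _ in range(num_groups)]
        for doc_id in subset:
            for kw in set(documents[doc_id]):
                per_group[keyword_group(kw, num_groups)][kw].append(doc_id)
        first.append(per_group)

    # Pass 2: within each keyword group, merge the first-level sub-indices
    # of all subsets into one second-level sub-index.
    second = []
    for g in range(num_groups):
        merged = defaultdict(list)
        for per_group in first:
            for kw, postings in per_group[g].items():
                merged[kw].extend(postings)
        second.append(merged)

    # Pass 3: groups hold disjoint keywords, so the final merge is a union.
    index = {}
    for merged in second:
        index.update(merged)
    return index
```

Because every keyword always hashes to the same group, each group can be merged independently in pass 2, which is what would let a disk-based version bound the working set of each merge.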

Type: Grant

Filed: June 3, 2008

Date of Patent: May 29, 2012

Assignee: International Business Machines Corporation

Inventors: Daisuke Takuma, Issei Yoshida

Information search system, method and program

Patent number: 8171052

Abstract: A system, method and computer program product for searching at high speed for documents matching a dependency pattern from document data containing a large volume of text documents. The system includes a storage device for storing, index storage means for storing in the storage device occurrence information, receiving means for receiving information, reading means for reading from the index storage means, and searching means for comparing occurrence information. The method and computer program product include the steps of storing in the storage device, receiving information, reading from the storage device, comparing occurrence information, and searching. The computer program product includes instructions to execute the steps of storing each of the plurality of document data in the storage device, storing in the storage device occurrence information.

Type: Grant

Filed: March 3, 2009

Date of Patent: May 1, 2012

Assignee: International Business Machines Corporation

Inventors: Daisuke Takuma, Yuta Tsuboi

Methods and apparatus for optimizing keyword data analysis

Patent number: 8001166

Abstract: Techniques for analyzing keyword data for quality management purposes are provided. One or more keywords are selected. Each of the one or more keywords represent a category of quality management. A keyword time series is prepared for each of the one or more selected keywords. A set of fixed form time series is prepared for each of the one or more selected keywords. The set of fixed form time series comprises one or more fixed form time series representing statistical data related to the one or more selected keywords. One or more correction sets comprising one or more correction parameters are obtained. Each of the one or more correction parameters correspond to one of the one or more fixed form time series within each set of fixed form time series. A set of corrected time series is generated for each of the one or more correction sets.

Type: Grant

Filed: March 28, 2008

Date of Patent: August 16, 2011

Assignee: International Business Machines Corporation

Inventors: Hirobumi Toyoshima, Daisuke Takuma, Hiroki Oya

System of effectively searching text for keyword, and method thereof

Patent number: 7945552

Abstract: A system of the present invention stores: a first index which designates lists of keywords contained in texts from identifications of the respective texts; a second index which designates lists of texts containing keywords from identifications of the respective keywords; and the number of texts containing the respective keywords. Then, upon receiving an input of a text search condition, the system calculates an estimation of search time by the first index and an estimation of search time by the second index, and determines which one of the first and second indexes makes a search faster. Then, by using the index which has been determined to make the search faster, the system searches for keywords which appear in texts satisfying the text search condition with higher frequency.
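The choice between the two indexes can be sketched in Python as below. The abstract does not say how the search-time estimates are computed, so the cost model here is purely an assumption: a forward lookup is charged a fixed cost per matching text, and an inverted scan a unit cost per posting. The names `build_indexes`, `frequent_keywords`, and both cost constants are hypothetical, and the search condition is assumed to be already resolved to a set of matching text ids.

```python
from collections import Counter

# Hypothetical cost weights, not taken from the patent.
FORWARD_COST_PER_TEXT = 5.0
INVERTED_COST_PER_POSTING = 1.0


def build_indexes(texts):
    """texts: dict mapping text_id -> list of keywords."""
    forward = {tid: set(kws) for tid, kws in texts.items()}  # text -> keywords
    inverted = {}                                            # keyword -> texts
    for tid, kws in forward.items():
        for kw in kws:
            inverted.setdefault(kw, set()).add(tid)
    doc_freq = {kw: len(tids) for kw, tids in inverted.items()}
    return forward, inverted, doc_freq


def frequent_keywords(forward, inverted, doc_freq, matching, top_n=2):
    """Count keywords in the matching texts using whichever index is estimated cheaper."""
    est_forward = FORWARD_COST_PER_TEXT * len(matching)
    est_inverted = INVERTED_COST_PER_POSTING * sum(doc_freq.values())
    if est_forward <= est_inverted:
        chosen = "forward"
        counts = Counter(kw for tid in matching for kw in forward[tid])
    else:
        chosen = "inverted"
        counts = Counter({kw: len(tids & matching)
                          for kw, tids in inverted.items()})
    # Rank by descending count, breaking ties alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return chosen, [(kw, n) for kw, n in ranked[:top_n] if n > 0]
```

Both branches produce the same counts; only the access pattern differs. Under this cost model a narrow search condition favors the forward index and a broad one favors the inverted index.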

Type: Grant

Filed: March 26, 2008

Date of Patent: May 17, 2011

Assignee: International Business Machines Corporation

Inventors: Daisuke Takuma, Issei Yoshida, Yuta Tsuboi
