Linguistics Patents (Class 704/1)
  • Patent number: 10565311
    Abstract: A mechanism is provided for updating a knowledge base of a sentiment analysis system, the knowledge base being operable for storing natural language terms and a score value related to each natural language term, the score value characterizing the sentiment of the natural language term. Messages comprising natural language are received. Using content of the knowledge base, a decision is made as to whether at least one message of the received messages has a positive sentiment or a negative sentiment. A term is extracted from the message that is not present in the knowledge base. Based on a frequency of occurrence of the term in the received messages and the sentiment of the messages in which the term occurs, a score value of the term is calculated, and the term and the calculated score value are stored into the knowledge base.
    Type: Grant
    Filed: February 15, 2017
    Date of Patent: February 18, 2020
    Assignee: International Business Machines Corporation
    Inventors: Michele Crudele, Antonio Perrone
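The update loop described in this abstract can be sketched roughly as follows. This is an illustrative toy, not the patented method: the function names, the whitespace tokenizer, the `min_count` cutoff, and the averaging formula for the score are all assumptions made for the example.

```python
from collections import defaultdict


def message_sentiment(tokens, kb):
    """Return +1, -1 or 0 from the summed scores of terms already in the knowledge base."""
    total = sum(kb.get(token, 0.0) for token in tokens)
    return 1 if total > 0 else -1 if total < 0 else 0


def update_knowledge_base(messages, kb, min_count=2):
    """Add unknown terms to kb, scored by the sentiment of the messages they occur in.

    The score is the mean polarity of the messages containing the term
    (an assumed formula); terms seen fewer than min_count times are ignored.
    """
    counts = defaultdict(int)
    polarity_sums = defaultdict(int)
    for text in messages:
        tokens = text.lower().split()
        polarity = message_sentiment(tokens, kb)
        if polarity == 0:
            continue  # no decision possible for this message
        for term in set(tokens):
            if term not in kb:
                counts[term] += 1
                polarity_sums[term] += polarity
    for term, count in counts.items():
        if count >= min_count:
            kb[term] = polarity_sums[term] / count
    return kb


kb = {"great": 1.0, "awful": -1.0}
messages = [
    "great phone zippy",
    "zippy and great",
    "awful laggy",
    "laggy awful screen",
]
update_knowledge_base(messages, kb)
print(kb["zippy"], kb["laggy"])
```

In this toy run, "zippy" only appears in messages judged positive and "laggy" only in messages judged negative, so they receive opposite scores, while terms seen once are left out.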
  • Patent number: 10558757
    Abstract: Disclosed aspects relate to symbol management. A set of depictogram usage information may be mined with respect to a set of depictograms. A set of language attributes for the set of depictograms may be determined based on the set of depictogram usage information. A depictogram reference object may be compiled using the set of language attributes for the set of depictograms. A set of input data which includes a subset of the set of depictograms may be analyzed. The subset of the set of depictograms may be evaluated using the depictogram reference object. A set of output data may be provided.
    Type: Grant
    Filed: March 11, 2017
    Date of Patent: February 11, 2020
    Assignee: International Business Machines Corporation
    Inventors: Si Bin Fan, Su Liu, Yu Iu Liu, Cheng Xu
  • Patent number: 10558769
    Abstract: Systems and methods for automatically generating scenarios and user interface elements representing valuations of instruments under the scenarios are described. The systems and methods use expert polling systems and machine learning rules to generate tree data storage structures representing different scenarios of macro factors for outcomes of events. Machine implemented interfaces for expert polling, presentment of scenarios, and interaction with scenarios are also provided.
    Type: Grant
    Filed: January 4, 2019
    Date of Patent: February 11, 2020
    Assignee: Goldman Sachs & Co. LLC
    Inventors: Ron Dembo, Atul Pawar, Ezra Nahum, Andrew Phillips
  • Patent number: 10552538
    Abstract: Techniques are disclosed for validating user-provided text feedback for topical relevance relative to a question asked. A form with at least a first field is received. The first field includes unstructured text content provided as feedback in response to a question. The unstructured text content of the first field is evaluated to identify an answer type. A measure of relevance of the unstructured text content relative to the question is determined based on the evaluation.
    Type: Grant
    Filed: September 24, 2015
    Date of Patent: February 4, 2020
    Assignee: International Business Machines Corporation
    Inventors: Adam T. Clark, Jeffrey K. Huebert, Aspen L. Payton, John E. Petri
  • Patent number: 10546304
    Abstract: A system and method for assessing the risk of a listing that transforms information from the listing into variables suitable for a classifier trained to score the riskiness of listings and using the score in addition to predetermined variable constraints to determine whether a listing is fraudulent.
    Type: Grant
    Filed: June 29, 2017
    Date of Patent: January 28, 2020
    Assignee: PAYPAL, INC.
    Inventors: Yael Cohen, Guy Ronen, Ran Yuchtman, Chen Kovacs
  • Patent number: 10545920
    Abstract: A method, system and computer program product for phrase substitution within chunks of substantially similar content. The method includes: retrieving from content files a first and a second content chunk which are identical above a predetermined threshold; identifying a candidate for substitution, wherein the candidate for substitution is a string of characters in the second content chunk that is not identical to a corresponding string of characters in the first content chunk; comparing the candidate for substitution with a synonym database to find a match, wherein the synonym database provides a plurality of synonym suggestions to convert the candidate for substitution in the first content chunk and the second content chunk to an identical string of characters; replacing the candidate for substitution with a reference to the identical string of characters; and storing a single copy of the identical string of characters in a common repository.
    Type: Grant
    Filed: August 4, 2015
    Date of Patent: January 28, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alka A Acharya, Lloyd W. Allen, Jr., Jana H Jenkins, Abigail Samuel
  • Patent number: 10540441
    Abstract: A device and method for providing recommended words for a character input by a user are provided. The method by which the device provides recommended words includes: receiving an input for inputting a character in a character input window; recommending at least one pseudo-morpheme including the input character by analyzing the input character; recommending at least one extended word including a selected pseudo-morpheme in response to receiving an input for selecting one of the at least one pseudo-morpheme; and displaying a selected extended word in response to receiving an input for selecting one of the at least one extended word.
    Type: Grant
    Filed: October 20, 2017
    Date of Patent: January 21, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hee-jun Song, Jung-wook Kim
  • Patent number: 10540987
    Abstract: A summary generating device includes a featural script extracting unit, a segment candidate generating unit, and a structuring estimating unit. The featural script extracting unit extracts featural script information of the words included in text information. Based on the extracted featural script information, the segment candidate generating unit generates candidates of segments that represent the constitutional units for the display purpose. Based on the generated candidates of segments and based on an estimation model for structuring, the structuring estimating unit estimates structure information containing information ranging from information of a comprehensive structure level to information of a local structure level.
    Type: Grant
    Filed: January 26, 2017
    Date of Patent: January 21, 2020
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kosei Fume, Taira Ashikawa, Masayuki Ashikawa, Takashi Masuko
  • Patent number: 10540962
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media for speech recognition. One method includes obtaining an input acoustic sequence, the input acoustic sequence representing an utterance, and the input acoustic sequence comprising a respective acoustic feature representation at each of a first number of time steps; processing the input acoustic sequence using a first neural network to convert the input acoustic sequence into an alternative representation for the input acoustic sequence; processing the alternative representation for the input acoustic sequence using an attention-based Recurrent Neural Network (RNN) to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings; and generating a sequence of substrings that represent a transcription of the utterance.
    Type: Grant
    Filed: May 3, 2018
    Date of Patent: January 21, 2020
    Inventors: William Chan, Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Noam M. Shazeer
  • Patent number: 10534861
    Abstract: A device may obtain a document. The device may identify a skip value for the document. The skip value may relate to a quantity of words or a quantity of characters that are to be skipped in an n-gram. The device may determine one or more skip n-grams using the skip value for the document. A skip n-gram, of the one or more skip n-grams, may include a sequence of one or more words or one or more characters with a set of occurrences in the document. The sequence of one or more words or one or more characters may include a skip value quantity of words or characters within the sequence. The device may extract one or more terms from the document based on the one or more skip n-grams. The device may provide information identifying the one or more terms.
    Type: Grant
    Filed: December 7, 2018
    Date of Patent: January 14, 2020
    Assignee: Accenture Global Services Limited
    Inventors: Anurag Dwarakanath, Aditya Priyadarshi, Bhanu Anand, Bindu Madhav Tummalapalli, Bargav Jayaraman, Nisha Ramachandra, Anitha Chandran, Parvathy Vijay Raghavan, Shalini Chaudhari, Neville Dubash, Sanjay Podder
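One way to read the skip n-gram idea in this abstract is sketched below. It is a minimal illustration under assumptions: the helper name `skip_ngrams`, the choice to skip exactly `skip` words between consecutive items, and the "occurs at least twice" cutoff are not taken from the patent.

```python
from collections import Counter


def skip_ngrams(words, n, skip):
    """Yield n-grams whose consecutive items are separated by exactly `skip` skipped words."""
    step = skip + 1
    span = (n - 1) * step + 1  # number of words covered by one skip n-gram
    for start in range(len(words) - span + 1):
        yield tuple(words[start + i * step] for i in range(n))


text = "the user shall enter the password and the user shall confirm the password"
words = text.split()

# Count skip bigrams with one skipped word, then keep those that recur.
counts = Counter(skip_ngrams(words, n=2, skip=1))
frequent = [gram for gram, count in counts.items() if count >= 2]
print(frequent)
```

With `skip=0` the helper reduces to ordinary n-grams, which makes it easy to compare both variants on the same document.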
  • Patent number: 10528661
    Abstract: A computer-implemented method includes identifying at least one parse tree. The method includes identifying a pattern library. The method includes searching the pattern library for patterns that match at least one fragment of any of the at least one parse tree. The method includes determining whether the at least one parse tree is fully matched by a combination of matching patterns from the pattern library. The method includes ranking the at least one parse tree based on an extent to which the at least one parse tree is fully matched by the combination of matching patterns from the pattern library.
    Type: Grant
    Filed: February 11, 2016
    Date of Patent: January 7, 2020
    Assignee: International Business Machines Corporation
    Inventors: Yishai A. Feldman, Eyal Shnarch
  • Patent number: 10510266
    Abstract: Systems and methods for augmentative and alternative communication that provide language communication facilitation and language acquisition enablement. In one embodiment, an AAC apparatus includes a user i/o device, an auditory output device and a microprocessor, wherein the microprocessor presents PICS buttons that are mapped to corresponding words to a user via the i/o device and accepts input via selection of the PICS buttons. In response to selection of a PICS button, the microprocessor displays the corresponding word to the user in a speech text box and produces a sound of the word via the auditory output device. The microprocessor further identifies and displays a subsequent set of PICS buttons in dependence on the selected PICS button. The subsequent set of PICS buttons may also be identified in dependence on word order, grammar rules, statistical and context analyses, and the like to increase navigation speed and to enable the user to learn language skills.
    Type: Grant
    Filed: September 23, 2016
    Date of Patent: December 17, 2019
    Inventor: Alexander T. Huynh
  • Patent number: 10510342
    Abstract: Provided herein is a voice recognition server and a control method thereof, the method including determining an index value for each of a plurality of training texts; setting a group for each of the plurality of training texts based on the index values of the plurality of training texts, and matching a function corresponding to each group and storing the matched results; in response to receiving a user's uttered voice from a user terminal apparatus, determining an index value from the received uttered voice; and searching a group corresponding to the index value determined from the received uttered voice, and performing the function corresponding to the uttered voice, thereby providing a voice recognition result of a variety of user's uttered voices suitable to the user's intentions.
    Type: Grant
    Filed: March 8, 2016
    Date of Patent: December 17, 2019
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Kyung-min Lee, Il-hwan Kim, Chi-youn Park, Young-ho Han, Nam-hoon Kim, Jae-won Lee
  • Patent number: 10489488
    Abstract: A system and method for automatically generating a narrative story receives data and information pertaining to a domain event. The received data and information and/or one or more derived features are then used to identify a plurality of angles for the narrative story. The plurality of angles is then filtered, for example through use of parameters that specify a focus for the narrative story, length of the narrative story, etc. Points associated with the filtered plurality of angles are then assembled and the narrative story is rendered using the filtered plurality of angles and the assembled points.
    Type: Grant
    Filed: June 4, 2018
    Date of Patent: November 26, 2019
    Assignee: NARRATIVE SCIENCE INC.
    Inventors: Lawrence A. Birnbaum, Kristian J. Hammond, Nicholas D. Allen, John R. Templon
  • Patent number: 10482182
    Abstract: A natural language understanding (NLU) system used in a dialogue system comprises a first-level NLU sub-system and at least one second-level NLU sub-system. Each second-level NLU sub-system is communicatively coupled with, and has a relatively higher performance than, the first-level NLU sub-system. The first-level NLU sub-system performs a first calculation over an input text received, and then outputs a first meaning if the first meaning is generated with a first confidence level surpassing a first threshold or passes on the input text to one second-level NLU sub-system based on a pre-determined rule if otherwise. Each second-level NLU sub-system receives the input text from the first-level NLU sub-system, and performs a second calculation over the input text, and then outputs a second meaning if the second meaning is generated with a second confidence level surpassing a second threshold or outputs a result indicating a rejection of meaning if otherwise.
    Type: Grant
    Filed: September 18, 2018
    Date of Patent: November 19, 2019
    Assignee: CloudMinds Technology, Inc.
    Inventor: Charles Robert Jankowski, Jr.
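The confidence-threshold cascade in this abstract might look like the following sketch. It assumes a single second-level sub-system (the abstract allows several, chosen by a pre-determined rule), and the two toy interpreters, the threshold values, and all names are hypothetical.

```python
from typing import Callable, Optional, Tuple

# An interpreter maps input text to (meaning, confidence in [0, 1]).
Interpreter = Callable[[str], Tuple[str, float]]


def cascade(
    text: str,
    first_level: Interpreter,
    second_level: Interpreter,
    first_threshold: float = 0.8,
    second_threshold: float = 0.6,
) -> Optional[str]:
    """Return a meaning, or None when both levels fail to reach their thresholds."""
    meaning, confidence = first_level(text)
    if confidence > first_threshold:
        return meaning
    # Otherwise pass the input text on to the second-level sub-system.
    meaning, confidence = second_level(text)
    if confidence > second_threshold:
        return meaning
    return None  # result indicating a rejection of meaning


def keyword_nlu(text: str) -> Tuple[str, float]:
    """Cheap first level: only recognizes greetings."""
    if text.lower() in {"hi", "hello"}:
        return "greet", 0.95
    return "unknown", 0.1


def richer_nlu(text: str) -> Tuple[str, float]:
    """Stand-in for a slower, more capable second level."""
    if "weather" in text.lower():
        return "ask_weather", 0.7
    return "unknown", 0.2


print(cascade("hello", keyword_nlu, richer_nlu))
print(cascade("what is the weather", keyword_nlu, richer_nlu))
print(cascade("blorp", keyword_nlu, richer_nlu))
```

The design point is that the cheaper first level answers whenever it is confident, so the second level is only consulted for inputs that need it.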
  • Patent number: 10481764
    Abstract: The present invention is, in one embodiment, a system and method based on a client-server architecture for seamlessly integrating various information systems. In one embodiment, a bundle of files is deployed to a collaboration client, in order to enable the integration of the collaboration client with disparate information systems and content. In one embodiment, content within a mailbox item is allowed to become a “live” object, and is associated with trigger events, and actions, including sending all or part of the content to a separate information system. In one embodiment, mailbox items are subjected to predefined searches to assess whether they include certain content objects. In another embodiment, panel item elements are visible in the overview panel of the collaboration client. The user may interact with the panel items by dragging content onto them, double clicking them, and invoking actions from a context menu if one is available.
    Type: Grant
    Filed: November 30, 2015
    Date of Patent: November 19, 2019
    Assignee: VMware, Inc.
    Inventors: Ross Dargahi, Kevin M. Henrikson, Roland Schemers, Jong Yoon Lee
  • Patent number: 10474747
    Abstract: An approach is provided to adjust time dependent terminology in a question and answering (QA) system. The approach ingests a set of documents to produce a corpus utilized by the QA system. A base time is established and the approach acquires a temporally accurate lexicon of terms that correspond to the base time. A corpus of the QA system is updated according to the lexicon. The QA system answers a question according to the updated corpus.
    Type: Grant
    Filed: December 16, 2013
    Date of Patent: November 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Daniel M. Jamrog, Jason D. LaVoie, Nicholas W. Orrick, Kristin A. Witherspoon
  • Patent number: 10466978
    Abstract: As a user uses a programming system to create programs, data are stored into a computer memory. The data describe actions of the user in creating the programs. The programming system has a user interface and a set of templates for functions. The user interface is designed to receive input from the user to direct the system to assemble functions from the set into the programs, the functions being functions for processing of data. As the user uses the user interface to assemble a program, suggestions to the user are computed, the suggestions recommending functions to be added into the program. The computation of function suggestion is based at least in part on the stored action data.
    Type: Grant
    Filed: January 19, 2017
    Date of Patent: November 5, 2019
    Assignee: Composable Analytics, Inc.
    Inventors: Andy Vidan, Lars Henry Fiedler
  • Patent number: 10467342
    Abstract: A method and an apparatus for determining a semantic matching degree. The method includes acquiring a first sentence and a second sentence, dividing the first sentence and the second sentence into x and y sentence fragments, respectively, performing a convolution operation on word vectors in each sentence fragment of the first sentence and word vectors in each sentence fragment of the second sentence, to obtain a three-dimensional tensor, performing integration and/or screening on adjacent vectors in the one-dimensional vectors of x rows and y columns, until the three-dimensional tensor is combined into a one-dimensional target vector, and determining a semantic matching degree between the first sentence and the second sentence according to the target vector.
    Type: Grant
    Filed: March 31, 2016
    Date of Patent: November 5, 2019
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Zhengdong Lu, Hang Li
  • Patent number: 10459704
    Abstract: Disclosed are devices, systems, apparatus, methods, products, media, and other implementations, including a method that includes generating for a code segment of a first process an instruction dependency graph representative of behavior of the first process, obtaining respective one or more instruction dependency graphs representative of behaviors of code segments for one or more other processes, and determining, based on the first instruction dependency graph for the first process and the respective one or more instruction dependency graphs for the one or more other processes, a level of similarity between the first process and at least one of the one or more other processes.
    Type: Grant
    Filed: February 8, 2016
    Date of Patent: October 29, 2019
    Assignee: The Trustees of Columbia University in the City of New York
    Inventors: Fang-hsiang Su, Lakshminarasimhan Sethumadhavan, Gail E. Kaiser, Tony Jebara
  • Patent number: 10460041
    Abstract: Some embodiments of an efficient string search have been presented. In one embodiment, a string of bytes representing content written in a non-delimited language is received, wherein the content has been classified into a predetermined category. In a single pass through the string of bytes, a set of N-grams is searched for simultaneously. Statistical information on occurrences of the N-grams, if any, in the string of bytes is collected. In some embodiments, a model is generated based on the statistical information, where the model is usable by a content filter to classify content.
    Type: Grant
    Filed: January 10, 2017
    Date of Patent: October 29, 2019
    Assignee: SONICWALL INC.
    Inventors: Thomas E. Raffill, Shunhui Zhu, Roman Yanovsky, Boris Yanovsky, John Gmuender
  • Patent number: 10460229
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for disambiguating word sense. One of the methods includes maintaining a respective word sense numeric representation of each of a plurality of word senses of a particular word; receiving a request to determine the word sense of the particular word when included in a particular text sequence, the particular text sequence comprising one or more context words and the particular word; determining a context numeric representation of the context words in the particular text sequence; and selecting a word sense of the plurality of word senses having a word sense numeric representation that is closest to the context numeric representation as the word sense of the particular word when included in the particular text sequence.
    Type: Grant
    Filed: March 20, 2017
    Date of Patent: October 29, 2019
    Assignee: Google LLC
    Inventors: Dayu Yuan, Ryan P. Doherty, Colin Hearne Evans, Julian David Christian Richardson, Eric E. Altendorf
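The "closest numeric representation" selection in this abstract can be illustrated with a small sketch. It assumes that the context representation is the mean of the context word vectors and that closeness is cosine similarity; the two-dimensional vectors, sense labels, and function names are invented for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def disambiguate(context_words, word_vectors, sense_vectors):
    """Pick the sense whose vector is closest to the averaged context vector."""
    known = [word_vectors[w] for w in context_words if w in word_vectors]
    if not known:
        return None  # no usable context
    context = mean_vector(known)
    return max(sense_vectors, key=lambda sense: cosine(sense_vectors[sense], context))


# Toy 2-d representations: first axis ~ "nature", second axis ~ "finance".
word_vectors = {
    "river": [0.9, 0.1],
    "water": [0.8, 0.2],
    "money": [0.1, 0.9],
    "loan": [0.2, 0.8],
}
bank_senses = {"bank/riverside": [1.0, 0.0], "bank/finance": [0.0, 1.0]}

print(disambiguate(["river", "water"], word_vectors, bank_senses))
print(disambiguate(["money", "loan"], word_vectors, bank_senses))
```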
  • Patent number: 10459925
    Abstract: According to the present invention there is provided a computer-enabled method of assisting to generate an innovation, the method comprising the steps of: retrieving from a database a first set of more than two documents belonging to a first domain (D1); retrieving from said database a second set of more than two documents belonging to a second domain (D2); selecting all possible combinations of documents from the first set with all documents in said second set, and for each combination of documents: determining a composite novelty score, a composite proximity score and a composite impact score; and based on all of the determined composite novelty scores and/or composite proximity scores and/or composite impact scores, providing a recommendation which can assist to generate an innovation.
    Type: Grant
    Filed: December 8, 2014
    Date of Patent: October 29, 2019
    Assignee: IPROVA SARL
    Inventors: Debmalya Biswas, Julian C. Nolan, Matthew J. Lawrenson
  • Patent number: 10446151
    Abstract: A speech recognition method in a system is provided that controls one or more devices by using speech recognition. The method includes obtaining speech information representing speech spoken by a user, and determining whether the speech is spoken to the one or more devices. The method also includes generating, in a case where it is determined that the speech is spoken to the one or more devices, an operation instruction for the one or more devices. The determining whether the speech is spoken to the one or more devices includes analyzing a sentence pattern of the character information, determining, in a case where the sentence pattern is interrogative or imperative, that the speech is spoken to the one or more devices, and determining, in a case where the sentence pattern is declarative or exclamatory, that the speech is not spoken to the one or more devices.
    Type: Grant
    Filed: November 27, 2017
    Date of Patent: October 15, 2019
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventor: Kazuya Nomura
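The decision rule in this abstract (interrogative or imperative speech is treated as addressed to the device, declarative or exclamatory speech is not) can be sketched as below. The sentence-pattern classifier here is a deliberately crude heuristic invented for illustration; the word lists and function names are assumptions, and a real system would analyze recognized text far more carefully.

```python
QUESTION_WORDS = {"what", "when", "where", "who", "why", "how", "is", "are", "can", "do", "does"}
IMPERATIVE_VERBS = {"turn", "open", "close", "play", "stop", "set", "start"}


def sentence_pattern(text):
    """Classify text as interrogative, imperative, exclamatory or declarative (toy heuristic)."""
    stripped = text.strip()
    words = stripped.rstrip("?!.").lower().split()
    if words and words[0] == "please":
        words = words[1:]  # ignore a leading politeness marker
    if not words:
        return "declarative"
    if stripped.endswith("?"):
        return "interrogative"
    if words[0] in QUESTION_WORDS and not stripped.endswith("!"):
        return "interrogative"
    if words[0] in IMPERATIVE_VERBS:
        return "imperative"
    if stripped.endswith("!"):
        return "exclamatory"
    return "declarative"


def is_addressed_to_device(text):
    """Apply the rule from the abstract to the detected sentence pattern."""
    return sentence_pattern(text) in {"interrogative", "imperative"}


for utterance in ["Turn on the light", "What is the temperature?", "It is hot today.", "What a nice day!"]:
    print(utterance, "->", sentence_pattern(utterance), is_addressed_to_device(utterance))
```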
  • Patent number: 10445070
    Abstract: An approach to generating an application prototype. The approach parses ASCII text-based requirements into a collection of sentences and parses the collection of sentences into collections of words associated with the collection of sentences. The approach then uses an ASCII dictionary to determine the nouns and verbs found in the collections of words, marking the nouns as entities and the verbs as responsibilities. Further, the approach determines if nouns are shared among the collection of sentences and, if they are, records relationships between the sentences. The approach then generates metadata describing these components and generates byte code based on the metadata. The approach packages the byte code, other data relating to the entry point and type of prototype application (e.g., web-based or standalone), and an encryption module for distribution.
    Type: Grant
    Filed: May 5, 2016
    Date of Patent: October 15, 2019
    Assignee: International Business Machines Corporation
    Inventors: Santanu Bandyopadhyay, Ramesh C. Pathak, Suryanarayana K. Rao, Sautam Sengupta
  • Patent number: 10445426
    Abstract: A method and system to identify similar names and addresses from a given data set comprising a plurality of names and addresses. The invention more specifically addresses the challenge faced in Spanish data quality assurance. The name and address data is passed through a parsing engine to parse the plurality of Spanish names and addresses. The parsed Spanish names and addresses are sent to a probable identification engine to identify the probable matches. The combination of name and address matching processes can be used for assuring data quality for Spanish names and addresses. The Spanish name matching process consists of identification of probable matches and finding similarity percentages between those probable matches. Similarly, the Spanish address matching process consists of identification of probable matches (using criteria such as same city) and finding similarity percentages between those probable matches. The system includes a parsing engine, a probable identification engine and a match percentage calculation engine.
    Type: Grant
    Filed: March 6, 2019
    Date of Patent: October 15, 2019
    Assignee: Tata Consultancy Services Limited
    Inventors: Ashish Diwan, Nandish Kirtikumar Solanki, Sridhar G. Pattar, Sudhir Kumar
  • Patent number: 10445331
    Abstract: A method for electronically mining intellectual property using an associative discovery process may include determining a set of documents containing keywords and/or phrases associated with an industry trend of interest and, for each document, assigning a weight. The method may also include selecting a subset of the documents based at least upon the assigned weights, and determining a feedback score for each document in the subset. The method may further include determining an optimal weighing scheme for the determined keywords and/or phrases using a statistical learning model and the feedback scores, ranking all documents in the set of documents according to the optimal weighing scheme, and providing results of the associative discovery process to a user.
    Type: Grant
    Filed: January 4, 2018
    Date of Patent: October 15, 2019
    Assignee: STATE FARM MUTUAL AUTOMOBILE INSURANCE COMPANY
    Inventors: Brian Mark Fields, Jufeng Peng, Jason Freeck, James Maxwell McWilliams, Mark O'Flaherty, Pat J. Johnson
  • Patent number: 10430413
    Abstract: A data information framework collects related data sharing characteristics (e.g., personal information, others) revealed by associated purpose information, and reports on that data. The location of the data is not restricted, and can be collected from various locations (e.g. different databases on different computer systems). An engine implements data creation defining links between different stored data structures (e.g., tables) using specific fields. A plurality of tables may be grouped into a smaller number of table clusters to facilitate constructing the data model. The model may be evaluated, enhanced, and/or corrected (e.g., by a user). The model may include fields reflecting the purpose information for the stored data, said fields accessible by the engine during data handling processes. The data model may include descriptions providing data storage location. Purpose information may be mapped to table fields.
    Type: Grant
    Filed: March 15, 2016
    Date of Patent: October 1, 2019
    Assignee: SAP SE
    Inventors: Bjoern Christoph, Marco Valentin, Carsten Pluder, Volker Lehnert, Johannes Gilbert
  • Patent number: 10425315
    Abstract: A personal digital assistant device includes: a memory storing an interactive personal digital assistant program and a processor configured to execute the interactive personal digital assistant program. The interactive personal digital assistant program performs an operation to determine whether the service provider is automated or is not automated. The interactive personal digital assistant program is configured to issue a command to the service provider on behalf of a user of the device, when it is determined that the service provider is automated. The interactive personal digital assistant program is configured to issue an alert on the device when it is determined that the service provider is not automated. The interactive personal digital assistant program may continue until the goal of the interaction is met or human help is sought.
    Type: Grant
    Filed: March 6, 2017
    Date of Patent: September 24, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Biplav Srivastava, Kartik Talamadupula
  • Patent number: 10417301
    Abstract: Various methods and systems for performing analytics based on hierarchical categorization of content are provided. Analytics can be performed using an index building workflow and a classification workflow. In the index building workflow, documents are received and analyzed to extract features from the documents. Hierarchical category paths can be identified for the features. The documents are indexed to support searching the documents for the hierarchical category paths. In the classification workflow, a query, that includes or references content, may be received and analyzed to extract features from the content. The features are executed against a search engine that returns search result documents associated with hierarchical category paths. The hierarchical category paths from the search result documents may be used to generate a topic model of the content associated with the query.
    Type: Grant
    Filed: September 10, 2014
    Date of Patent: September 17, 2019
    Inventors: Walter Wei-Tuh Chang, Kenneth Edward Feuerman, Shantanu Kumar, Ankit Bal
  • Patent number: 10417346
    Abstract: A computer-implemented technique is described for facilitating the creation of a language understanding (LU) component for use with an application. The technique allows a developer to select a subset of parameters from a larger set of parameters. The subset of parameters pertains to a LU scenario to be handled by the application. The larger set of parameters pertains to a plurality of LU scenarios handled by an already-existing generic LU model. The technique creates a constrained LU component that is based on the subset of parameters in conjunction with the generic LU model. At runtime, the constrained LU component interprets input language items using the generic LU model in a manner that is constrained by the subset of parameters that have been selected, to provide an output result. The technique also allows the developer to create new rules and/or supplemental models.
    Type: Grant
    Filed: January 23, 2016
    Date of Patent: September 17, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Young-Bum Kim, Ruhi Sarikaya, Alexandre Rochette
  • Patent number: 10417333
    Abstract: An apparatus and method for executing an application may execute a text string selected by a user or an application associated with a type of text string to input the text string to increase the user's convenience. The apparatus for executing an application includes a text string recognizer to determine a text string, a determiner to determine one or more candidate applications related to the text string, and an input location of the selected text string based on a type of the selected text string and the association model, an application list provider to generate and display a list of the candidate applications, and an application executer to execute a candidate application selected from the list and to input the selected text string into the input location of the candidate application.
    Type: Grant
    Filed: April 28, 2015
    Date of Patent: September 17, 2019
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ji Hyun Lee, Kyoung Gu Woo, Seok Jin Hong, Yo Han Roh, Sang Hyun Yoo, Ho Dong Lee
  • Patent number: 10402495
    Abstract: In one embodiment, a sequence of input words is received. Each of the input words is encoded as an indicator vector, wherein a sequence of the indicator vectors captures features of the sequence of input words. The sequence of the indicator vectors is then mapped to a distribution of a contextual probability of a first output word in a sequence of output words. For each subsequent output word, the sequence of the indicator vectors is encoded with a context, wherein the context comprises a previously mapped contextual probability distribution of a fixed window of previous output words; and the encoded sequence of the indicator vectors and the context is mapped to the distribution of the contextual probability of the subsequent output word. Finally, a condensed summary is generated using a decoder by maximizing the contextual probability of each of the output words.
    Type: Grant
    Filed: September 1, 2017
    Date of Patent: September 3, 2019
    Assignee: Facebook, Inc.
    Inventors: Alexander Matthew Rush, Sumit Chopra, Jason Edward Weston
  • Patent number: 10402473
    Abstract: A system for comparing, and generating pairwise revision markings with respect to, an original text segment and a revised text segment that can include identifying pairwise unchanged text, marking the pairwise unchanged text in the revised text segment with a distinct visual style that signifies its status as such, attempting to successively identify pairwise text divergence points and pairwise text convergence points within the original text segment and the revised text segment, and taking further steps with respect to text that occurs between a pairwise text divergence point and a pairwise text convergence point and/or after a pairwise text divergence point (which steps can include copying text from the original text segment into the revised text segment and marking the copied text, if any, with a distinct visual style that signifies such text as either deleted text or inserted text, as applicable).
    Type: Grant
    Filed: October 15, 2017
    Date of Patent: September 3, 2019
    Inventor: Richard Salisbury
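
    Illustration (not part of the patent record): the idea of marking unchanged, deleted, and inserted text between an original and a revised segment can be sketched with Python's standard difflib module. This is a generic word-level diff standing in for the claimed divergence/convergence procedure, not the patented method; the function name mark_revisions and the [-...-] / {+...+} markers are assumptions chosen for this example.

```python
from difflib import SequenceMatcher


def mark_revisions(original: str, revised: str) -> str:
    """Return the revised text with word-level deletions and insertions marked.

    Deleted words (copied from the original) are wrapped as [-...-] and
    inserted words as {+...+}; unchanged words are left as they are.
    """
    old_words = original.split()
    new_words = revised.split()
    # autojunk=False keeps difflib from ignoring frequent words in long inputs.
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.append(" ".join(new_words[j1:j2]))
            continue
        # For "replace", "delete" and "insert", emit whichever sides are non-empty.
        if i2 > i1:
            pieces.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if j2 > j1:
            pieces.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(pieces)


if __name__ == "__main__":
    print(mark_revisions("the quick brown fox", "the slow brown fox"))
```

    Each span between two "equal" blocks plays the role of the text between a divergence point and the next convergence point.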
  • Patent number: 10394956
    Abstract: The present disclosure includes a method for constructing an intelligent knowledge base.
    Type: Grant
    Filed: December 23, 2016
    Date of Patent: August 27, 2019
    Assignee: Shanghai Xiaoi Robot Technology Co., Ltd.
    Inventors: Yongmei Zeng, Bo Li, Gongzhi Yao, Pinpin Zhu
  • Patent number: 10387010
    Abstract: A method of computerized presentation of a document set view for auditing information of a set of documents. The method includes the initial step of receiving on a computer a selection of an original document. The original document has multiple pages with each of the pages of the original document having corresponding page content. A selection is received from the user of a first region of a page of the original document. This process is repeated to retrieve a text string from all of the pages. An addendum document with multiple pages is received. A text string is retrieved from the pages of the addendum document without user intervention. A document set view is provided using the retrieved text strings and displayed for the user to update the associated information and thus allowing for the user to perform a data audit of the automated portion of the process.
    Type: Grant
    Filed: February 13, 2017
    Date of Patent: August 20, 2019
    Assignee: Bluebeam, Inc.
    Inventors: Joseph W. Wezorek, Elliot Chenault
  • Patent number: 10387575
    Abstract: Embodiments described herein provide a more flexible, effective, and computationally efficient means for determining multiple intents within a natural language input. Some methods rely on specifically trained machine learning classifiers to determine multiple intents within a natural language input. These classifiers require a large amount of labelled training data in order to work effectively, and are generally only applicable to determining specific types of intents (e.g., a specifically selected set of potential inputs). In contrast, the embodiments described herein avoid the use of specifically trained classifiers by determining inferred clauses from a semantic graph of the input. This allows the methods described herein to function more efficiently and over a wider variety of potential inputs.
    Type: Grant
    Filed: January 30, 2019
    Date of Patent: August 20, 2019
    Assignee: BABYLON PARTNERS LIMITED
    Inventors: April Tuesday Shen, Francesco Moramarco, Nils Hammerla, Pietro Cavallo, Olufemi Awomosu, Aleksandar Savkov, Jack Flann
  • Patent number: 10387560
    Abstract: A method, system and computer-usable medium are disclosed for automating the generation of table-based groundtruth, comprising: receiving a document comprising unstructured text and a table; generating questions by applying a template to the contents of the table; performing QA pair generation operations on the table to generate QA pairs, each QA pair comprising a question generated by applying the template; and, assigning a score to each QA pair, the score providing an indicator of user interest to each QA pair, the score being based on a score generation methodology using the unstructured text and the table.
    Type: Grant
    Filed: December 5, 2016
    Date of Patent: August 20, 2019
    Assignee: International Business Machines Corporation
    Inventors: Corville O. Allen, Anne E. Gattiker, Joseph N. Kozhaya
  • Patent number: 10380246
    Abstract: Techniques disclose validating user-provided text feedback for topical relevance relative to a question asked. A form with at least a first field is received. The first field includes unstructured text content provided as feedback in response to a question. The unstructured text content of the first field is evaluated to identify an answer type. A measure of relevance of the unstructured text content relative to the question is determined based on the evaluation.
    Type: Grant
    Filed: December 18, 2014
    Date of Patent: August 13, 2019
    Assignee: International Business Machines Corporation
    Inventors: Adam T. Clark, Jeffrey K. Huebert, Aspen L. Payton, John E. Petri
  • Patent number: 10373607
    Abstract: A method, for testing words defined in a pronunciation lexicon used in an automatic speech recognition (ASR) system, is provided. The method includes: obtaining test sentences which can be accepted by a language model used in the ASR system. The test sentences cover words defined in the pronunciation lexicon. The method further includes obtaining variations of speech data corresponding to each test sentence, and obtaining a plurality of texts by recognizing the variations of speech data, or a plurality of texts generated by recognizing the variation of speech data. The method also includes constructing a word graph, using the plurality of texts, for each test sentence, where each word in the word graph corresponds to each word defined in the pronunciation lexicon; and determining whether or not all or parts of words in a test sentence are present in a path of the word graph derived from the test sentence.
    Type: Grant
    Filed: June 13, 2017
    Date of Patent: August 6, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Takashi Fukuda, Osamu Ichikawa, Futoshi Iwama
  • Patent number: 10372816
    Abstract: Natural language processing of raw text data for optimal sentence boundary placement. Raw text is extracted from a document and subject to cleaning. The extracted raw text is examined to identify preliminary sentence boundaries, which are used to identify potential sentences in the raw text. One or more potential sentences are assigned a well-formedness score. A value of the score correlates to whether the potential sentence is a truncated/ill-formed sentence or a well-formed sentence. One or more preliminary sentence boundaries are optimized depending on the value of the score of the potential sentence(s). Accordingly, the processing herein is an optimization that creates a sentence boundary optimized output.
    Type: Grant
    Filed: December 13, 2016
    Date of Patent: August 6, 2019
    Assignee: International Business Machines Corporation
    Inventors: Charles E. Beller, Chengmin Ding, Allen Ginsberg, Elinna Shek
  • Patent number: 10372820
    Abstract: A method and system to identify similar names and addresses from a given data set comprising a plurality of names and addresses. The invention more specifically addresses the challenge faced in Spanish data quality assurance. The name and address data is parsed through a parsing engine to parse the plurality of Spanish names and addresses. The parsed Spanish names and addresses are sent to a probable identification engine to identify the probable matches. The combination of name and address matching processes can be used for assuring data quality for Spanish names and addresses. The Spanish name matching process consists of identification of probable matches and finding similarity percentages between those probable matches. Similarly, the Spanish address matching process consists of identification of probable matches (criteria like same city) and finding similarity percentages between those probable matches. The system includes a parsing engine, a probable identification engine and a match percentage calculation engine.
    Type: Grant
    Filed: March 6, 2019
    Date of Patent: August 6, 2019
    Assignee: Tata Consultancy Services Limited
    Inventors: Ashish Diwan, Nandish Kirtikumar Solanki, Sridhar G. Pattar, Sudhir Kumar
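
    Illustration (not part of the patent record): the two stages named in the abstract, identifying probable matches by a criterion such as same city and then computing a similarity percentage, might look roughly like the sketch below. The record layout ("name" and "city" keys), the 80 percent threshold, and the use of difflib's ratio as the similarity measure are all assumptions made for this example; the patent's own parsing and matching rules are not given in the abstract.

```python
from difflib import SequenceMatcher


def similarity_percent(a: str, b: str) -> float:
    """Case-insensitive string similarity as a percentage (0 to 100)."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def probable_matches(record: dict, candidates: list[dict], threshold: float = 80.0):
    """Return (name, percent) pairs for candidates that probably match `record`.

    Stage 1: keep only candidates in the same city (a cheap blocking criterion).
    Stage 2: keep those whose name similarity reaches `threshold` (hypothetical value).
    """
    results = []
    for candidate in candidates:
        if candidate["city"].lower() != record["city"].lower():
            continue
        percent = similarity_percent(record["name"], candidate["name"])
        if percent >= threshold:
            results.append((candidate["name"], percent))
    # Highest similarity first.
    return sorted(results, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    record = {"name": "Jose Garcia Lopez", "city": "Madrid"}
    candidates = [
        {"name": "Jose Garcia Lopes", "city": "Madrid"},
        {"name": "Jose Garcia Lopez", "city": "Sevilla"},
        {"name": "Maria Fernandez", "city": "Madrid"},
    ]
    for name, percent in probable_matches(record, candidates):
        print(f"{name}: {percent:.1f}%")
```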
  • Patent number: 10366171
    Abstract: Exemplary embodiments relate to techniques for improving a machine translation system. The machine translation system may include one or more models for generating a translation. The system may generate multiple candidate translations, and may present the candidate translations to different groups of users, such as users of a social network. User engagement with the different candidate translations may be measured, and the system may determine which of the candidate translations was most favored by the users. For example, in the context of a social network, the number of times that the translation is liked or shared, or the number of comments associated with the translation, may be used to determine user engagement with the translation. The models of the machine translation system may be modified to favor the most-favored candidate translation. The translation system may repeat this process to continue to tune the models in a feedback loop.
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: July 30, 2019
    Assignee: FACEBOOK, INC.
    Inventors: Ying Zhang, Fei Huang, Kay Rottmann, Necip Fazil Ayan
  • Patent number: 10353987
    Abstract: Examples herein disclose obtaining regions of digital content and determining a correlation measurement between the multiple regions of digital content adjacently located to each other. The examples disclose identifying a breakpoint in the digital content based on the determined correlation measurement.
    Type: Grant
    Filed: January 30, 2015
    Date of Patent: July 16, 2019
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Shanchan Wu, Lei Liu, Jerry Liu
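
    Illustration (not part of the patent record): one simple way to realize "a correlation measurement between adjacent regions" is cosine similarity over word counts, with the breakpoint placed where adjacent regions are least similar. The choice of cosine similarity, the whitespace tokenization, and the function names are assumptions for this sketch; the abstract does not specify the measurement.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_breakpoint(regions: list[str]) -> int:
    """Return i such that the breakpoint falls between regions[i] and regions[i + 1].

    The breakpoint is placed at the adjacent pair with the lowest similarity.
    """
    if len(regions) < 2:
        raise ValueError("need at least two regions")
    bags = [Counter(region.lower().split()) for region in regions]
    sims = [cosine(bags[i], bags[i + 1]) for i in range(len(bags) - 1)]
    return min(range(len(sims)), key=sims.__getitem__)


if __name__ == "__main__":
    regions = [
        "cats chase mice",
        "mice fear cats",
        "stocks fell sharply",
        "stocks rose sharply",
    ]
    print(find_breakpoint(regions))
```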
  • Patent number: 10346537
    Abstract: A likely source language of a media item can be identified by attempting an initial language identification of the media item based on intrinsic or extrinsic factors, such as words in the media item and languages known by the media item author. This initial identification can generate a list of most likely source languages with corresponding likelihood factors. Translations can then be performed presuming each of the most likely source languages. The translations can be performed for multiple output languages. Each resulting translation can receive a corresponding score based on a number of factors. The scores can be combined where they have a common source language. These combined scores can be used to weight the previously identified likelihood factors for the source languages of the media item.
    Type: Grant
    Filed: August 9, 2017
    Date of Patent: July 9, 2019
    Assignee: FACEBOOK, INC.
    Inventor: Fei Huang
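
    Illustration (not part of the patent record): the reweighting step can be sketched as follows. Translation scores are grouped by presumed source language, combined, and used to weight the initial likelihood factors. Averaging as the combination rule, multiplication as the weighting, and the final normalization are assumptions for this example; the abstract only says the scores are combined and used to weight the likelihoods.

```python
from collections import defaultdict


def reweight_source_languages(
    likelihoods: dict[str, float],
    translation_scores: dict[tuple[str, str], float],
) -> dict[str, float]:
    """Weight initial source-language likelihoods by combined translation scores.

    `likelihoods` maps a candidate source language to its initial likelihood.
    `translation_scores` maps (source_language, output_language) to the score of
    the translation produced under that presumed source language.
    """
    scores_by_source = defaultdict(list)
    for (source, _output), score in translation_scores.items():
        scores_by_source[source].append(score)

    weighted = {}
    for language, prior in likelihoods.items():
        scores = scores_by_source.get(language)
        # Assumed combination rule: mean score across output languages.
        combined = sum(scores) / len(scores) if scores else 0.0
        weighted[language] = prior * combined

    total = sum(weighted.values())
    return {lang: (w / total if total else 0.0) for lang, w in weighted.items()}


if __name__ == "__main__":
    initial = {"es": 0.6, "pt": 0.4}
    scores = {
        ("es", "en"): 0.2, ("es", "fr"): 0.4,
        ("pt", "en"): 0.9, ("pt", "fr"): 0.9,
    }
    print(reweight_source_languages(initial, scores))
```

    In this made-up example the initially less likely language ends up ranked first because its translations score much better.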
  • Patent number: 10346456
    Abstract: A method and a system for efficient search of string patterns characterized by positional relationships in a character stream are disclosed. The method is based on grouping string patterns of a dictionary into at least two string sets and performing string search processes of a text of the character stream based on individual string sets with the outcome of a search process influencing a subsequent search process. A system implementing the method comprises a dictionary processor for generating string sets with corresponding text actions and search actions, a conditional search engine for locating string patterns belonging to at least one string set in a text according to a current search state, a text operator for producing an output text according to search results, and a search operator for determining a subsequent search state.
    Type: Grant
    Filed: October 3, 2016
    Date of Patent: July 9, 2019
    Assignee: TREND MICRO INCORPORATED
    Inventor: Kevin Gerard Boyce
  • Patent number: 10339122
    Abstract: A computer-implemented linking system and method provide for linking actionable phrases in a first document to other documents in a document corpus. The method includes identifying at least one actionable phrase in a first document. The actionable phrase may include an action, its direct object, and any modifier of the direct object. For each identified action phrase the document corpus is searched to identify other documents, which are scored using a scoring function which takes into account occurrences of words of the actionable phrase in each identified document. The actionable phrase is linked to at least a part of one of the most highly ranked documents in the set of documents.
    Type: Grant
    Filed: September 10, 2015
    Date of Patent: July 2, 2019
    Assignee: Conduent Business Services, LLC
    Inventors: Nikolaos Lagos, Matthias Gallé, Alexandr Chernov
  • Patent number: 10339168
    Abstract: Embodiments provide a computer implemented method, in a data processing system comprising a processor and a memory comprising instructions which are executed by the processor to cause the processor to implement a full question generation system, the method comprising ingesting, into the full question generation system, a query dataset derived from one or more search queries entered by one or more users of an internet search engine; identifying questions from the ingested query dataset; separating, through a full question identification module, one or more prior full questions from the ingested dataset; identifying, through a question intent query identification module, one or more question intent queries from the query dataset; for each identified question intent query: sorting, through a sorting module, the question intent query into one or more bins based on one or more missing interrogative words; and appending, through an appending module, the missing interrogative word and a verb onto the question intent
    Type: Grant
    Filed: September 9, 2016
    Date of Patent: July 2, 2019
    Assignee: International Business Machines Corporation
    Inventors: Bryn R. Dole, William S. Ko, Malous M. Kossarian, Douglas A. Smith
  • Patent number: 10339216
    Abstract: Selecting a grammar for use in a machine question-answering system, such as a Natural Language Understanding System, can be difficult for non-experts in such grammars. A tool, according to an example embodiment, can compare annotations of sample sentences, performed correctly by a human, the annotations having intents and mentions, against annotations performed by multiple grammars. Each grammar can be scored, and the system can select the best scored grammar for the user. In one embodiment, a method of selecting a grammar includes comparing manually-generated annotations against machine-generated annotations as a function of a given grammar among multiple grammars. The method can further include applying scores to the machine-generated annotations that are a function of weightings of the intents and mentions. The method can additionally include recommending whether to employ the given grammar based on the scores.
    Type: Grant
    Filed: July 26, 2013
    Date of Patent: July 2, 2019
    Assignee: Nuance Communications, Inc.
    Inventor: Jeffrey N. Marcus
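
    Illustration (not part of the patent record): comparing human annotations against each grammar's annotations and scoring with weighted intents and mentions could be sketched as below. The annotation layout (a dict with "intent" and "mentions"), the default weights, and the Jaccard overlap used for mentions are assumptions for this example, not details taken from the patent.

```python
def score_grammar(gold: list[dict], predicted: list[dict],
                  intent_weight: float = 0.6, mention_weight: float = 0.4) -> float:
    """Average weighted agreement between gold and machine annotations.

    Each annotation is {"intent": str, "mentions": iterable of hashable items}.
    """
    if not gold:
        return 0.0
    total = 0.0
    for gold_ann, pred_ann in zip(gold, predicted):
        intent_ok = 1.0 if gold_ann["intent"] == pred_ann["intent"] else 0.0
        gold_mentions = set(gold_ann["mentions"])
        pred_mentions = set(pred_ann["mentions"])
        union = gold_mentions | pred_mentions
        # Jaccard overlap; two empty mention sets count as full agreement.
        overlap = len(gold_mentions & pred_mentions) / len(union) if union else 1.0
        total += intent_weight * intent_ok + mention_weight * overlap
    return total / len(gold)


def recommend_grammar(gold: list[dict], outputs: dict[str, list[dict]]) -> str:
    """Return the name of the grammar whose annotations score highest."""
    return max(outputs, key=lambda name: score_grammar(gold, outputs[name]))


if __name__ == "__main__":
    gold = [{"intent": "book_flight", "mentions": [("city", "Boston")]}]
    outputs = {
        "grammar_a": [{"intent": "book_flight", "mentions": [("city", "Boston")]}],
        "grammar_b": [{"intent": "book_hotel", "mentions": [("city", "Boston")]}],
    }
    print(recommend_grammar(gold, outputs))
```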
  • Patent number: 10339223
    Abstract: A text processing system that is able to appropriately determine textual entailment between sentences with high coverage is provided. The text processing system is configured to execute: processing of extracting a common substructure that is a partial structure of a same type, the partial structure being common to a first sentence and a second sentence and, based on a structure representing the first sentence and a structure representing the second sentence; processing of extracting at least one of a feature amount representing a dependency relationship between the at least one common substructure in the first and second sentences and a feature amount representing a dependency relationship between the common substructure in the first and second sentences and a substructure different from the common substructure; and processing of determining an entailment relationship between the first sentence and the second sentence by using the extracted feature amount.
    Type: Grant
    Filed: August 20, 2015
    Date of Patent: July 2, 2019
    Assignee: NEC CORPORATION
    Inventors: Shumpei Kubosawa, Masaaki Tsuchida, Kai Ishikawa