Based On Phrase, Clause, Or Idiom Patents (Class 704/4)
  • Patent number: 10740365
    Abstract: Embodiments of the present invention disclose a method, a computer program product, and a computer system for identifying information gaps in corpora. A computer receives a document and extracts keywords from the document while filtering trivial keywords. The computer identifies and extracts top keywords detailed by the document using a topic modelling approach before determining whether the extracted top keywords exceed a threshold use frequency. Based on determining that the top keywords exceed a threshold use frequency, determining whether the top keywords have a relation to other entities within the document and, if so, determining whether the top keywords are defined within the document. Based on determining that the top keywords are not defined in the document, adding the top keywords to a list and defining the top keywords.
    Type: Grant
    Filed: June 14, 2017
    Date of Patent: August 11, 2020
    Assignee: International Business Machines Corporation
    Inventors: Brendan C. Bull, Scott R. Carrier, Aysu Ezen Can, Dwi Sianto Mansjur
  • Patent number: 10708203
    Abstract: Aspects of the present disclosure generally relate to systems and methods that allow a user of an electronic device, who is engaged in communicating with one or more other users, to convey an emotional context with that communication using an image created by the user.
    Type: Grant
    Filed: June 25, 2015
    Date of Patent: July 7, 2020
    Assignee: Convergence Acceleration Solutions, LLC
    Inventors: Drewry Hampton Morris, Jared Trotter
  • Patent number: 10706227
    Abstract: A system of provisioning electronic forms based on natural language is disclosed. Accordingly, the system may include a communication device configured for receiving a natural language input from a builder device. Further, the natural language input represents one or more of a requested data and a presented data associated with a legal process. Further, the communication device may be configured for transmitting an electronic form to at least one user device, receiving the electronic form including the requested data from the user device. Further, the requested data may include environmental data captured from at least one sensor of the user device. Further, the system may include a processing device configured for analyzing the natural language input. Further, the processing device may be configured for generating the electronic form based on the analyzing of the natural language input. Further, the system may include a storage device storing the electronic form.
    Type: Grant
    Filed: February 28, 2019
    Date of Patent: July 7, 2020
    Inventor: Morgan Warstler
  • Patent number: 10705748
    Abstract: Embodiments of the present application disclose a method and a device for file name identification and file cleaning. The method for file name identification comprises: determining a set of files to be processed; obtaining a string corresponding to the name of each file included in the set of files to be processed; for the obtained string corresponding to the name of each file, detecting whether a regular expression template matching the string is present in a preset regular expression template library; determining the detected regular expression template as a regular expression template of the file name corresponding to the string; identifying a regular expression of the file name corresponding to the string according to the determined regular expression template.
    Type: Grant
    Filed: August 12, 2016
    Date of Patent: July 7, 2020
    Inventor: Guoqiang Jiao
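
The matching step described in the abstract for patent 10705748 can be illustrated with a minimal sketch. It is not the patented method: the template library contents, the labels, and the function names `identify_template` and `group_by_template` are all hypothetical, chosen only to show a file name string being checked against a preset library of regular expression templates.

```python
import re

# Hypothetical preset template library: label -> regular expression template.
TEMPLATE_LIBRARY = {
    "dated_log": re.compile(r"[a-z]+_\d{8}\.log"),
    "numbered_tmp": re.compile(r"tmp\d+\.tmp"),
    "hex_cache": re.compile(r"[0-9a-f]{32}\.cache"),
}


def identify_template(file_name):
    """Return the label of the first template matching the whole name, else None."""
    for label, pattern in TEMPLATE_LIBRARY.items():
        if pattern.fullmatch(file_name):
            return label
    return None


def group_by_template(file_names):
    """Group file names by matched template; unmatched names go under None."""
    groups = {}
    for name in file_names:
        groups.setdefault(identify_template(name), []).append(name)
    return groups
```

A cleaning step could then act on whole groups, for example treating every name under `numbered_tmp` as a removal candidate; that policy is likewise an assumption and not something the abstract specifies.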
  • Patent number: 10701433
    Abstract: A method comprising: enabling a first user to define a message for display to at least a second user in association with a first three-dimensional scene viewed by the first user and viewed by or viewable by the second user, wherein the message comprises user-defined message content for display and message metadata, not for display, defining first three-dimensional spatial information; and enabling rendering of the user-defined message content in a second three-dimensional scene viewed by the second user, wherein the user-defined message content moves, within the second three-dimensional scene, along a three-dimensional trajectory dependent upon the first three-dimensional spatial information and three-dimensional spatial information of the second user.
    Type: Grant
    Filed: June 13, 2017
    Date of Patent: June 30, 2020
    Assignee: Nokia Technologies Oy
    Inventors: Yu You, Lixin Fan, Tinghuai Wang
  • Patent number: 10698639
    Abstract: Disclosed is a method of provisioning electronic forms based on natural language. The method includes receiving, using a communication device, a natural language input from a builder device, wherein the natural language input represents one or both of a requested data and a presented data associated with a legal process. Further, the method includes analyzing, using a processing device, the natural language input. Yet further, the method includes generating, using the processing device, an electronic form based on the analyzing of the natural language input, wherein the electronic form comprises an input field configured to receive the requested data and an output field configured to present the presented data. Moreover, the method includes storing, using a storage device, the electronic form. Furthermore, the method includes transmitting, using the communication device, the electronic form to one or more user devices.
    Type: Grant
    Filed: February 28, 2019
    Date of Patent: June 30, 2020
    Inventor: Morgan Warstler
  • Patent number: 10699074
    Abstract: Methods, mobile electronic devices, and computer program products are provided for accepting reduced text entry of phrases, sentences or paragraphs, and probabilistically determining the most likely translation of the reduced text to a full text counterpart, and displaying same. Reduced text is accepted and parsed according to a predefined reduction pattern to produce parsed text elements. The parsed text elements are evaluated using n-gram knowledge and/or language models to identify the most likely words corresponding to the elements. The most likely corresponding words are used to evaluate the reduced text at the phrase level by evaluating the likelihood of transition from one word to the next amongst the most likely words, to compute phrase probabilities for various combinations of the most likely words. The most likely phrase(s) are output based in part on the phrase probabilities.
    Type: Grant
    Filed: May 22, 2018
    Date of Patent: June 30, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Claes-Fredrik Urban Mannby
  • Patent number: 10686742
    Abstract: One or more computing devices, systems, and/or methods for adjusting recipients of a message are provided. For example, a trigger item may be detected in a first input field, of a messaging interface, corresponding to a body of a message. A list of user identifications may be generated and/or displayed. A first content item may be detected in the first input field following the trigger item. A second list of user identifications may be generated based upon the first content item and/or the second list of user identifications may be displayed. A first user identification may be selected by receiving a selection of the first user identification from the second list of user identifications. A first contact item associated with the first user identification may be entered into one or more second input fields corresponding to one or more recipients of the message.
    Type: Grant
    Filed: April 29, 2018
    Date of Patent: June 16, 2020
    Assignee: Oath Inc.
    Inventors: Peter John Genovese, Fang Xu, Markandey Singh, Leung Wai Chan, Chuan Tian Zhang
  • Patent number: 10671801
    Abstract: A markup generation system generates a markup file that can be interpreted in a consistent manner by different markup viewers. The markup generation system includes inert variables declarations and markers in the markup file. The markup generation system determines a position in a code segment in the markup file for placing the attribute value based on the marker in the code segment. The attribute value can be used by markup viewers to interpret the markup file.
    Type: Grant
    Filed: May 25, 2017
    Date of Patent: June 2, 2020
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventor: Aleksandar Arsovski
  • Patent number: 10672390
    Abstract: The systems and methods disclosed herein combine a plurality of interpretations of a voice-based input. The systems and methods may receive the voice-based input, process it using one or more automatic speech recognition modules to obtain a plurality of interpretations, and identify an entity set for each of the plurality of interpretations. The systems and methods may further generate a combined interpretation based on a first interpretation and second interpretation selected from the plurality of interpretations and assign a semantic score to the combined interpretation based on the entity sets of the first and second interpretation.
    Type: Grant
    Filed: December 22, 2014
    Date of Patent: June 2, 2020
    Assignee: Rovi Guides, Inc.
    Inventors: Abubakkar Siddiq, Sashikumar Venkataraman
  • Patent number: 10657537
    Abstract: Systems and methods are provided for use in asynchronous processing of events within a network. One exemplary method includes receiving multiple events for asynchronous processing, each defined by at least one rule, and assigning, by a computing device, the multiple events to an event queue. The method also includes retrieving, by the computing device, a first one of the multiple events from the event queue; transforming, by the computing device, the first one of the multiple events into a first event object; and recording, by the computing device, the first event object to a data structure. The method further includes identifying, by the computing device, at least one notification message associated with the first event object and causing, by the computing device, the at least one notification message, associated with the first event object, to be delivered.
    Type: Grant
    Filed: November 2, 2015
    Date of Patent: May 19, 2020
    Assignee: MASTERCARD INTERNATIONAL INCORPORATED
    Inventors: Jason Revelle, Jeff Hammontree, Richard M. Navarro, Robert C. Schupp
  • Patent number: 10650190
    Abstract: Techniques for rule creation from natural language text (NLT) are disclosed. In an embodiment, a rule statement in the NLT is grammatically parsed to obtain a parse tree. Further, each of noun phrase (NP) core sub-tree and verb phrase (VP) core sub-tree, in the parse tree, is partitioned into at least one sub-tree. Furthermore, one or more operators are extracted using at least one sub-tree of NP or VP core sub-trees. The rule statement is substantially simultaneously split into groups of 3 adjacent words (3-grams). Each of the 3-grams is then compared with a predefined list to extract data properties and class concepts associated with the rule statement. Moreover, subscripts are assigned to the data properties, class concepts and operators. A rule head is then created based on the data properties and operators. Also, a rule is created using the data properties, operators, class concepts, associated subscripts and rule head.
    Type: Grant
    Filed: July 11, 2017
    Date of Patent: May 12, 2020
    Assignee: Tata Consultancy Services Limited
    Inventors: Suman Roychoudhury, Nikhil Narendra Bellarykar, Vinay Kulkarni
  • Patent number: 10649545
    Abstract: A communication system for use by individuals whose ability to speak is restricted, such as intubated patients in a hospital setting. Apparatus is provided to allow such a person to input his or her responses to topics of interest and thereby to communicate with others, such as medical professionals. In some embodiments the input is a joystick-like device that allows selections to be made in response to displayed options on a screen. In some embodiments, a digitally operated set of keys is provided. The user's inputs are processed in a general purpose programmable computer that operates under the control of a set of non-volatile instructions recorded on a machine-readable medium.
    Type: Grant
    Filed: December 12, 2016
    Date of Patent: May 12, 2020
    Assignee: University of Massachusetts
    Inventor: Miriam Madsen
  • Patent number: 10635722
    Abstract: There are provided decentralized system and method of managing electronic documents of title (EDTs).
    Type: Grant
    Filed: April 20, 2016
    Date of Patent: April 28, 2020
    Assignee: OGY DOCS, INC.
    Inventors: Gad Ruschin, Or Garbash, Yair Sappir
  • Patent number: 10637826
    Abstract: An online system determines whether a test content item violates a policy of the online system. The online system extracts a semantic vector from the test content item and determines a distance between the extracted semantic vector and the stored semantic vectors for content items that have been labeled to indicate whether they violate a policy. Using a nearest neighbor search, the online system selects a set of the stored semantic vectors and assigns a weight to the selected semantic vectors that is inversely related to the distances. The online system then determines whether the test content item violates a policy using a weighted voting scheme, where the labels of the stored semantic vectors are aggregated based on their associated weights. The online system may first attempt to match the test content with known bad content and terminate the more complex nearest neighbor search if such a match is found.
    Type: Grant
    Filed: August 6, 2018
    Date of Patent: April 28, 2020
    Assignee: Facebook, Inc.
    Inventors: Enming Luo, Emanuel Alexandre Strauss
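
The distance-weighted voting described in the abstract for patent 10637826 can be sketched as follows. This is an illustrative toy, not the patented system: Euclidean distance, the `1 / (distance + eps)` weight, the value of `k`, the exact-equality shortcut, and the names `euclidean` and `violates_policy` are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def violates_policy(test_vec, labeled, k=3, eps=1e-9):
    """Decide by distance-weighted vote among the k nearest labeled vectors.

    labeled is a list of (vector, violates) pairs, where violates is a bool.
    """
    # Shortcut (assumed form): an exact match with known violating content
    # skips the nearest neighbor search entirely.
    if any(violates and tuple(vec) == tuple(test_vec) for vec, violates in labeled):
        return True

    # Brute-force nearest neighbor search, sorted by distance only.
    nearest = sorted(
        ((euclidean(test_vec, vec), violates) for vec, violates in labeled),
        key=lambda pair: pair[0],
    )[:k]

    # Each neighbor votes for its label with a weight inversely related
    # to its distance from the test vector.
    votes = {True: 0.0, False: 0.0}
    for distance, violates in nearest:
        votes[violates] += 1.0 / (distance + eps)
    return votes[True] > votes[False]
```

A production system would likely replace the brute-force sort with an approximate nearest neighbor index; the abstract does not say which search structure is used.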
  • Patent number: 10614100
    Abstract: A method comprising using at least one hardware processor for: receiving a topic under consideration (TUC) and a set of claims referring to the TUC; identifying semantic similarity relations between claims of the set of claims; clustering the claims into a plurality of claim clusters based on the identified semantic similarity relations, wherein said claim clusters represent semantically different claims of the set of claims; and generating a list of non-redundant claims comprising said semantically different claims.
    Type: Grant
    Filed: April 29, 2015
    Date of Patent: April 7, 2020
    Assignee: International Business Machines Corporation
    Inventors: Mitesh Khapra, Vikas Raykar, Amrita Saha, Noam Slonim, Ashish Verma
  • Patent number: 10615965
    Abstract: A messaging service provides a search mechanism that utilizes a protected index. The protected index is generated by converting documents maintained by the messaging service into a set of tokens or words. Each token is converted to a corresponding value using a transformation such as a cryptographic hash function. The values are placed into an index that allows the messaging service to efficiently identify a set of documents associated with each particular value. When a document search request is submitted to the messaging service, the messaging service uses the transformation to generate corresponding values for each term in the search request, and uses the index to identify sets of documents associated with the values corresponding to the search terms. The messaging service applies search logic associated with the search request to the identified sets of documents to produce a final set of documents satisfying the search request.
    Type: Grant
    Filed: March 22, 2017
    Date of Patent: April 7, 2020
    Assignee: Amazon Technologies, Inc.
    Inventor: Matthew E. Goldberg
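
A minimal sketch of the protected-index idea in the abstract for patent 10615965 is given below. It is not the patented implementation: SHA-256 as the transformation, whitespace tokenization, lowercasing, AND-only search logic, and the names `protect`, `build_index`, and `search_all` are assumptions chosen for illustration.

```python
import hashlib
from collections import defaultdict


def protect(token):
    """Transform a token into an opaque value (SHA-256 is an assumed choice)."""
    return hashlib.sha256(token.lower().encode("utf-8")).hexdigest()


def build_index(documents):
    """Map each protected token value to the set of document ids containing it.

    documents is a dict of doc_id -> text; tokens are split on whitespace.
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.split():
            index[protect(token)].add(doc_id)
    return index


def search_all(index, terms):
    """Return ids of documents containing every search term (AND logic)."""
    doc_sets = [index.get(protect(term), set()) for term in terms]
    return set.intersection(*doc_sets) if doc_sets else set()
```

The index keys are hash values, so the stored index never contains plaintext tokens; query terms pass through the same transformation before lookup. An unkeyed hash of short words can be guessed by dictionary attack, so a keyed transformation may be preferable in practice; that is an observation about this sketch, not a claim about the patent.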
  • Patent number: 10599731
    Abstract: Described is a technique for associating words used in a search query with categories. This technique aims to produce potentially more relevant search results by improving the associations with words used for a search. A machine learning technique is implemented to train a classification model, which may include a word embedding model. The classification model is trained to receive words as input and to create vectors of the words as output. These word vectors may then be mapped to a vector space and the technique may then perform a cluster analysis of the vectors. Based on the cluster analysis, clusters may be identified and each cluster may be associated with a corresponding category.
    Type: Grant
    Filed: April 26, 2016
    Date of Patent: March 24, 2020
    Assignee: BAIDU USA LLC
    Inventors: Yu Zhu, Lin Li
  • Patent number: 10599885
    Abstract: Systems, devices, and methods of the present invention use noisy-robust discourse trees to determine a rhetorical relationship between one or more sentences. In an example, a rhetoric classification application creates a noisy-robust communicative discourse tree. The application accesses a document that includes a first sentence, a second sentence, a third sentence, and a fourth sentence. The application identifies that syntactic parse trees cannot be generated for the first sentence and the second sentence. The application further creates a first communicative discourse tree from the second, third, and fourth sentences and a second communicative discourse tree from the first, third, and fourth sentences. The application aligns the first communicative discourse tree and the second communicative discourse tree and removes any elementary discourse units not corresponding to a relationship that is in common between the first and second communicative discourse trees.
    Type: Grant
    Filed: June 15, 2018
    Date of Patent: March 24, 2020
    Assignee: Oracle International Corporation
    Inventor: Boris Galitsky
  • Patent number: 10594638
    Abstract: An electronic chat session monitoring device intercepts a text message from an electronic chat session. The text message is generated by a sender and addressed to an addressee. The electronic chat session monitoring device receives a current photo of the sender of the text message electronic chat session, which is taken contemporaneously with a generation of the text message by the sender and depicts an emotion of the sender while generating the text message. The electronic chat session monitoring device then transmits both the text message and the current photo of the sender to the addressee.
    Type: Grant
    Filed: February 13, 2015
    Date of Patent: March 17, 2020
    Assignee: International Business Machines Corporation
    Inventors: James E. Bostick, John M. Ganci, Jr., Sarbajit K. Rakshit, Kimberly G. Starks
  • Patent number: 10585985
    Abstract: Methods and systems for scoring written text based on use of idiomatic expressions, including reading pre-selected idiomatic expressions in a canonical form into memory, expanding idiomatic expressions from the canonical form, reading a written response into the memory, pre-processing the written response, searching the pre-processed written response for idiomatic expressions, and assigning a score to the written response. The score may be based at least in part on the number of idiomatic expressions in the written response. Corresponding apparatuses, systems, and methods are also disclosed.
    Type: Grant
    Filed: December 14, 2017
    Date of Patent: March 10, 2020
    Assignee: Educational Testing Service
    Inventors: Michael Flor, Beata Beigman Klebanov
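
The pipeline in the abstract for patent 10585985 (expand canonical idioms, pre-process the response, search, score) can be sketched in a few lines. This is an illustrative toy, not the patented scoring method: the idiom list, the bracket notation for the inflecting word, the `INFLECTIONS` table, and the function names are hypothetical, and the raw count stands in for whatever score the real system derives from it.

```python
import re

# Hypothetical canonical idioms; brackets mark the word that inflects.
CANONICAL_IDIOMS = ["[break] the ice", "[bite] the bullet", "piece of cake"]

# Hypothetical inflection table used to expand the canonical forms.
INFLECTIONS = {
    "break": ["break", "breaks", "broke", "broken", "breaking"],
    "bite": ["bite", "bites", "bit", "bitten", "biting"],
}


def expand(canonical):
    """Expand a canonical idiom into its surface variants."""
    match = re.search(r"\[(\w+)\]", canonical)
    if not match:
        return [canonical]
    return [canonical.replace(match.group(0), form)
            for form in INFLECTIONS[match.group(1)]]


def preprocess(text):
    """Lowercase the text and keep only word characters and apostrophes."""
    return " ".join(re.findall(r"[a-z']+", text.lower()))


def count_idioms(response):
    """Count idiom occurrences in a written response (whole-word matches)."""
    cleaned = f" {preprocess(response)} "
    return sum(
        cleaned.count(f" {variant} ")
        for idiom in CANONICAL_IDIOMS
        for variant in expand(idiom)
    )
```

The count returned by `count_idioms` could feed a score as one feature among several; the abstract states only that the score is based at least in part on the number of idiomatic expressions.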
  • Patent number: 10574607
    Abstract: A mechanism is provided for validating an attachment to an electronic communication being composed based on the recipients of the electronic communication. An associated tone or theme of the at least one attachment to the electronic communication being composed by a sender and an identity of each of one or more recipients to whom the electronic communication is to be sent and the sender are identified. One or more previous electronic communications sent to or received from one or more of the one or more recipients and at least one tone of each of the one or more previous electronic communications are identified in order to generate one or more preferred tones. Responsive to identifying a tone discrepancy between the tone or theme of the at least one attachment and the one or more preferred tones, a notification is presented to the sender.
    Type: Grant
    Filed: May 18, 2016
    Date of Patent: February 25, 2020
    Assignee: International Business Machines Corporation
    Inventors: Florian Pinel, Edward E. Seabolt
  • Patent number: 10574605
    Abstract: A mechanism is provided for validating the tone of an electronic communication being composed based on the recipients of the electronic communication. At least one tone of the electronic communication being composed by a sender and an identity of each of one or more recipients to whom the electronic communication is to be sent and the sender are identified. One or more previous electronic communications sent to or received from one or more of the one or more recipients and at least one tone of each of the one or more previous electronic communications are identified in order to generate one or more preferred tones. The tone of the electronic communication being composed is compared to the one or more preferred tones. Responsive to identifying a tone discrepancy between the electronic communication being composed and the one or more preferred tones, a notification is presented to the sender.
    Type: Grant
    Filed: May 18, 2016
    Date of Patent: February 25, 2020
    Assignee: International Business Machines Corporation
    Inventors: Florian Pinel, Edward E. Seabolt
  • Patent number: 10552107
    Abstract: In one example of the disclosure, a set of electronic document templates is accessed and instances of duplicated document content are identified. Display of a user notice for first duplicated document content is caused. Responsive to receipt of data indicative of a user instruction to create a component template for the first duplicated content, the component template is created and stored.
    Type: Grant
    Filed: December 2, 2015
    Date of Patent: February 4, 2020
    Assignee: OPEN TEXT CORPORATION
    Inventors: James Matthew Downs, Billy R. Kidwell, Anthony Wiley
  • Patent number: 10554863
    Abstract: An image forming apparatus includes a display device, an input device configured to receive a user operation, a preview processing unit configured to display as preview on the display device a document image as an output target, and an output processing unit configured to output the document image. Further, the preview processing unit (a) detects an area surrounded by a predetermined specific color in the document image and displays the document image as preview on the display device so as to make the detected area editable, and (b) adds a text specified by the user operation in the detected area in accordance with the user operation. The output processing unit outputs the document image in which the text was added.
    Type: Grant
    Filed: May 7, 2018
    Date of Patent: February 4, 2020
    Assignee: Kyocera Document Solutions, Inc.
    Inventor: Koji Tagaki
  • Patent number: 10482876
    Abstract: A speech interpretation module interprets the audio of user utterances as sequences of words. To do so, the speech interpretation module parameterizes a literal corpus of expressions by identifying portions of the expressions that correspond to known concepts, and generates a parameterized statistical model from the resulting parameterized corpus. When speech is received the speech interpretation module uses a hierarchical speech recognition decoder that uses both the parameterized statistical model and language sub-models that specify how to recognize a sequence of words. The separation of the language sub-models from the statistical model beneficially reduces the size of the literal corpus needed for training, reduces the size of the resulting model, provides more fine-grained interpretation of concepts, and improves computational efficiency by allowing run-time incorporation of the language sub-models.
    Type: Grant
    Filed: October 1, 2018
    Date of Patent: November 19, 2019
    Assignee: Interactions LLC
    Inventors: Ethan Selfridge, Michael Johnston
  • Patent number: 10460034
    Abstract: An intention inference system includes, a morphological analyzer to perform morphological analysis for a complex sentence with multiple intentions involved, a syntactic analyzer to perform syntactic analysis for the complex sentence morphologically analyzed by the morphological analyzer and to divide it into the first simple sentence and the second simple sentence, an intention inference unit to infer the first intention involved in the first simple sentence and the second intention involved in the second simple sentence, a feature extractor to extract as the first feature a morpheme showing execution order of operations involved in the first simple sentence and to extract as the second feature a morpheme showing execution order of operations involved in the second simple sentence, and an execution order inference unit to infer the execution order of the first operation corresponding to the first intention and the second operation corresponding to the second intention on the basis of the first feature and the
    Type: Grant
    Filed: January 28, 2015
    Date of Patent: October 29, 2019
    Assignee: Mitsubishi Electric Corporation
    Inventors: Yi Jing, Yusuke Koji, Jun Ishii
  • Patent number: 10452698
    Abstract: An unstructured data analytics system, including: an unstructured data analytics algorithm resident on a server and accessible via a browser operable for receiving unstructured data from one or more remote sources, applying one or more analytical tools to the unstructured data, and displaying summary information to one or more users; wherein the summary information is displayed to the one or more users in a presentation layer, an exploration layer, and an annotation layer. The unstructured data analytics algorithm is also operable for receiving outside data from one or more remote sources. The presentation layer displays one or more of the unstructured data, a summary of the unstructured data, and the summary information. The exploration layer allows the one or more users to modify the granularity of the summary information, thereby modifying the granularity of the presentation layer. The one or more users can interact with the unstructured data analytics system simultaneously via the annotation layer.
    Type: Grant
    Filed: May 11, 2016
    Date of Patent: October 22, 2019
    Assignee: Stratifyd, Inc.
    Inventor: Xiaoyu Wang
  • Patent number: 10444894
    Abstract: In an example implementation according to aspects of the present disclosure, a method may include capturing an image of data entered on a touch sensitive mat or on an object physically disposed on the touch sensitive mat. The method further includes extracting the data from the captured image, and developing contextual information from the data extracted from the captured image. The method further includes projecting the contextual information onto the touch sensitive mat or onto the object.
    Type: Grant
    Filed: September 12, 2014
    Date of Patent: October 15, 2019
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Immanuel Amo, Diogo Lima, Nicholas P Lyons, Arman Alimian
  • Patent number: 10446137
    Abstract: Systems, components, devices, and methods for resolving ambiguity in a conversational understanding system are provided. A non-limiting example is a system or method for resolving ambiguity in a conversational understanding system. The method includes the steps of receiving a natural language input and identifying an agent action based on the natural language input. The method also includes the steps of determining an ambiguity value associated with the agent action and evaluating the ambiguity value against an ambiguity condition. The method includes the steps of when determined that the ambiguity value meets the ambiguity condition: selecting a prompting action based on the ambiguity associated with the identified agent action, performing the prompting action, receiving additional input in response to the prompting action, and updating the agent action to resolve the ambiguity based on the additional input. The method also includes the step of performing the agent action.
    Type: Grant
    Filed: October 19, 2016
    Date of Patent: October 15, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Omar Zia Khan, Ruhi Sarikaya, Divya Jetley
  • Patent number: 10437932
    Abstract: A determination method executed by a computer including a memory and a processor coupled to the memory, includes receiving a plurality of sentences and designation of terms included in the plurality of sentences, generating, for each term for which the designation is received, information indicating a relation between the term and each of other terms included in one of the plurality of sentences containing the term, extracting, for each term for which the designation is received, information indicating a specific relation from the generated information indicating the relation, generating characteristic information that uses the extracted information as a feature, and determining similarity between a plurality of the terms based on the generated characteristic information for each term.
    Type: Grant
    Filed: March 19, 2018
    Date of Patent: October 8, 2019
    Assignee: FUJITSU LIMITED
    Inventors: Kazuo Mineno, Nobuko Takase, Naohiro Itou
  • Patent number: 10437872
    Abstract: A computer implemented and computer controlled method of arranging data for processing and storage thereof at a data storage engine. To identified data elements, an action is assigned from a plurality of actions as well as an association between data elements of an action according to a respective topology comprised of an ordered plurality of data categories including a subject data category, an object data category, a spatial data category and a temporal data category. By matching the identified data elements with action topology combinations and using the order of the data elements, one data element is matched with one data category. Instance information is supplemented to matched action topology combinations. In a computer readable format, at a data storage engine, identified data elements, instance information and associations between identifiers resulting from identifying, assigning, matching and supplementing are stored.
    Type: Grant
    Filed: May 26, 2017
    Date of Patent: October 8, 2019
    Assignee: DYNACTIONIZE N.V.
    Inventor: Michael Rik Frans Brands
  • Patent number: 10423665
    Abstract: The present teaching relates to generating a conversational agent. In one example, a plurality of input utterances may be received from a developer. A paraphrase model is obtained. The paraphrase model is generated based on machine translation. For each of the plurality of input utterances, one or more paraphrases of the input utterance are generated based on the paraphrase model. For each of the plurality of input utterances, at least one of the one or more paraphrases is selected based on an instruction from the developer to generate selected paraphrases. The conversational agent is generated based on the plurality of input utterances and the selected paraphrases.
    Type: Grant
    Filed: August 2, 2017
    Date of Patent: September 24, 2019
    Assignee: Oath Inc.
    Inventors: Ankur Gupta, Timothy Daly, Tularam Ban
  • Patent number: 10423615
    Abstract: The method includes monitoring a computing device for language settings during user-generated content creation and detecting one or more language settings. The method further includes analyzing user-created content to detect a language from a text of the user-generated content. The method further includes compiling a list of scored preferred languages for the computing device based on the detected language settings and the detected language of the text. The method further includes intercepting a query from the computing device. The method further includes analyzing a text of the intercepted query in a plurality of selected languages based on a language setting of a user interface application, a detected language of the query, and a predetermined number of preferred languages of the computing device to produce results of analysis for each selected language. The method further includes generating a multilingual query based on the results of analysis for the selected languages.
    Type: Grant
    Filed: June 10, 2016
    Date of Patent: September 24, 2019
    Assignee: International Business Machines Corporation
    Inventors: Leonid Bolshinsky, Vladimir Gamaley, Sharon Krisher
  • Patent number: 10417269
    Abstract: A system and method for verbatim-text mining including parsing documents of a text corpus into a plurality of individual sentences, assigning a sentence identifier to one or more individual sentences of the plurality of individual sentences, generating a plurality of n-Gram strings comprising a plurality of n-Grams from words within the individual sentence, applying an inverted index to the n-Gram string, combining an index data structure of one n-Gram string with an index data structure of another n-Gram string forming a merged index data structure when the index data structure of one n-Gram string shares a predetermined percentage of sentence identifiers of the index data structure of another n-Gram string, assigning a group identifier to the merged index data structure of one or more merged index data structures, and creating a data set comprising the sentence identifier, the group identifier and the associated n-Gram string.
    Type: Grant
    Filed: March 13, 2017
    Date of Patent: September 17, 2019
    Assignee: LexisNexis, a division of Reed Elsevier Inc.
    Inventor: Paul Zhang
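    Illustration (not part of the patent record): a minimal Python sketch of the n-gram inverted index and overlap-based merging that the abstract describes. The function names, the trigram default, the 50% threshold, and measuring overlap against the smaller identifier set are all assumptions made for this example, not details taken from the patent.

```python
from collections import defaultdict

def ngrams(words, n):
    """Return the word n-grams of a sentence as space-joined strings."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def build_inverted_index(sentences, n=3):
    """Map each n-gram string to the set of sentence identifiers containing it."""
    index = defaultdict(set)
    for sent_id, text in enumerate(sentences):
        for gram in ngrams(text.lower().split(), n):
            index[gram].add(sent_id)
    return index

def merge_groups(index, min_overlap=0.5):
    """Greedily merge n-grams whose sentence-ID sets share enough identifiers.

    Overlap is the shared count divided by the smaller set size
    (a hypothetical choice; the abstract only says "predetermined percentage").
    The position of a group in the returned list serves as its group identifier.
    """
    groups = []
    for gram, ids in sorted(index.items()):
        for group in groups:
            shared = len(ids & group["sent_ids"])
            if shared / min(len(ids), len(group["sent_ids"])) >= min_overlap:
                group["grams"].append(gram)
                group["sent_ids"] |= ids
                break
        else:
            # No existing group overlaps enough, so start a new one.
            groups.append({"grams": [gram], "sent_ids": set(ids)})
    return groups

sentences = [
    "the quick brown fox jumps",
    "the quick brown fox sleeps",
    "a totally different sentence here",
]
index = build_inverted_index(sentences)
groups = merge_groups(index)
```

    The greedy pass depends on the order in which n-grams are visited, so a production system would likely need a more careful merge strategy.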
  • Patent number: 10409919
    Abstract: A display method includes reading from a memory a language setting representing an original language and a first target language; detecting a first set of one or more characters input in the original language; recognizing the first set of one or more characters as first text; translating the first text from the original language to the first target language; displaying the translated first text on one or more display areas; translating the translated first text back to the original language; and displaying the first text translated back to the original language on the one or more display areas.
    Type: Grant
    Filed: September 28, 2015
    Date of Patent: September 10, 2019
    Assignee: Konica Minolta Laboratory U.S.A., Inc.
    Inventors: Howard Rubin, Isao Hayami
  • Patent number: 10409810
    Abstract: The method includes monitoring a computing device for language settings during user-generated content creation and detecting one or more language settings. The method further includes analyzing user-created content to detect a language from a text of the user-generated content. The method further includes compiling a list of scored preferred languages for the computing device based on the detected language settings and the detected language of the text. The method further includes intercepting a query from the computing device. The method further includes analyzing a text of the intercepted query in a plurality of selected languages based on a language setting of a user interface application, a detected language of the query, and a predetermined number of preferred languages of the computing device to produce results of analysis for each selected language. The method further includes generating a multilingual query based on the results of analysis for the selected languages.
    Type: Grant
    Filed: May 8, 2015
    Date of Patent: September 10, 2019
    Assignee: International Business Machines Corporation
    Inventors: Leonid Bolshinsky, Vladimir Gamaley, Sharon Krisher
  • Patent number: 10394959
    Abstract: Machine training for determining sentiments in social network communications. A text document is extracted from a web site and tokenized into tokens. The tokens are input to a word to vector conversion model to generate word vectors. A term frequency inverse document frequency (TF-IDF) algorithm converts the word vectors to sentence vectors. A randomly selected subset of the sentence vectors is tagged and used to train a classifier. The classifier takes a sentence vector and predicts a sentiment associated with the sentence vector. Predicted sentiment associated with each of the sentence vectors may be combined to generate a sentiment associated with the text document.
    Type: Grant
    Filed: December 21, 2017
    Date of Patent: August 27, 2019
    Assignee: International Business Machines Corporation
    Inventors: Ankur Tagra, Rajat Verma, Sudarshan Narayanan
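    Illustration (not part of the patent record): one plausible reading of "TF-IDF converts the word vectors to sentence vectors" is a TF-IDF weighted average of word vectors. The sketch below assumes that reading; the toy two-dimensional vectors, the smoothed IDF formula, and all function names are hypothetical.

```python
import math
from collections import Counter

def idf_table(sentences):
    """Inverse document frequency per token, treating each sentence as a document."""
    n = len(sentences)
    df = Counter(tok for sent in sentences for tok in set(sent))
    return {tok: math.log(n / df[tok]) + 1.0 for tok in df}

def sentence_vector(tokens, word_vectors, idf):
    """TF-IDF weighted average of the word vectors of the known tokens."""
    tf = Counter(t for t in tokens if t in word_vectors and t in idf)
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok, count in tf.items():
        weight = (count / len(tokens)) * idf[tok]
        weight_sum += weight
        for i, x in enumerate(word_vectors[tok]):
            total[i] += weight * x
    if weight_sum == 0.0:
        return total  # no known tokens: return the zero vector
    return [x / weight_sum for x in total]

# Toy word vectors standing in for the output of a word-to-vector model.
word_vectors = {"good": [1.0, 0.0], "bad": [0.0, 1.0], "movie": [0.5, 0.5]}
sentences = [["good", "movie"], ["bad", "movie"]]
idf = idf_table(sentences)
vec_good = sentence_vector(sentences[0], word_vectors, idf)
vec_bad = sentence_vector(sentences[1], word_vectors, idf)
```

    The resulting sentence vectors could then be tagged and passed to any standard classifier; that step is omitted here.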
  • Patent number: 10387529
    Abstract: The present invention may be a method, a system, and/or a computer program product. An embodiment of the present invention provides a method for paraphrasing, on a client computer, text in a webpage, the method comprising the following: transferring a request for a webpage including a plurality of passages of text to a server; receiving the webpage from the server in response to the request; judging whether or not the received webpage has text which is a subject of paraphrase; in a case where the judgment is positive, paraphrasing the text; and displaying, on a display, the webpage including the paraphrased text. Another embodiment of the present invention provides a method for updating on a server, text in a webpage, the method comprising the following: receiving, from each of the devices, a set of URLs of a webpage, a location path of text which is a subject of paraphrase in the webpage, and paraphrased text; and replacing text in the webpage with text among the received text.
    Type: Grant
    Filed: February 16, 2017
    Date of Patent: August 20, 2019
    Assignee: International Business Machines Corporation
    Inventors: Kentaroh Noji, Akihiko Takajo, Yukiko Yasuda
  • Patent number: 10380126
    Abstract: Systems and methods for content aggregation creation are disclosed herein. The system can include memory having a content database and an aggregation database. The system can include a user device having a first network interface and a first I/O subsystem. The system can include a server that can: provide content to the user device via a first electrical signal; receive a selection of a portion of the provided content from the user device via a second electrical signal; automatically extract sentences from the selected portion of the provided content via a natural language processor; automatically generate a parse tree for one of the automatically extracted sentences; identify noun phrases from the part of speech tags within the parse tree; place content associated with one of the noun phrases in a content aggregation; and output the content aggregation to the user device.
    Type: Grant
    Filed: December 13, 2016
    Date of Patent: August 13, 2019
    Assignee: PEARSON EDUCATION, INC.
    Inventors: Sean York, Tim Stewart, David Strong, Scott Hellman, William Murray
  • Patent number: 10372744
    Abstract: A computer scans a DITA library to identify DITA topic files. The computer then determines whether the identified DITA file has a concept, task, or reference scheme. Based on determining that the identified DITA topic file has a concept scheme, the computer generates a subject taxonomy. Based on determining that the identified DITA topic file has a task scheme, the computer generates a navigation taxonomy. Based on determining that the identified DITA topic file has a reference scheme, the computer generates a command relational taxonomy. Based on the generated subject, navigation, and command relational taxonomies, the computer generates a DITA file relationship table based on the contextual taxonomy density of the aforementioned taxonomies.
    Type: Grant
    Filed: June 3, 2016
    Date of Patent: August 6, 2019
    Assignee: International Business Machines Corporation
    Inventors: Balaji S. Kumar, Vishal G. Palliyathu, Harpreet Singh
  • Patent number: 10354010
    Abstract: An information processing system to increase weights of words that are related to a text, but that do not explicitly occur in the text, in a weight vector representing the text, is provided. An adjusting system (100) includes a distance storing unit (110) and an adjusting unit (120). The distance storing unit (110) stores distances between any two terms of a plurality of terms. The distance between two terms becomes smaller as the two terms are semantically more similar. The adjusting unit (120) adjusts a weight of each term of the plurality of terms in a weight vector including weights of the plurality of terms and representing a text, on the basis of a distance between each term and another term in the weight vector and a weight of the other term.
    Type: Grant
    Filed: April 24, 2015
    Date of Patent: July 16, 2019
    Assignee: NEC CORPORATION
    Inventors: Daniel Georg Andrade Silva, Akihiro Tamura, Masaaki Tsuchida
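    Illustration (not part of the patent record): a minimal sketch of adjusting a text's weight vector so that terms semantically close to heavily weighted terms gain weight even when they do not occur in the text. The 1 / (1 + distance) similarity, the additive update, and the names below are assumptions for this example; the abstract does not specify the formula.

```python
def adjust_weights(weights, distance, scale=1.0):
    """Raise each term's weight using the weights of semantically close terms.

    weights:  dict mapping term -> weight in the text's weight vector
    distance: function (term_a, term_b) -> non-negative semantic distance
    """
    adjusted = {}
    for term, w in weights.items():
        boost = 0.0
        for other, w_other in weights.items():
            if other != term:
                # Smaller distance means higher similarity (hypothetical mapping).
                similarity = 1.0 / (1.0 + distance(term, other))
                boost += similarity * w_other
        adjusted[term] = w + scale * boost
    return adjusted

def toy_distance(a, b):
    """Hypothetical distance table: 'car' and 'automobile' are near-synonyms."""
    return 0.0 if {a, b} == {"car", "automobile"} else 9.0

# Only "car" occurs in the text, so the other terms start at weight zero.
weights = {"car": 1.0, "automobile": 0.0, "banana": 0.0}
adjusted = adjust_weights(weights, toy_distance)
```

    In this toy case "automobile" ends up with a much larger weight than "banana", which matches the stated goal of promoting related but absent words.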
  • Patent number: 10354646
    Abstract: A third sentence obtained by replacing a first phrase of a first sentence with a second phrase is input, and it is judged whether a third phrase is included in a first database including at least a phrase used in written text. If the third phrase is not included, a first evaluation value in the first database is calculated for a seventh phrase obtained by replacing the second phrase of the third phrase with a sixth phrase. It is judged whether the third phrase is included in a second database including at least a phrase used in spoken text and whether a second evaluation value calculated from the first evaluation value satisfies a predetermined condition. If the third phrase is included, and the second evaluation value satisfies the predetermined condition, the third sentence and the second sentence as a pair are added to a bilingual corpus.
    Type: Grant
    Filed: August 29, 2017
    Date of Patent: July 16, 2019
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Nanami Fujiwara, Masaki Yamauchi, Masahiro Imade
  • Patent number: 10339029
    Abstract: A method of detecting potential internationalization issues in source code may include installing a plug-in component in a stand-alone static source code analysis program/application that is configured to enable detection of internationalization issues in source code. The method may also include automatically creating a repository comprising a plurality of internationalization rules for a plurality of programming languages that are provided by the plug-in and accessing a subset of the plurality of internationalization rules corresponding to a particular programming language of the plurality of programming languages. The method may include creating a quality profile for the particular programming language using the subset of the plurality of internationalization rules, scanning source code of a software product for potential issues on a block level, and identifying and displaying the detected internationalization issues in the source code.
    Type: Grant
    Filed: November 1, 2016
    Date of Patent: July 2, 2019
    Assignee: CA, Inc.
    Inventors: Kumar Arnesh Rameshwar, Guru Prasadareddy Narapu Reddy
  • Patent number: 10331681
    Abstract: Implementations provide an improved system for presenting search results based on entity associations of the search items. An example method includes, for each of a plurality of crowdsource workers, initiating display of a first randomly selected cluster set from a plurality of cluster sets to the crowdsource worker. Each cluster set represents a different clustering algorithm applied to a set of search items responsive to a query. The method also includes receiving cluster ratings for the first cluster set from the crowdsource worker and calculating a cluster set score for the first cluster set based on the cluster ratings. This is repeated for remaining cluster sets in the plurality of cluster sets. The method also includes storing a cluster set definition for a highest scoring cluster set, associating the cluster set definition with the query, and using the definition to display search items responsive to the query.
    Type: Grant
    Filed: April 11, 2016
    Date of Patent: June 25, 2019
    Assignee: GOOGLE LLC
    Inventors: Jilin Chen, Amy Xian Zhang, Sagar Jain, Lichan Hong, Ed Huai-Hsin Chi
  • Patent number: 10325026
    Abstract: A technique for generating a new equivalent phrase for an input phrase includes receiving a first input phrase for natural language expansion. Tokens that correspond to parts of speech are generated for the first input phrase. An original grammar tree is generated using at least some of the tokens. One or more alternate grammar trees are generated that are different from the original grammar tree but substantially equivalent to the original grammar tree. One or more synonyms for at least one of the tokens are generated. Finally, one or more new phrases are generated based on the one or more alternate grammar trees and the one or more synonyms.
    Type: Grant
    Filed: September 25, 2015
    Date of Patent: June 18, 2019
    Assignee: International Business Machines Corporation
    Inventor: Bryan D. Cardillo
  • Patent number: 10324971
    Abstract: A method for classifying a new instance including a text document by using training instances with class including labeled data and zero or more training instances with class including unlabeled data, comprising: estimating a word distribution for each class by using the labeled data and the unlabeled data; estimating a background distribution and a degree of interpolation between the background distribution and the word distribution by using the labeled data and the unlabeled data; calculating two probabilities, namely that a word is generated from the word distribution and that the word is generated from the background distribution; combining the two probabilities by using the interpolation; combining the resulting probabilities of all words to estimate a document probability for the class that indicates the document is generated from the class; and classifying the new instance as a class for which the document probability is the highest.
    Type: Grant
    Filed: June 20, 2014
    Date of Patent: June 18, 2019
    Assignee: NEC Corporation
    Inventors: Daniel Georg Andrade Silva, Hironori Mizuguchi, Kai Ishikawa
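    Illustration (not part of the patent record): a simplified sketch of classifying a document by interpolating a per-class word distribution with a background distribution. It uses labeled data only and a fixed interpolation weight `lam`, whereas the abstract estimates the interpolation from labeled and unlabeled data; the smoothing and all names are assumptions.

```python
import math
from collections import Counter

def train(labeled_docs):
    """Count words per class and overall (the background distribution).

    labeled_docs: list of (list_of_words, class_label) pairs.
    """
    class_counts = {}
    background = Counter()
    for words, label in labeled_docs:
        class_counts.setdefault(label, Counter()).update(words)
        background.update(words)
    return class_counts, background

def log_doc_probability(words, counts, background, lam=0.7):
    """Log probability of a document under one class, mixing class and background."""
    class_total = sum(counts.values())
    bg_total = sum(background.values())
    vocab = len(background)
    logp = 0.0
    for w in words:
        p_class = counts[w] / class_total
        # Add-one smoothing keeps unseen words from zeroing the product.
        p_bg = (background[w] + 1) / (bg_total + vocab + 1)
        logp += math.log(lam * p_class + (1 - lam) * p_bg)
    return logp

def classify(words, class_counts, background, lam=0.7):
    """Return the class with the highest document probability."""
    return max(
        class_counts,
        key=lambda c: log_doc_probability(words, class_counts[c], background, lam),
    )

docs = [
    (["goal", "match", "team"], "sports"),
    (["vote", "election", "party"], "politics"),
]
class_counts, background = train(docs)
predicted = classify(["team", "goal"], class_counts, background)
```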
  • Patent number: 10289667
    Abstract: Computer-program products and methods for automatically annotating terms, such as ambiguous terms, in an electronic text document are disclosed. In one embodiment, a method of annotating a text document includes determining, by a computing device, a term of interest within the text document. The method further includes searching a data structure including incongruous term pairs (tx, tt) determined from a controlled vocabulary for the term of interest appearing as a term tt, wherein the term tt is a linguistic head of a term tx of the incongruous term pairs (tx, tt). The method further includes annotating the term of interest with a meaning provided by the controlled vocabulary only if a term tx of the incongruous term pairs (tx, tt) associated with the term of interest in the data structure is not present within a predetermined textual distance of the term of interest in the text document.
    Type: Grant
    Filed: September 6, 2016
    Date of Patent: May 14, 2019
    Assignee: Elsevier B.V.
    Inventors: Marius Doornenbal, Inga Kohlhof
  • Patent number: 10275459
    Abstract: A content management system (CMS) and a translation management system (TMS) can utilize content dimensions for content items to manage and translate the content items between languages. Machine and human translations of complex dynamic content can also be improved by pre-rendering the content to remove localization-related syntax prior to machine or human translation. Content items can also be scored as to their suitability for localization prior to translation, and translation can be skipped for content items that do not have a sufficiently high score. Semantic and natural language processing (NLP) techniques can also be utilized for content categorization and routing. Translations of content items can also be continuously refined and higher quality re-translated content can be provided in an automated fashion.
    Type: Grant
    Filed: September 28, 2016
    Date of Patent: April 30, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Paul Kasper, Pallami Bhattacharjee, Paul Christopher Cerda, William Joseph Kaper, Thibault Pierre Seillier, Kelly Duggar Wiggins
  • Patent number: 10275449
    Abstract: A computing device automatically creates a log record recognizer expression and uses the log record recognizer expression to identify a log record type for a log record to parse the log record. A log record type regular expression is selected from log record type regular expressions and is separated into subexpressions that are normalized and are reassembled into an expression recognizer for each log record type regular expression. The expression recognizer for each is read into a data structure. The recognizer expressions are sorted based on an order associated with an expression operator of each subexpression. A log recognizer expression is created from each read expression recognizer included in the sorted recognizer expressions. A log record type of a log record is identified using the created log recognizer expression. A log record type regular expression is selected. The log record is parsed using the selected log record type regular expression.
    Type: Grant
    Filed: October 3, 2018
    Date of Patent: April 30, 2019
    Assignee: SAS INSTITUTE INC.
    Inventors: Keefe Hayes, Robert N. Bonham