Based On Phrase, Clause, Or Idiom Patents (Class 704/4)

Method of distributed management of electronic documents of title (EDT) and system thereof

Patent number: 10635722

Abstract: Provided are a decentralized system and a method of managing electronic documents of title (EDTs).

Type: Grant

Filed: April 20, 2016

Date of Patent: April 28, 2020

Assignee: OGY DOCS, INC.

Inventors: Gad Ruschin, Or Garbash, Yair Sappir
Policy compliance verification using semantic distance and nearest neighbor search of labeled content

Patent number: 10637826

Abstract: An online system determines whether a test content item violates a policy of the online system. The online system extracts a semantic vector from the test content item and determines a distance between the extracted semantic vector and the stored semantic vectors for content items that have been labeled to indicate whether they violate a policy. Using a nearest neighbor search, the online system selects a set of the stored semantic vectors and assigns a weight to the selected semantic vectors that is inversely related to the distances. The online system then determines whether the test content item violates a policy using a weighted voting scheme, where the labels of the stored semantic vectors are aggregated based on their associated weights. The online system may first attempt to match the test content with known bad content and terminate the more complex nearest neighbor search if such a match is found.
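
The inverse-distance weighted vote in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `weighted_knn_violation`, the Euclidean distance, the brute-force neighbor search, and the small `eps` term are all assumptions made for illustration.

```python
import math


def weighted_knn_violation(test_vec, labeled, k=3, eps=1e-9):
    """Return True if the weighted vote says the test item violates policy.

    labeled is a list of (vector, violates) pairs; violates is a bool.
    """
    # Cheap first pass: an exact match with known-bad content ends the search.
    for vec, violates in labeled:
        if violates and vec == test_vec:
            return True
    # Brute-force nearest neighbor search (assumed Euclidean distance).
    nearest = sorted(labeled, key=lambda item: math.dist(test_vec, item[0]))[:k]
    score = 0.0
    for vec, violates in nearest:
        # Weight is inversely related to distance; eps avoids division by zero.
        weight = 1.0 / (math.dist(test_vec, vec) + eps)
        score += weight if violates else -weight
    return score > 0


labeled = [
    ([0.0, 0.0], False),
    ([0.1, 0.0], False),
    ([1.0, 1.0], True),
    ([0.9, 1.0], True),
]
print(weighted_knn_violation([0.95, 0.9], labeled))
print(weighted_knn_violation([0.05, 0.1], labeled))
```

A production system would likely replace the sorted scan with an approximate nearest neighbor index; the voting step would stay the same.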

Type: Grant

Filed: August 6, 2018

Date of Patent: April 28, 2020

Assignee: Facebook, Inc.

Inventors: Enming Luo, Emanuel Alexandre Strauss
Protected search index

Patent number: 10615965

Abstract: A messaging service provides a search mechanism that utilizes a protected index. The protected index is generated by converting documents maintained by the messaging service into a set of tokens or words. Each token is converted to a corresponding value using a transformation such as a cryptographic hash function. The values are placed into an index that allows the messaging service to efficiently identify a set of documents associated with each particular value. When a document search request is submitted to the messaging service, the messaging service uses the transformation to generate corresponding values for each term in the search request, and uses the index to identify sets of documents associated with the values corresponding to the search terms. The messaging service applies search logic associated with the search request to the identified sets of documents to produce a final set of documents satisfying the search request.
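
As a rough illustration of the idea in this abstract, the following Python sketch builds an inverted index keyed by hashed tokens and answers an AND query over it. The choice of HMAC-SHA256 as the transformation, the whitespace tokenizer, and the names `token_value`, `build_index`, and `search_all` are assumptions; the abstract only says "a transformation such as a cryptographic hash function".

```python
import hashlib
import hmac
from collections import defaultdict


def token_value(token, key):
    # Assumed transformation: keyed hash (HMAC-SHA256) of the lowercased token.
    return hmac.new(key, token.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(documents, key):
    """Map each hashed token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.split():
            index[token_value(token, key)].add(doc_id)
    return index


def search_all(index, terms, key):
    """Return ids of documents containing every search term (AND logic)."""
    sets = [index.get(token_value(term, key), set()) for term in terms]
    return set.intersection(*sets) if sets else set()


key = b"demo-key"  # hypothetical secret, for illustration only
docs = {1: "ship the order today", 2: "order delayed", 3: "ship delayed"}
index = build_index(docs, key)
print(search_all(index, ["ship", "order"], key))
```

The index never stores plaintext tokens, so a reader of the index alone sees only hash values and document ids.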

Type: Grant

Filed: March 22, 2017

Date of Patent: April 7, 2020

Assignee: Amazon Technologies, Inc.

Inventor: Matthew E. Goldberg
Semantic merge of arguments

Patent number: 10614100

Abstract: A method comprising using at least one hardware processor for: receiving a topic under consideration (TUC) and a set of claims referring to the TUC; identifying semantic similarity relations between claims of the set of claims; clustering the claims into a plurality of claim clusters based on the identified semantic similarity relations, wherein said claim clusters represent semantically different claims of the set of claims; and generating a list of non-redundant claims comprising said semantically different claims.
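
The cluster-then-deduplicate flow in this abstract can be sketched in Python. The sketch swaps in word-overlap (Jaccard) similarity and a greedy single-pass clustering as stand-ins for the semantic similarity relations and clustering method of the patent, which the abstract does not specify. The names `jaccard` and `non_redundant_claims` and the 0.5 threshold are hypothetical.

```python
def jaccard(a, b):
    """Word-overlap similarity, used here as a crude stand-in for semantics."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def non_redundant_claims(claims, threshold=0.5):
    """Greedily cluster similar claims and keep one representative per cluster."""
    clusters = []
    for claim in claims:
        for cluster in clusters:
            # Compare against the first claim of the cluster (its representative).
            if jaccard(claim, cluster[0]) >= threshold:
                cluster.append(claim)
                break
        else:
            clusters.append([claim])
    return [cluster[0] for cluster in clusters]


claims = [
    "violent games increase aggression",
    "violent games increase aggression in children",
    "games improve coordination",
]
print(non_redundant_claims(claims))
```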

Type: Grant

Filed: April 29, 2015

Date of Patent: April 7, 2020

Assignee: International Business Machines Corporation

Inventors: Mitesh Khapra, Vikas Raykar, Amrita Saha, Noam Slonim, Ashish Verma
Utilizing discourse structure of noisy user-generated content for chatbot learning

Patent number: 10599885

Abstract: Systems, devices, and methods of the present invention use noisy-robust discourse trees to determine a rhetorical relationship between one or more sentences. In an example, a rhetoric classification application creates a noisy-robust communicative discourse tree. The application accesses a document that includes a first sentence, a second sentence, a third sentence, and a fourth sentence. The application identifies that syntactic parse trees cannot be generated for the first sentence and the second sentence. The application further creates a first communicative discourse tree from the second, third, and fourth sentences and a second communicative discourse tree from the first, third, and fourth sentences. The application aligns the first communicative discourse tree and the second communicative discourse tree and removes any elementary discourse units not corresponding to a relationship that is in common between the first and second communicative discourse trees.

Type: Grant

Filed: June 15, 2018

Date of Patent: March 24, 2020

Assignee: Oracle International Corporation

Inventor: Boris Galitsky
Method and system of determining categories associated with keywords using a trained model

Patent number: 10599731

Abstract: Described is a technique for associating words used in a search query with categories. This technique aims to produce potentially more relevant search results by improving the associations with words used for a search. A machine learning technique is implemented to train a classification model, which may include a word embedding model. The classification model is trained to receive words as input and to create vectors of the words as output. These word vectors may then be mapped to a vector space and the technique may then perform a cluster analysis of the vectors. Based on the cluster analysis, clusters may be identified and each cluster may be associated with a corresponding category.

Type: Grant

Filed: April 26, 2016

Date of Patent: March 24, 2020

Assignee: BAIDU USA LLC

Inventors: Yu Zhu, Lin Li
Point in time expression of emotion data gathered from a chat session

Patent number: 10594638

Abstract: An electronic chat session monitoring device intercepts a text message from an electronic chat session. The text message is generated by a sender and addressed to an addressee. The electronic chat session monitoring device receives a current photo of the sender of the text message in the electronic chat session, which is taken contemporaneously with a generation of the text message by the sender and depicts an emotion of the sender while generating the text message. The electronic chat session monitoring device then transmits both the text message and the current photo of the sender to the addressee.

Type: Grant

Filed: February 13, 2015

Date of Patent: March 17, 2020

Assignee: International Business Machines Corporation

Inventors: James E. Bostick, John M. Ganci, Jr., Sarbajit K. Rakshit, Kimberly G. Starks
Systems and methods for automatic detection of idiomatic expressions in written responses

Patent number: 10585985

Abstract: Methods and systems for scoring written text based on use of idiomatic expressions, including reading pre-selected idiomatic expressions in a canonical form into memory, expanding idiomatic expressions from the canonical form, reading a written response into the memory, pre-processing the written response, searching the pre-processed written response for idiomatic expressions, and assigning a score to the written response. The score may be based at least in part on the number of idiomatic expressions in the written response. Corresponding apparatuses, systems, and methods are also disclosed.
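
The pipeline in this abstract (expand canonical idioms, pre-process the response, search, score) can be sketched in Python. Everything specific here is an assumption: the `{v}` slot notation, the hand-written inflection lists in `IDIOMS`, and a score equal to the raw idiom count. The abstract says only that the score may be based at least in part on the number of idiomatic expressions.

```python
import re

# Hypothetical canonical forms: "{v}" marks a verb slot with listed inflections.
IDIOMS = {
    "{v} the bucket": ["kick", "kicks", "kicked", "kicking"],
    "{v} the ice": ["break", "breaks", "broke", "breaking"],
}


def expand(idioms):
    """Expand each canonical form into its surface variants."""
    return {
        template.format(v=verb)
        for template, verbs in idioms.items()
        for verb in verbs
    }


def preprocess(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    return " ".join(re.findall(r"[a-z']+", text.lower()))


def idiom_score(response, idioms=IDIOMS):
    """Count idiom variants found in the pre-processed response."""
    # Pad with spaces so variants only match on whole-word boundaries.
    cleaned = f" {preprocess(response)} "
    return sum(cleaned.count(f" {variant} ") for variant in expand(idioms))


print(idiom_score("He broke the ice, and later kicked the bucket."))
```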

Type: Grant

Filed: December 14, 2017

Date of Patent: March 10, 2020

Assignee: Educational Testing Service

Inventors: Michael Flor, Beata Beigman Klebanov
Validating the tone of an electronic communication based on recipients

Patent number: 10574605

Abstract: A mechanism is provided for validating the tone of an electronic communication being composed based on the recipients of the electronic communication. At least one tone of the electronic communication being composed by a sender and an identity of each of one or more recipients to whom the electronic communication is to be sent and the sender are identified. One or more previous electronic communications sent to or received from one or more of the one or more recipients and at least one tone of each of the one or more previous electronic communications are identified in order to generate one or more preferred tones. The tone of the electronic communication being composed is compared to the one or more preferred tones. Responsive to identifying a tone discrepancy between the electronic communication being composed and the one or more preferred tones, a notification is presented to the sender.

Type: Grant

Filed: May 18, 2016

Date of Patent: February 25, 2020

Assignee: International Business Machines Corporation

Inventors: Florian Pinel, Edward E. Seabolt
Validating an attachment of an electronic communication based on recipients

Patent number: 10574607

Abstract: A mechanism is provided for validating an attachment to an electronic communication being composed based on the recipients of the electronic communication. An associated tone or theme of the at least one attachment to the electronic communication being composed by a sender and an identity of each of one or more recipients to whom the electronic communication is to be sent and the sender are identified. One or more previous electronic communications sent to or received from one or more of the one or more recipients and at least one tone of each of the one or more previous electronic communications are identified in order to generate one or more preferred tones. Responsive to identifying a tone discrepancy between the tone or theme of the at least one attachment and the one or more preferred tones, a notification is presented to the sender.

Type: Grant

Filed: May 18, 2016

Date of Patent: February 25, 2020

Assignee: International Business Machines Corporation

Inventors: Florian Pinel, Edward E. Seabolt
Image forming apparatus with an improved capability to edit selectable detected areas

Patent number: 10554863

Abstract: An image forming apparatus includes a display device, an input device configured to receive a user operation, a preview processing unit configured to display as preview on the display device a document image as an output target, and an output processing unit configured to output the document image. Further, the preview processing unit (a) detects an area surrounded by a predetermined specific color in the document image and displays the document image as preview on the display device so as to make the detected area editable, and (b) adds a text specified by the user operation in the detected area in accordance with the user operation. The output processing unit outputs the document image in which the text was added.

Type: Grant

Filed: May 7, 2018

Date of Patent: February 4, 2020

Assignee: Kyocera Document Solutions, Inc.

Inventor: Koji Tagaki
Creation of component templates

Patent number: 10552107

Abstract: In one example of the disclosure, a set of electronic document templates is accessed and instances of duplicated document content are identified. Display of a user notice for first duplicated document content is caused. Responsive to receipt of data indicative of a user instruction to create a component template for the first duplicated content, the component template is created and stored.

Type: Grant

Filed: December 2, 2015

Date of Patent: February 4, 2020

Assignee: OPEN TEXT CORPORATION

Inventors: James Matthew Downs, Billy R. Kidwell, Anthony Wiley
Hierarchical speech recognition decoder

Patent number: 10482876

Abstract: A speech interpretation module interprets the audio of user utterances as sequences of words. To do so, the speech interpretation module parameterizes a literal corpus of expressions by identifying portions of the expressions that correspond to known concepts, and generates a parameterized statistical model from the resulting parameterized corpus. When speech is received the speech interpretation module uses a hierarchical speech recognition decoder that uses both the parameterized statistical model and language sub-models that specify how to recognize a sequence of words. The separation of the language sub-models from the statistical model beneficially reduces the size of the literal corpus needed for training, reduces the size of the resulting model, provides more fine-grained interpretation of concepts, and improves computational efficiency by allowing run-time incorporation of the language sub-models.

Type: Grant

Filed: October 1, 2018

Date of Patent: November 19, 2019

Assignee: Interactions LLC

Inventors: Ethan Selfridge, Michael Johnston
Intention inference system and intention inference method

Patent number: 10460034

Abstract: An intention inference system includes a morphological analyzer to perform morphological analysis for a complex sentence with multiple intentions involved, a syntactic analyzer to perform syntactic analysis for the complex sentence morphologically analyzed by the morphological analyzer and to divide it into the first simple sentence and the second simple sentence, an intention inference unit to infer the first intention involved in the first simple sentence and the second intention involved in the second simple sentence, a feature extractor to extract as the first feature a morpheme showing execution order of operations involved in the first simple sentence and to extract as the second feature a morpheme showing execution order of operations involved in the second simple sentence, and an execution order inference unit to infer the execution order of the first operation corresponding to the first intention and the second operation corresponding to the second intention on the basis of the first feature and the

Type: Grant

Filed: January 28, 2015

Date of Patent: October 29, 2019

Assignee: Mitsubishi Electric Corporation

Inventors: Yi Jing, Yusuke Koji, Jun Ishii
Unstructured data analytics systems and methods

Patent number: 10452698

Abstract: An unstructured data analytics system, including: an unstructured data analytics algorithm resident on a server and accessible via a browser operable for receiving unstructured data from one or more remote sources, applying one or more analytical tools to the unstructured data, and displaying summary information to one or more users; wherein the summary information is displayed to the one or more users in a presentation layer, an exploration layer, and an annotation layer. The unstructured data analytics algorithm is also operable for receiving outside data from one or more remote sources. The presentation layer displays one or more of the unstructured data, a summary of the unstructured data, and the summary information. The exploration layer allows the one or more users to modify the granularity of the summary information, thereby modifying the granularity of the presentation layer. The one or more users can interact with the unstructured data analytics system simultaneously via the annotation layer.

Type: Grant

Filed: May 11, 2016

Date of Patent: October 22, 2019

Assignee: Stratifyd, Inc.

Inventor: Xiaoyu Wang
Ambiguity resolving conversational understanding system

Patent number: 10446137

Abstract: Systems, components, devices, and methods for resolving ambiguity in a conversational understanding system are provided. A non-limiting example is a system or method for resolving ambiguity in a conversational understanding system. The method includes the steps of receiving a natural language input and identifying an agent action based on the natural language input. The method also includes the steps of determining an ambiguity value associated with the agent action and evaluating the ambiguity value against an ambiguity condition. The method includes the steps of, when it is determined that the ambiguity value meets the ambiguity condition: selecting a prompting action based on the ambiguity associated with the identified agent action, performing the prompting action, receiving additional input in response to the prompting action, and updating the agent action to resolve the ambiguity based on the additional input. The method also includes the step of performing the agent action.

Type: Grant

Filed: October 19, 2016

Date of Patent: October 15, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Omar Zia Khan, Ruhi Sarikaya, Divya Jetley
Developing contextual information from an image

Patent number: 10444894

Abstract: In an example implementation according to aspects of the present disclosure, a method may include capturing data entered on a touch sensitive mat or on an object physically disposed on the touch sensitive mat. The method further includes extracting the data from the captured image, and developing contextual information from the data extracted from the captured image. The method further includes projecting the contextual information onto the touch sensitive mat or onto the object.

Type: Grant

Filed: September 12, 2014

Date of Patent: October 15, 2019

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Immanuel Amo, Diogo Lima, Nicholas P Lyons, Arman Alimian
Computer implemented and computer controlled method, computer program product and platform for arranging data for processing and storage at a data storage engine

Patent number: 10437872

Abstract: A computer implemented and computer controlled method of arranging data for processing and storage thereof at a data storage engine. To identified data elements, an action is assigned from a plurality of actions as well as an association between data elements of an action according to a respective topology comprised of an ordered plurality of data categories including a subject data category, an object data category, a spatial data category and a temporal data category. By matching the identified data elements with action topology combinations and using the order of the data elements, one data element is matched with one data category. Instance information is supplemented to matched action topology combinations. In a computer readable format, at a data storage engine, identified data elements, instance information and associations between identifiers resulting from identifying, assigning, matching and supplementing are stored.

Type: Grant

Filed: May 26, 2017

Date of Patent: October 8, 2019

Assignee: DYNACTIONIZE N.V.

Inventor: Michael Rik Frans Brands
Determination method and determination apparatus

Patent number: 10437932

Abstract: A determination method executed by a computer including a memory and a processor coupled to the memory, includes receiving a plurality of sentences and designation of terms included in the plurality of sentences, generating, for each term for which the designation is received, information indicating a relation between the term and each of other terms included in one of the plurality of sentences containing the term, extracting, for each term for which the designation is received, information indicating a specific relation from the generated information indicating the relation, generating characteristic information that uses the extracted information as a feature, and determining similarity between a plurality of the terms based on the generated characteristic information for each term.

Type: Grant

Filed: March 19, 2018

Date of Patent: October 8, 2019

Assignee: FUJITSU LIMITED

Inventors: Kazuo Mineno, Nobuko Takase, Naohiro Itou
Method and system for generating a conversational agent by automatic paraphrase generation based on machine translation

Patent number: 10423665

Abstract: The present teaching relates to generating a conversational agent. In one example, a plurality of input utterances may be received from a developer. A paraphrase model is obtained. The paraphrase model is generated based on machine translation. For each of the plurality of input utterances, one or more paraphrases of the input utterance are generated based on the paraphrase model. For each of the plurality of input utterances, at least one of the one or more paraphrases is selected based on an instruction from the developer to generate selected paraphrases. The conversational agent is generated based on the plurality of input utterances and the selected paraphrases.

Type: Grant

Filed: August 2, 2017

Date of Patent: September 24, 2019

Assignee: Oath Inc.

Inventors: Ankur Gupta, Timothy Daly, Tularam Ban
Generating multilingual queries

Patent number: 10423615

Abstract: The method includes monitoring a computing device for language settings during user-generated content creation and detecting one or more language settings. The method further includes analyzing user-created content to detect a language from a text of the user-generated content. The method further includes compiling a list of scored preferred languages for the computing device based on the detected language settings and the detected language of the text. The method further includes intercepting a query from the computing device. The method further includes analyzing a text of the intercepted query in a plurality of selected languages based on a language setting of a user interface application, a detected language of the query, and a predetermined number of preferred languages of the computing device to produce results of analysis for each selected language. The method further includes generating a multilingual query based on the results of analysis for the selected languages.

Type: Grant

Filed: June 10, 2016

Date of Patent: September 24, 2019

Assignee: International Business Machines Corporation

Inventors: Leonid Bolshinsky, Vladimir Gamaley, Sharon Krisher
Systems and methods for verbatim-text mining

Patent number: 10417269

Abstract: A system and method for verbatim-text mining including parsing documents of a text corpus into a plurality of individual sentences, assigning a sentence identifier to one or more individual sentences of the plurality of individual sentences, generating a plurality of n-Gram strings comprising a plurality of n-Grams from words within the individual sentence, applying an inverted index to the n-Gram string, combining an index data structure of one n-Gram string with an index data structure of another n-Gram string forming a merged index data structure when the index data structure of one n-Gram string shares a predetermined percentage of sentence identifiers of the index data structure of another n-Gram string, assigning a group identifier to the merged index data structure of one or more merged index data structures, and creating a data set comprising the sentence identifier, the group identifier and the associated n-Gram string.
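
The first stages of this abstract (sentence splitting, sentence identifiers, n-gram generation, inverted index) can be sketched in Python. The sketch stops before the merge and group-identifier steps, and its regex-based sentence splitter, word tokenizer, and function names are assumptions rather than details from the patent.

```python
import re
from collections import defaultdict


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def ngram_index(documents, n=3):
    """Return (sentences by id, inverted index from n-gram to sentence ids)."""
    sentences = {}
    index = defaultdict(set)
    for doc in documents:
        for sentence in split_sentences(doc):
            sid = len(sentences)  # sequential sentence identifier
            sentences[sid] = sentence
            words = re.findall(r"\w+", sentence.lower())
            for i in range(len(words) - n + 1):
                index[" ".join(words[i:i + n])].add(sid)
    return sentences, index


def repeated_ngrams(index, min_sentences=2):
    """Keep n-grams that occur verbatim in at least min_sentences sentences."""
    return {g: ids for g, ids in index.items() if len(ids) >= min_sentences}


docs = [
    "The court held that the statute applies. We disagree.",
    "Here the court held that notice was given.",
]
sentences, index = ngram_index(docs)
print(repeated_ngrams(index))
```

Merging would then compare the sentence-id sets of different n-grams and combine those that overlap by a chosen percentage.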

Type: Grant

Filed: March 13, 2017

Date of Patent: September 17, 2019

Assignee: LexisNexis, a division of Reed Elsevier Inc.

Inventor: Paul Zhang
Language translation for display device

Patent number: 10409919

Abstract: A display method includes reading from a memory a language setting representing an original language and a first target language; detecting a first set of one or more characters input in the original language; recognizing the first set of one or more characters as first text; translating the first text from the original language to the first target language; displaying the translated first text on one or more display areas; translating the translated first text back to the original language; and displaying the first text translated back to the original language on the one or more display areas.

Type: Grant

Filed: September 28, 2015

Date of Patent: September 10, 2019

Assignee: Konica Minolta Laboratory U.S.A., Inc.

Inventors: Howard Rubin, Isao Hayami
Generating multilingual queries

Patent number: 10409810

Abstract: The method includes monitoring a computing device for language settings during user-generated content creation and detecting one or more language settings. The method further includes analyzing user-created content to detect a language from a text of the user-generated content. The method further includes compiling a list of scored preferred languages for the computing device based on the detected language settings and the detected language of the text. The method further includes intercepting a query from the computing device. The method further includes analyzing a text of the intercepted query in a plurality of selected languages based on a language setting of a user interface application, a detected language of the query, and a predetermined number of preferred languages of the computing device to produce results of analysis for each selected language. The method further includes generating a multilingual query based on the results of analysis for the selected languages.

Type: Grant

Filed: May 8, 2015

Date of Patent: September 10, 2019

Assignee: International Business Machines Corporation

Inventors: Leonid Bolshinsky, Vladimir Gamaley, Sharon Krisher
Unsupervised neural based hybrid model for sentiment analysis of web/mobile application using public data sources

Patent number: 10394959

Abstract: Machine training for determining sentiments in social network communications. A text document is extracted from a web site and tokenized into tokens. The tokens are input to a word to vector conversion model to generate word vectors. A term frequency inverse document frequency (TF-IDF) algorithm converts the word vectors to sentence vectors. A randomly selected subset of the sentence vectors is tagged and used to train a classifier. The classifier takes a sentence vector and predicts a sentiment associated with the sentence vector. Predicted sentiment associated with each of the sentence vectors may be combined to generate a sentiment associated with the text document.

Type: Grant

Filed: December 21, 2017

Date of Patent: August 27, 2019

Assignee: International Business Machines Corporation

Inventors: Ankur Tagra, Rajat Verma, Sudarshan Narayanan
Paraphrasing text in a webpage

Patent number: 10387529

Abstract: The present invention may be a method, a system, and/or a computer program product. An embodiment of the present invention provides a method for paraphrasing, on a client computer, text in a webpage, the method comprising the following: transferring a request for a webpage including a plurality of passages of text to a server; receiving the webpage from the server in response to the request; judging whether or not the received webpage has text which is a subject of paraphrase; in a case where the judgment is positive, paraphrasing the text; and displaying, on a display, the webpage including the paraphrased text. Another embodiment of the present invention provides a method for updating, on a server, text in a webpage, the method comprising the following: receiving, from each of the devices, a set of URLs of a webpage, a location path of text which is a subject of paraphrase in the webpage, and paraphrased text; and replacing text in the webpage with text among the received text.

Type: Grant

Filed: February 16, 2017

Date of Patent: August 20, 2019

Assignee: International Business Machines Corporation

Inventors: Kentaroh Noji, Akihiko Takajo, Yukiko Yasuda
System and method for automatic content aggregation evaluation

Patent number: 10380126

Abstract: Systems and methods for content aggregation creation are disclosed herein. The system can include memory having a content database and an aggregation database. The system can include a user device having a first network interface and a first I/O subsystem. The system can include a server that can: provide content to the user device via a first electrical signal; receive a selection of a portion of the provided content from the user device via a second electrical signal; automatically extract sentences from the selected portion of the provided content via a natural language processor; automatically generate a parse tree for one of the automatically extracted sentences; identify noun phrases from the part of speech tags within the parse tree; place content associated with one of the noun phrases in a content aggregation; and output the content aggregation to the user device.

Type: Grant

Filed: December 13, 2016

Date of Patent: August 13, 2019

Assignee: PEARSON EDUCATION, INC.

Inventors: Sean York, Tim Stewart, David Strong, Scott Hellman, William Murray
DITA relationship table based on contextual taxonomy density

Patent number: 10372744

Abstract: A computer scans a DITA library to identify DITA topic files. The computer then determines whether the identified DITA file has a concept, task, or reference scheme. Based on determining that the identified DITA topic file has a concept scheme, the computer generates a subject taxonomy. Based on determining that the identified DITA topic file has a task scheme, the computer generates a navigation taxonomy. Based on determining that the identified DITA topic file has a reference scheme, the computer generates a command relational taxonomy. Based on the generated subject, navigation, and command relational taxonomies, the computer generates a DITA file relationship table based on the contextual taxonomy density of the aforementioned taxonomies.

Type: Grant

Filed: June 3, 2016

Date of Patent: August 6, 2019

Assignee: International Business Machines Corporation

Inventors: Balaji S. Kumar, Vishal G. Palliyathu, Harpreet Singh
Bilingual corpus update method, bilingual corpus update apparatus, and recording medium storing bilingual corpus update program

Patent number: 10354646

Abstract: A third sentence obtained by replacing a first phrase of a first sentence with a second phrase is input, and it is judged whether a third phrase is included in a first database including at least a phrase used in written text. If the third phrase is not included, a first evaluation value in the first database is calculated for a seventh phrase obtained by replacing the second phrase of the third phrase with a sixth phrase. It is judged whether the third phrase is included in a second database including at least a phrase used in spoken text and whether a second evaluation value calculated from the first evaluation value satisfies a predetermined condition. If the third phrase is included, and the second evaluation value satisfies the predetermined condition, the third sentence and the second sentence as a pair are added to a bilingual corpus.

Type: Grant

Filed: August 29, 2017

Date of Patent: July 16, 2019

Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.

Inventors: Nanami Fujiwara, Masaki Yamauchi, Masahiro Imade
Information processing system, an information processing method and a computer readable storage medium

Patent number: 10354010

Abstract: An information processing system to increase weights of words that are related to a text, but that do not explicitly occur in the text, in a weight vector representing the text, is provided. An adjusting system (100) includes a distance storing unit (110) and an adjusting unit (120). The distance storing unit (110) stores distances between any two terms of a plurality of terms. The distance between two terms becomes smaller as the two terms are semantically more similar. The adjusting unit (120) adjusts a weight of each term of the plurality of terms in a weight vector including weights of the plurality of terms and representing a text, on the basis of a distance between each term and another term in the weight vector and a weight of the other term.
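
One way to picture the adjustment in this abstract is the Python sketch below, in which a term borrows weight from semantically close terms. The update rule (take the maximum of the term's own weight and each other term's weight damped by `exp(-decay * distance)`) is an assumption chosen for illustration; the abstract does not state the actual formula. The `frozenset` keys for the distance table are likewise just a convenient hypothetical representation.

```python
import math


def adjust_weights(weights, distance, decay=1.0):
    """Raise each term's weight using nearby terms' weights.

    weights:  dict mapping term -> weight in the text's weight vector
    distance: dict mapping frozenset({a, b}) -> semantic distance (smaller = closer)
    """
    adjusted = {}
    for term, own_weight in weights.items():
        best = own_weight
        for other, other_weight in weights.items():
            if other == term:
                continue
            d = distance[frozenset((term, other))]
            # Assumed rule: a close, heavily weighted neighbor lifts this term.
            best = max(best, other_weight * math.exp(-decay * d))
        adjusted[term] = best
    return adjusted


weights = {"car": 1.0, "automobile": 0.0, "banana": 0.0}
distance = {
    frozenset(("car", "automobile")): 0.1,
    frozenset(("car", "banana")): 5.0,
    frozenset(("automobile", "banana")): 5.0,
}
print(adjust_weights(weights, distance))
```

In this toy input, "automobile" does not occur in the text but ends up with a weight close to that of "car", while the unrelated "banana" stays near zero.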

Type: Grant

Filed: April 24, 2015

Date of Patent: July 16, 2019

Assignee: NEC CORPORATION

Inventors: Daniel Georg Andrade Silva, Akihiro Tamura, Masaaki Tsuchida
Automatically detecting internationalization (i18n) issues in source code as part of static source code analysis

Patent number: 10339029

Abstract: A method of detecting potential internationalization issues in source code may include installing a plug-in component in a stand-alone static source code analysis program/application that is configured to enable detection of internationalization issues in source code. The method may also include automatically creating a repository comprising a plurality of internationalization rules for a plurality of programming languages that are provided by the plug-in and accessing a subset of the plurality of internationalization rules corresponding to a particular programming language of the plurality of programming languages. The method may include creating a quality profile for the particular programming language using the subset of the plurality of internationalization rules, scanning source code of a software product for potential issues on a block level, and identifying and displaying the detected internationalization issues in the source code.

Type: Grant

Filed: November 1, 2016

Date of Patent: July 2, 2019

Assignee: CA, Inc.

Inventors: Kumar Arnesh Rameshwar, Guru Prasadareddy Narapu Reddy
Crowdsourced evaluation and refinement of search clusters

Patent number: 10331681

Abstract: Implementations provide an improved system for presenting search results based on entity associations of the search items. An example method includes, for each of a plurality of crowdsource workers, initiating display of a first randomly selected cluster set from a plurality of cluster sets to the crowdsource worker. Each cluster set represents a different clustering algorithm applied to a set of search items responsive to a query. The method also includes receiving cluster ratings for the first cluster set from the crowdsource worker and calculating a cluster set score for the first cluster set based on the cluster ratings. This is repeated for remaining cluster sets in the plurality of cluster sets. The method also includes storing a cluster set definition for a highest scoring cluster set, associating the cluster set definition with the query, and using the definition to display search items responsive to the query.

Type: Grant

Filed: April 11, 2016

Date of Patent: June 25, 2019

Assignee: GOOGLE LLC

Inventors: Jilin Chen, Amy Xian Zhang, Sagar Jain, Lichan Hong, Ed Huai-Hsin Chi
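
The scoring step in the abstract above (rate each cluster set, score it, keep the best) can be sketched as follows. Using the mean rating as the cluster set score is an assumption, and the cluster set names and ratings are hypothetical.

```python
from statistics import mean

def best_cluster_set(ratings):
    """Score each cluster set by its mean worker rating and return the winner.

    ratings: dict cluster-set name -> list of numeric ratings from workers
    """
    scores = {name: mean(values) for name, values in ratings.items() if values}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical ratings for two clustering algorithms applied to one query.
ratings = {
    "by_entity_type": [4, 5, 3],
    "by_topic": [2, 3, 3],
}
best, scores = best_cluster_set(ratings)
```

The winning name would then be stored with the query as its cluster set definition.
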
Method for classifying a new instance

Patent number: 10324971

Abstract: A method for classifying a new instance including a text document by using training instances with class including labeled data and zero or more training instances with class including unlabeled data, comprising: estimating a word distribution for each class by using the labeled data and the unlabeled data; estimating a background distribution and a degree of interpolation between the background distribution and the word distribution by using the labeled data and the unlabeled data; calculating two probabilities, that the word was generated from the word distribution and that the word was generated from the background distribution; combining the two probabilities by using the interpolation; combining the resulting probabilities of all words to estimate a document probability for the class that indicates the document is generated from the class; and classifying the new instance as a class for which the document probability is the highest.

Type: Grant

Filed: June 20, 2014

Date of Patent: June 18, 2019

Assignee: NEC Corporation

Inventors: Daniel Georg Andrade Silva, Hironori Mizuguchi, Kai Ishikawa
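
The interpolation between a class word distribution and a background distribution, as described in the abstract above, resembles a smoothed naive Bayes classifier. The sketch below is a simplification under stated assumptions: it uses labeled data only, and the interpolation degree `lam` is a fixed hypothetical constant, whereas the abstract estimates it from labeled and unlabeled data.

```python
import math
from collections import Counter

def word_distribution(docs):
    """Maximum-likelihood word distribution over a list of tokenized documents."""
    counts = Counter(word for doc in docs for word in doc)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def classify(doc, class_dists, background, lam=0.7, floor=1e-9):
    """Pick the class with the highest interpolated log-probability for doc.

    lam is a fixed, hypothetical interpolation degree (0 < lam < 1);
    floor is an assumed probability for words unseen in the background.
    """
    best_class, best_score = None, -math.inf
    for label, dist in class_dists.items():
        score = 0.0
        for word in doc:
            p_class = dist.get(word, 0.0)
            p_background = background.get(word, floor)
            # Combine the two per-word probabilities, then sum logs over words.
            score += math.log(lam * p_class + (1 - lam) * p_background)
        if score > best_score:
            best_class, best_score = label, score
    return best_class

# Hypothetical labeled training data.
labeled = {
    "sports": [["goal", "match", "team"], ["team", "win"]],
    "cooking": [["recipe", "oven", "salt"], ["salt", "pan"]],
}
class_dists = {label: word_distribution(docs) for label, docs in labeled.items()}
background = word_distribution([d for docs in labeled.values() for d in docs])
prediction = classify(["team", "goal"], class_dists, background)
```
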
Recombination techniques for natural language generation

Patent number: 10325026

Abstract: A technique for generating a new equivalent phrase for an input phrase includes receiving a first input phrase for natural language expansion. Tokens that correspond to parts of speech are generated for the first input phrase. An original grammar tree is generated using at least some of the tokens. One or more alternate grammar trees are generated that are different from the original grammar tree but substantially equivalent to the original grammar tree. One or more synonyms for at least one of the tokens are generated. Finally, one or more new phrases are generated based on the one or more alternate grammar trees and the one or more synonyms.

Type: Grant

Filed: September 25, 2015

Date of Patent: June 18, 2019

Assignee: International Business Machines Corporation

Inventor: Bryan D. Cardillo
Computer-program products and methods for annotating ambiguous terms of electronic text documents

Patent number: 10289667

Abstract: Computer-program products and methods for automatically annotating terms, such as ambiguous terms, in an electronic text document are disclosed. In one embodiment, a method of annotating a text document includes determining, by a computing device, a term of interest within the text document. The method further includes searching a data structure including incongruous term pairs (tx, tt) determined from a controlled vocabulary for the term of interest appearing as a term tt, wherein the term tt is a linguistic head of a term tx of the incongruous term pairs (tx, tt). The method further includes annotating the term of interest with a meaning provided by the controlled vocabulary only if a term tx of the incongruous term pairs (tx, tt) associated with the term of interest in the data structure is not present within a predetermined textual distance of the term of interest in the text document.

Type: Grant

Filed: September 6, 2016

Date of Patent: May 14, 2019

Assignee: Elsevier B.V.

Inventors: Marius Doornenbal, Inga Kohlhof
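
The annotation rule in the abstract above can be sketched in a few lines: a term tt is annotated with its controlled-vocabulary meaning only if no longer term tx that it heads appears nearby. The function names, the token window of 3, and the "plant" / "power plant" pair are hypothetical illustrations, not taken from the patent.

```python
def contains_phrase(tokens, phrase):
    """True if the whitespace-separated phrase occurs as a token run in tokens."""
    words = phrase.split()
    return any(tokens[i:i + len(words)] == words
               for i in range(len(tokens) - len(words) + 1))

def may_annotate(tokens, index, incongruous_pairs, window=3):
    """Return True if tokens[index] may receive its vocabulary meaning.

    incongruous_pairs maps a head term tt to the terms tx it heads.
    Annotation is suppressed when any tx occurs within `window` tokens.
    """
    term = tokens[index]
    nearby = tokens[max(0, index - window):index + window + 1]
    return not any(contains_phrase(nearby, tx)
                   for tx in incongruous_pairs.get(term, ()))

# Hypothetical incongruous pair: "plant" is the head of "power plant".
PAIRS = {"plant": ["power plant"]}

near = "the power plant was shut down".split()
alone = "the plant needs water".split()
far = "the power plant closed and years later a plant grew there".split()
```

Here `may_annotate(near, 2, PAIRS)` is false because "power plant" sits inside the window, while the other two occurrences of "plant" may be annotated.
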
Source language content scoring for localizability

Patent number: 10275459

Abstract: A content management system (CMS) and a translation management system (TMS) can utilize content dimensions for content items to manage and translate the content items between languages. Machine and human translations of complex dynamic content can also be improved by pre-rendering the content to remove localization-related syntax prior to machine or human translation. Content items can also be scored as to their suitability for localization prior to translation, and translation can be skipped for content items that do not have a sufficiently high score. Semantic and natural language processing (NLP) techniques can also be utilized for content categorization and routing. Translations of content items can also be continuously refined and higher quality re-translated content can be provided in an automated fashion.

Type: Grant

Filed: September 28, 2016

Date of Patent: April 30, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Paul Kasper, Pallami Bhattacharjee, Paul Christopher Cerda, William Joseph Kaper, Thibault Pierre Seillier, Kelly Duggar Wiggins
Identification and parsing of a log record in a merged log record stream

Patent number: 10275449

Abstract: A computing device automatically creates a log record recognizer expression and uses the log record recognizer expression to identify a log record type for a log record to parse the log record. A log record type regular expression is selected from log record type regular expressions and is separated into subexpressions that are normalized and are reassembled into an expression recognizer for each log record type regular expression. The expression recognizer for each is read into a data structure. The recognizer expressions are sorted based on an order associated with an expression operator of each subexpression. A log recognizer expression is created from each read expression recognizer included in the sorted recognizer expressions. A log record type of a log record is identified using the created log recognizer expression. A log record type regular expression is selected. The log record is parsed using the selected log record type regular expression.

Type: Grant

Filed: October 3, 2018

Date of Patent: April 30, 2019

Assignee: SAS INSTITUTE INC.

Inventors: Keefe Hayes, Robert N. Bonham
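
The two-stage idea in the abstract above (one combined recognizer identifies the record type, then the type's own regular expression parses the record) can be approximated as below. This is a simplified sketch: it skips the subexpression normalization and operator-based sorting the abstract mentions, and the record types and patterns are hypothetical.

```python
import re

# Hypothetical log record types and their parsing expressions.
RECORD_TYPES = {
    "access": r"(?P<ip>\d+\.\d+\.\d+\.\d+) (?P<method>GET|POST) (?P<path>\S+)",
    "error": r"ERROR \[(?P<module>\w+)\] (?P<message>.+)",
}

def strip_names(pattern):
    """Turn named groups into non-capturing groups so patterns can be combined."""
    return re.sub(r"\(\?P<\w+>", "(?:", pattern)

# One combined recognizer: each alternative is wrapped in a group named after its type.
RECOGNIZER = re.compile(
    "|".join(f"(?P<{name}>{strip_names(p)})" for name, p in RECORD_TYPES.items())
)

def parse(line):
    """Identify the record type with the recognizer, then parse with that type's regex."""
    match = RECOGNIZER.fullmatch(line)
    if match is None:
        return None, {}
    record_type = match.lastgroup  # name of the alternative that matched
    fields = re.fullmatch(RECORD_TYPES[record_type], line).groupdict()
    return record_type, fields
```

For example, `parse("ERROR [db] connection lost")` identifies the `error` type and extracts the `module` and `message` fields.
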
Systems and methods for providing unread content items from specified entities

Patent number: 10270723

Abstract: Systems, methods, and non-transitory computer-readable media can acquire a specified set of one or more entities associated with a user of a social networking system. A collection of content items provided by the specified set of one or more entities can be detected. One or more content items that are unread by the user can be identified out of the collection of content items. The one or more content items unread by the user can be sorted, in a chronological order, to produce a sorted set of one or more unread content items. An interface can be provided to the user for accessing the sorted set of one or more unread content items.

Type: Grant

Filed: March 23, 2015

Date of Patent: April 23, 2019

Assignee: Facebook, Inc.

Inventors: Gregory Matthew Marra, Michael Novati, Zhiqiu Kong
Automatically extracting profile feature attribute data from event data

Patent number: 10255300

Abstract: Automatically extracting profile feature attribute data from event data is disclosed, including: receiving a set of event data; receiving a feature associated with a profiling technique; determining that a plurality of events associated with a user in the set of event data corresponds to a first attribute corresponding to the feature, wherein the first attribute corresponds to a first bin having a first defined value; determining that the plurality of events associated with the user in the set of event data corresponds to a second attribute corresponding to the feature, wherein the second attribute corresponds to a second bin having a second defined value; and creating a user record corresponding to the user indicating presence in the first bin and the second bin.

Type: Grant

Filed: May 14, 2015

Date of Patent: April 9, 2019

Assignee: Google LLC

Inventors: Anant Deep Jhingran, Krishna Kumar Kesavan, Joy Aloysius Thomas, Jagdish Chand, Sridhar Rajagopalan
System and method for adaptive spell checking

Patent number: 10229108

Abstract: A system and method for adaptive spell checking and correction. The method includes tracking frequencies of historical replacement strings of characters, and providing a list of “n” number of the historical replacement strings of characters in response to a string of characters which were previously changed or are not recognized.

Type: Grant

Filed: January 14, 2016

Date of Patent: March 12, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: William K. Bodin, Gregory J. Boss, Rick A. Hamilton, II, John S. Langford
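
A minimal sketch of the frequency tracking described in the abstract above: each time a string is replaced, the replacement is counted, and the "n" most frequent historical replacements are offered the next time that string appears. The class and method names are hypothetical.

```python
from collections import Counter, defaultdict

class ReplacementHistory:
    """Tracks how often each string was replaced by each correction."""

    def __init__(self):
        self._history = defaultdict(Counter)

    def record(self, original, replacement):
        """Count one more occurrence of original being changed to replacement."""
        self._history[original][replacement] += 1

    def suggest(self, original, n=3):
        """Return up to n historical replacements, most frequent first."""
        counts = self._history.get(original)
        if counts is None:
            return []
        return [replacement for replacement, _ in counts.most_common(n)]

history = ReplacementHistory()
history.record("teh", "the")
history.record("teh", "the")
history.record("teh", "ten")
```

With this history, `history.suggest("teh", n=1)` offers only "the", the most frequent past replacement.
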
Systems, methods, apparatuses, and computer program products for truncated, encrypted searching of encrypted identifiers

Patent number: 10216940

Abstract: Methods, apparatuses, and computer program products are provided for truncated, encrypted searching of encrypted identifiers. A method may include receiving patient information associated with a plurality of patients and including a patient identifier of a sequence of characters for each of the plurality of patients. Methods may further include: extracting a first subset of the sequence of characters from each of the patient identifiers; encrypting the first subset of the sequence of characters from each of the patient identifiers to form a first truncated encrypted identifier for each of the plurality of patients; encrypting each of the patient identifiers to create an encrypted patient identifier for each of the plurality of patients; and storing the first truncated encrypted identifiers and the encrypted patient identifiers for each of the plurality of patients.

Type: Grant

Filed: March 27, 2015

Date of Patent: February 26, 2019

Assignee: CHANGE HEALTHCARE HOLDINGS, LLC

Inventor: Mike Zwinger
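
The storage scheme in the abstract above can be sketched as follows. The abstract does not name a cipher, so this example substitutes a keyed HMAC-SHA256 digest as a deterministic stand-in for encryption (it is not reversible, unlike real encryption). The key, the prefix length of 4, and the sample identifiers are hypothetical.

```python
import hashlib
import hmac

KEY = b"demo-key-not-for-production"  # hypothetical key for illustration only

def protect(value, key=KEY):
    """Deterministic keyed transform standing in for the encryption step."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def build_index(patient_ids, prefix_len=4):
    """Map each protected truncated identifier to the protected full identifiers."""
    index = {}
    for pid in patient_ids:
        truncated = protect(pid[:prefix_len])  # first subset of the characters
        index.setdefault(truncated, []).append(protect(pid))
    return index

def search_by_prefix(index, prefix):
    """Look up candidates by transforming the query prefix the same way."""
    return index.get(protect(prefix), [])

index = build_index(["ABCD1234", "ABCD9999", "WXYZ0001"])
candidates = search_by_prefix(index, "ABCD")
```

Because the same transform is applied to the query, the lookup never needs the plaintext identifiers to be stored.
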
Determining hierarchical user interface controls during content playback

Patent number: 10198245

Abstract: Systems and methods are provided for determining user interface controls associated with hierarchical levels during media content playback. In some embodiments, instructions may be provided to display user interface controls associated with a level of media content, such as a chapter level. A change in the level of media content and/or the user interface controls associated with the level may be a response to a level change request. A level change request may be initiated by user input and/or may automatically be initiated by the current content playback position.

Type: Grant

Filed: May 9, 2014

Date of Patent: February 5, 2019

Assignee: Audible, Inc.

Inventors: Timothy Jaeger, Alison Mae Go
Method and apparatus for exploiting language skill information in automatic speech recognition

Patent number: 10186256

Abstract: Typical speech recognition systems usually use speaker-specific speech data to apply speaker adaptation to models and parameters associated with the speech recognition system. Given that speaker-specific speech data may not be available to the speech recognition system, information indicative of language skills is employed in adapting configurations of a speech recognition system. According to at least one example embodiment, a method and corresponding apparatus, for speech recognition comprise maintaining information indicative of language skills of users of the speech recognition system. A configuration of the speech recognition system for a user is determined based at least in part on corresponding information indicative of language skills of the user. Upon receiving speech data from the user, the configuration of the speech recognition system determined is employed in performing speech recognition.

Type: Grant

Filed: January 23, 2014

Date of Patent: January 22, 2019

Assignee: Nuance Communications, Inc.

Inventors: Weiying Li, Daniel Willett
Method and system of performing a translation

Patent number: 10180940

Abstract: A translation method is disclosed herein. The method includes determining a target object to be translated, the target object including a plurality of elements; dividing the target object to be translated according to a language correspondence relationship to obtain at least one element set; determining a weight value of a second object corresponding to each first object in each element set according to the language correspondence relationship; determining a comparison value associated with each element set according to the determined weight value and selecting an element set with the maximum comparison value; determining a second object with the maximum weight value corresponding to each first object in the selected element set according to the correspondence relationship, combining all the determined second objects to form a translation content of the target object.

Type: Grant

Filed: September 22, 2016

Date of Patent: January 15, 2019

Assignee: Alibaba Group Holding Limited

Inventors: Hongfei Jiang, Jun Lu, Weihua Luo, Feng Lin
Extracting veiled meaning in natural language content

Patent number: 10176166

Abstract: Mechanisms for identifying hidden meaning in a portion of natural language content are provided. A primary portion of natural language content is received and a secondary portion of natural language content is identified that references the natural language content. The secondary portion of natural language content is analyzed to identify indications of meaning directed to elements of the primary portion of natural language content. A probabilistic model is generated based on the secondary portion of natural language content modeling a probability of hidden meaning in the primary portion of natural language content. A hidden meaning statement data structure is generated for the primary portion of natural language content based on the probabilistic model.

Type: Grant

Filed: September 1, 2017

Date of Patent: January 8, 2019

Assignee: International Business Machines Corporation

Inventors: Donna K. Byron, Benjamin L. Johnson, Lakshminarayanan Krishnamurthy, Krishna Kummamuru, Timothy P. Winkler
Unsupervised ontology-based graph extraction from texts

Patent number: 10169454

Abstract: A method for extracting a relations graph uses an ontology graph in which nodes represent entity classes or concepts and edges represent properties of the classes. A property is associated with a constraint which defines a range of values that can be taken without incurring a cost. Input text in which entity and concept mentions are identified is received. An optimal set of alignments between a subgraph of the ontology graph and the identified mentions is identified by optimizing a function of constraint costs incurred by the alignments and a distance measure computed over the set of alignments. The relations graph is generated, based on the optimal set of alignments. The relations graph represents a linked set of relations instantiating a subgraph of the ontology. The relations graph can include relations involving implicit mentions corresponding to subgraph nodes that are not aligned to any of the concept or entity mentions.

Type: Grant

Filed: May 17, 2016

Date of Patent: January 1, 2019

Assignee: XEROX CORPORATION

Inventors: Salah Ait-Mokhtar, Vassilina Nikoulina
Personal dictionary

Patent number: 10169322

Abstract: A method includes receiving, at a processor, a request to construct a word entry of a word. The method further includes collecting, by the processor, a user profile. The method further includes selecting, by the processor, one or more definition databases according to the user profile. The method further includes retrieving, by the processor, definitions of the word from the definition databases. The method further includes ranking, by the processor, the definitions retrieved from the definition databases.

Type: Grant

Filed: May 3, 2016

Date of Patent: January 1, 2019

Assignee: Dinky Labs, LLC

Inventors: Alan Rulin Liu, Gina Inan Liu
Cognitive agent for capturing referential information during conversation muting

Patent number: 10146770

Abstract: A mechanism is provided in a data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions executed by the at least one processor to cause the at least one processor to implement a cognitive system for capturing referential information. The cognitive system receives a first indication that a group text messaging conversation is in a muted state for a first user. The cognitive system detects a first use of a referential phrase in the group text messaging conversation during a first time period when the group text messaging conversation is in the muted state. The cognitive system receives a second indication that the group text messaging conversation is in a non-muted state. The cognitive system detects a second use of the referential phrase in the group text messaging conversation during a second time period when the group text messaging conversation is in the non-muted state. The second time period is subsequent to the first time period.

Type: Grant

Filed: December 1, 2016

Date of Patent: December 4, 2018

Assignee: International Business Machines Corporation

Inventors: Robert H. Grant, Jeremy A. Greenberger, Trudy L. Hewitt, Joseph Lam, Francesco C. Schembari
Generating self-support metrics based on paralinguistic information

Patent number: 10147424

Abstract: The present disclosure includes techniques for selecting a response to an audio stream query. In one embodiment, an application server receives an audio stream query including content spoken by a user interacting with a voice-user interface. The application server determines a set of paralinguistic features from the audio stream query, and estimates at least a first attribute of the user based on the set of paralinguistic features. The application server identifies subject matter corresponding to the spoken content in the audio stream query, and determines two or more query responses corresponding to the identified subject matter. The application server then selects one of the query responses to present to the user based, at least in part, on the attribute of the user estimated from the set of paralinguistic features.

Type: Grant

Filed: October 26, 2016

Date of Patent: December 4, 2018

Assignee: INTUIT INC.

Inventors: Benjamin Indyk, Igor A. Podgorny, Raymond Chan
Method for text recognition and computer program product

Patent number: 10133965

Abstract: The invention refers to a method for text recognition, wherein the method is executed by a processor of a computing device and comprises steps of providing a confidence matrix, wherein the confidence matrix is a digital representation of an input sequence, entering a regular expression, searching for a symbol sequence of the input sequence that matches the regular expression, wherein a score value is computed by the processor using confidence values of the confidence matrix, wherein the score value is an indication of the quality of the matching between the symbol sequence of the input sequence and the regular expression. Further, the invention relates to a computer program product which when executed by a processor of a computing device performs the method.

Type: Grant

Filed: December 2, 2015

Date of Patent: November 20, 2018

Assignee: Planet A1 GmbH

Inventor: Welf Wustlich