Patents Assigned to POWERSET, INC.
  • Publication number: 20090138454
    Abstract: Technologies are described herein for generating a semantic translation rule to support natural language search. In one method, a first expression and a second expression are received. A first representation is generated based on the first expression, and a second representation is generated based on the second expression. Aligned pairs of a first term in the first representation and a second term in the second representation are determined. For each aligned pair, the first term and the second term are replaced with a variable associated with the aligned pair. Word facts that occur in both the first representation and the second representation are removed from the first representation and the second representation. The remaining word facts in the first representation are replaced with a broader representation of the word facts. The translation rule including the first representation, an operator, and the second semantic representation is generated.
    Type: Application
    Filed: August 29, 2008
    Publication date: May 28, 2009
    Applicant: POWERSET, INC.
    Inventors: Emmanuel Rayner, Richard Crouch, Hannah Copperman, Giovanni Lorenzo Thione, Martin Henk Van den Berg
  • Publication number: 20090132521
    Abstract: A role tree having nodes corresponding to semantic roles in a hierarchy is defined. A posting list is generated for each association of a term and a semantic role in the hierarchy. The posting lists are stored contiguously on a physical storage medium such that a subtree of the hierarchy of semantic roles can be loaded from the storage medium as a single contiguous block. The posting lists for a subtree of the hierarchy are retrieved by obtaining data identifying the beginning location on the physical storage medium of the posting lists for the term at the top of a desired subtree of the hierarchy and data identifying the length of the posting lists of the desired subtree of the hierarchy. A single contiguous block that includes the posting lists for the desired subtree of the hierarchy is then retrieved from the beginning location through the specified length.
    Type: Application
    Filed: August 29, 2008
    Publication date: May 21, 2009
    Applicant: POWERSET, INC.
    Inventors: Chad Walters, Giovanni Lorenzo Thione, Barney Pell, Lukas Biewald, Brendan O'Connor
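    Illustrative sketch (not taken from the application): the contiguous layout described in this abstract can be approximated by writing posting lists in preorder, so that every subtree of the role hierarchy occupies one unbroken run of storage. The role names, the `CHILDREN` table, and the helper names `layout` and `read_subtree` are assumptions made for illustration, and a Python list stands in for the physical storage medium. For brevity the sketch covers the posting lists of a single term.

```python
# Hypothetical role hierarchy: parent role -> child roles.
CHILDREN = {
    "participant": ["agent", "patient"],
    "patient": ["theme"],
}


def layout(root, postings):
    """Write posting lists in preorder so each subtree is one contiguous run.

    Returns the flat storage list and a directory that maps each role to
    (start offset, total length of the subtree rooted at that role).
    """
    storage = []
    directory = {}

    def visit(role):
        start = len(storage)
        storage.extend(postings.get(role, []))
        for child in CHILDREN.get(role, []):
            visit(child)
        # Length covers this role's postings plus all of its descendants'.
        directory[role] = (start, len(storage) - start)

    visit(root)
    return storage, directory


def read_subtree(storage, directory, role):
    """Fetch a whole subtree with a single contiguous slice."""
    start, length = directory[role]
    return storage[start:start + length]


postings = {"participant": [1], "agent": [2, 5], "patient": [3], "theme": [4, 7]}
storage, directory = layout("participant", postings)
print(read_subtree(storage, directory, "patient"))  # [3, 4, 7]
```

    Only a start offset and a length are needed per role, so reading a subtree is one slice rather than one lookup per descendant role.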
  • Publication number: 20090094019
    Abstract: Word sense probabilities are compressed for storage in a semantic index. Each word sense for a word is mapped to one of a number of “buckets” by assigning a bucket score to the word sense. A scoring function is utilized to assign the bucket scores that maximizes the entropy of the assigned bucket scores. Once the bucket scores have been assigned to the word senses, the bucket scores are stored in the semantic index. The bucket scores stored in the semantic index may be utilized to prune one or more of the word senses prior to construction of the semantic index. The bucket scores may also be utilized to prune and rank the word senses at the time a query is performed using the semantic index.
    Type: Application
    Filed: August 29, 2008
    Publication date: April 9, 2009
    Applicant: POWERSET, INC.
    Inventors: Rion Snow, Giovanni Lorenzo Thione, Scott A. Waterman, Chad Walters, Timothy Converse
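    Illustrative sketch (not taken from the application): the abstract does not say which scoring function is used. One simple assumption is rank-based, equal-frequency bucketing, which fills the buckets as evenly as possible and therefore maximizes the entropy of the assigned bucket scores. The function names `assign_buckets` and `entropy` and the sample probabilities are hypothetical.

```python
import math
from collections import Counter


def assign_buckets(probabilities, num_buckets):
    """Map each probability to a bucket score in [0, num_buckets) by rank."""
    n = len(probabilities)
    order = sorted(range(n), key=lambda i: probabilities[i])
    buckets = [0] * n
    for rank, i in enumerate(order):
        # Equal-frequency split: consecutive ranks share a bucket.
        buckets[i] = rank * num_buckets // n
    return buckets


def entropy(buckets):
    """Shannon entropy (in bits) of the bucket-score distribution."""
    counts = Counter(buckets)
    total = len(buckets)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


probs = [0.35, 0.01, 0.2, 0.02, 0.25, 0.03, 0.1, 0.04]
scores = assign_buckets(probs, 4)
print(scores)           # [3, 0, 2, 0, 3, 1, 2, 1]
print(entropy(scores))  # 2.0, the maximum for 4 buckets
```

    With 4 buckets each score fits in 2 bits, which is the kind of compression the abstract refers to; low-scoring senses can then be pruned by bucket score.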
  • Publication number: 20090089047
    Abstract: Technologies are described herein for probabilistically assigning weights to word senses and hypernyms of a word. The weights can be used in natural language processing applications such as information indexing and querying. A word hypernym weight (WHW) score can be determined by summing word sense probabilities of word senses from which the hypernym is inherited. WHW scores can be used to prune away hypernyms prior to indexing, to rank query results, and for other functions related to information indexing and querying. A semantic search technique can use WHW scores to retrieve an entry related to a word from an index in response to matching an indexed hypernym of the word with a query term applied to the index. More refined and accurate query results may be provided based on reduced user inputs.
    Type: Application
    Filed: August 29, 2008
    Publication date: April 2, 2009
    Applicant: POWERSET, INC.
    Inventors: Barney Pell, Rion Snow, Scott A. Waterman
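    Illustrative sketch (not taken from the application): the WHW score described in this abstract is the sum of the probabilities of the word senses from which a hypernym is inherited. The sense labels, the probabilities, and the helper names `word_hypernym_weights` and `prune` are assumptions made for illustration.

```python
from collections import defaultdict


def word_hypernym_weights(sense_probs, sense_hypernyms):
    """Sum, per hypernym, the probabilities of the senses that inherit it."""
    weights = defaultdict(float)
    for sense, prob in sense_probs.items():
        for hypernym in sense_hypernyms.get(sense, ()):
            weights[hypernym] += prob
    return dict(weights)


def prune(weights, threshold):
    """Drop hypernyms whose WHW score falls below the threshold."""
    return {h: w for h, w in weights.items() if w >= threshold}


# Hypothetical senses of the word "bank".
sense_probs = {"bank.financial": 0.75, "bank.river": 0.25}
sense_hypernyms = {
    "bank.financial": {"institution", "entity"},
    "bank.river": {"slope", "entity"},
}

whw = word_hypernym_weights(sense_probs, sense_hypernyms)
print(whw["entity"])            # 1.0
print(sorted(prune(whw, 0.5)))  # ['entity', 'institution']
```

    A hypernym shared by every sense ("entity") scores 1.0, while one reached only through an unlikely sense ("slope") scores low and is a candidate for pruning before indexing.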
  • Publication number: 20090077069
    Abstract: Tools and techniques related to calculating valence of expressions within documents. These tools may provide methods that include receiving input documents for processing, and extracting expressions from the documents for valence analysis, with scope relationships occurring between terms contained in the expressions. The methods may calculate valences of the expressions, based on the scope relationships between terms in the expressions.
    Type: Application
    Filed: August 29, 2008
    Publication date: March 19, 2009
    Applicant: POWERSET, INC.
    Inventors: Livia Polanyi, Martin Henk Van den Berg, Barney Pell
  • Publication number: 20090076799
    Abstract: Technologies are described herein for coreference resolution in an ambiguity-sensitive natural language processing system. Techniques for integrating reference resolution functionality into a natural language processing system can process documents to be indexed within an information search and retrieval system. Ambiguity awareness features, as well as ambiguity resolution functionality, can operate in coordination with coreference resolution. Annotation of coreference entities, as well as ambiguous interpretations, can be supported by in-line markup within text content or by external entity maps. Information expressed within documents can be formally organized in terms of facts, or relationships between entities in the text. Expansion can support applying multiple aliases, or ambiguities, to an entity being indexed so that all of the possible references or interpretations for that entity are captured into the index.
    Type: Application
    Filed: August 29, 2008
    Publication date: March 19, 2009
    Applicant: POWERSET, INC.
    Inventors: Richard Crouch, Martin Henk Van den Berg, Franco Salvetti, Giovanni Lorenzo Thione, David Ahn
  • Publication number: 20090070322
    Abstract: Computer-readable media and computer systems for conducting semantic processes to facilitate navigation of search results that include sets of tuples representing facts associated with content of documents in response to queries for information. Content of documents is accessed and semantic structures are derived by distilling linguistic representations from the content. Groups of two or more related words, called tuples, are extracted from the documents or the semantic structures. Tuples can be stored at a tuple index. Representations of the relational tuples are displayed in addition to documents retrieved in response to a query.
    Type: Application
    Filed: August 29, 2008
    Publication date: March 12, 2009
    Applicant: Powerset, Inc.
    Inventors: Franco Salvetti, Giovanni Lorenzo Thione, Richard S. Crouch, David Ahn, Lukas A. Biewald, Brendan O'Connor, Barney D. Pell
  • Publication number: 20090070308
    Abstract: Tools and techniques are described herein for checkpointing iterators during search. These tools may provide methods that include instantiating iterators in response to a search request. The iterators include fixed state information that remains constant over a life of the iterator, and further include dynamic state information that is updated over the life of the iterator. The iterators traverse through postings lists in connection with performing the search request. As the iterators traverse the posting lists, the iterators may update their dynamic state information. The iterators may then evaluate whether to create checkpoints, with the checkpoints including representations of the dynamic state information.
    Type: Application
    Filed: August 29, 2008
    Publication date: March 12, 2009
    Applicant: POWERSET, INC.
    Inventors: Chad Walters, Lukas Biewald, Nitay Joffe, Andrew Alan James
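    Illustrative sketch (not taken from the application): an iterator over a postings list that separates fixed state from dynamic state and periodically records checkpoints of the dynamic state. The class name `PostingIterator`, the `checkpoint_every` policy, and the `restore` method are assumptions; the abstract says only that iterators evaluate whether to create checkpoints, not how they decide.

```python
class PostingIterator:
    """Traverses one postings list and checkpoints its dynamic state."""

    def __init__(self, term, postings, checkpoint_every=2):
        # Fixed state: constant over the life of the iterator.
        self.term = term
        self.postings = postings
        self.checkpoint_every = checkpoint_every
        # Dynamic state: updated as the iterator advances.
        self.position = 0
        self.checkpoints = []

    def __iter__(self):
        return self

    def __next__(self):
        if self.position >= len(self.postings):
            raise StopIteration
        doc_id = self.postings[self.position]
        self.position += 1
        # Assumed policy: checkpoint after every N postings consumed.
        if self.position % self.checkpoint_every == 0:
            self.checkpoints.append({"position": self.position})
        return doc_id

    def restore(self, checkpoint):
        """Rewind to a saved checkpoint; only dynamic state is restored."""
        self.position = checkpoint["position"]


it = PostingIterator("dog", [3, 8, 15, 21, 40])
first_three = [next(it) for _ in range(3)]  # [3, 8, 15]
it.restore(it.checkpoints[0])               # back to position 2
resumed = next(it)                          # 15 again
```

    Because the fixed state never changes, a checkpoint needs to hold only the dynamic part, which keeps each checkpoint small.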
  • Publication number: 20090070298
    Abstract: Tools and techniques are described that relate to iterators for applying term occurrence-level constraints in natural language searching. These tools may receive a natural language input query, and define term occurrence-level constraints applicable to the input query. The methods may also identify facts requested in the input query, and may instantiate an iterator to traverse a fact index to identify candidate facts responsive to the input query. This iterator may traverse through at least a portion of the fact index. The methods may receive candidate facts from this iterator, with these candidate facts including terms, referred to as term-level occurrences. The methods may apply the term occurrence-level constraints to the term-level occurrences. The methods may select the candidate fact for inclusion in search results for the input query, based at least in part on applying the term occurrence-level constraint.
    Type: Application
    Filed: August 29, 2008
    Publication date: March 12, 2009
    Applicant: POWERSET, INC.
    Inventors: Giovanni Lorenzo Thione, Barney Pell, Chad Walters, Richard Crouch
  • Publication number: 20090063426
    Abstract: Methods and computer-readable media for associating words or groups of words distilled from content, such as reported speech or an attitude report, of a document to form semantic relationships collectively used to generate a semantic representation of the content are provided. Semantic representations may include elements identified or parsed from a text portion of the content, the elements of which may be associated with other elements that share a semantic relationship, such as an agent, location, or topic relationship. Relationships may also be developed by associating one element that is in relation to, or is about, another element, thereby allowing for rapid and effective comparison of associations found in a semantic representation with associations derived from queries. The semantic relationships may be determined based on semantic information, such as potential meanings and grammatical functions of each element within the text portion of the content.
    Type: Application
    Filed: August 29, 2008
    Publication date: March 5, 2009
    Applicant: POWERSET, INC.
    Inventors: Richard S. Crouch, Martin Henk Van den Berg, David Ahn, Olga Gurevich, Barney D. Pell, Livia Polanyi, Scott A. Prevost, Giovanni Lorenzo Thione
  • Publication number: 20090063472
    Abstract: Computer-readable media, computerized methods, and computer systems for conducting semantic processes to present search results that include highlighted regions which are relevant to a conceptual meaning of a query are provided. Initially, content of document(s) is accessed and semantic representations are derived by distilling linguistic representations from the content. These semantic representations may be stored at a semantic index. Also, a proposition is derived from the query by parsing search terms of the query, and distilling the proposition from the search terms. Typically, the proposition is a logical representation of the conceptual meaning of the query. The proposition is compared against the semantic representations at the semantic index to identify a matching set. Regions of the content within the document, from which the matching set of semantic representations are derived, are targeted.
    Type: Application
    Filed: August 29, 2008
    Publication date: March 5, 2009
    Applicant: Powerset, Inc., A Delaware Corporation
    Inventors: Barney Pell, Scott Prevost, Giovanni Lorenzo Thione, Brendan O'Connor, Lukas Biewald
  • Publication number: 20090063473
    Abstract: Methods, systems and computer readable media for finding documents in a data store that match a natural language query submitted by a user are provided. The documents and queries are matched by determining that words within the query have the same relationship to each other as the same words in the document. Documents are semantically analyzed and words in the document are indexed along with the role the word plays in a sentence. The initial semantic role may be generalized using a role hierarchy and stored in the index along with the original role. A similar analysis may be used with the search query to find words used in the same role in both the query and the document.
    Type: Application
    Filed: August 29, 2008
    Publication date: March 5, 2009
    Applicant: Powerset, Inc.
    Inventors: Martin Henk Van den Berg, Richard S. Crouch, Giovanni L. Thione, Chad P. Walters
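    Illustrative sketch (not taken from the application): indexing each word under its initial semantic role and under every generalization of that role, so that a query can match at any level of the role hierarchy. The hierarchy in `ROLE_PARENT`, the role names, and the function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical role hierarchy: role -> more general role.
ROLE_PARENT = {
    "agent": "subject",
    "subject": "participant",
    "patient": "object",
    "object": "participant",
}


def role_chain(role):
    """Return the role followed by all of its generalizations."""
    chain = [role]
    while chain[-1] in ROLE_PARENT:
        chain.append(ROLE_PARENT[chain[-1]])
    return chain


def index_document(doc_id, word_roles, index):
    """Store each word under its original role and every generalized role."""
    for word, role in word_roles:
        for generalized in role_chain(role):
            index[(word, generalized)].add(doc_id)


def search(query_word_roles, index):
    """Return documents in which every query word plays the queried role."""
    results = None
    for word, role in query_word_roles:
        docs = index.get((word, role), set())
        results = set(docs) if results is None else results & docs
    return results or set()


index = defaultdict(set)
index_document(1, [("dog", "agent"), ("cat", "patient")], index)  # dog chases cat
index_document(2, [("cat", "agent"), ("dog", "patient")], index)  # cat chases dog

print(search([("dog", "agent")], index))                    # {1}
print(search([("dog", "subject"), ("cat", "object")], index))  # {1}
print(search([("dog", "participant")], index))              # {1, 2}
```

    Both documents contain the same words, but only document 1 has "dog" in an agent-like role, which a plain keyword index could not distinguish.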
  • Publication number: 20090063550
    Abstract: Computer-readable media and a computer system for implementing a natural language search using fact-based structures and for generating such fact-based structures are provided. A fact-based structure is generated using a semantic structure, which represents information, such as text, from a document, such as a web page. Typically, a natural language parser is used to create a semantic structure of the information, and the parser identifies terms, as well as the relationship between the terms. A fact-based structure of a semantic structure allows for a linear structure of these terms and their relationships to be created, while also maintaining identifiers of the terms to convey the dependency of one fact-based structure on another fact-based structure. Additionally, synonyms and hypernyms are identified while generating the fact-based structure to improve the accuracy of the overall search.
    Type: Application
    Filed: August 29, 2008
    Publication date: March 5, 2009
    Applicant: POWERSET, INC.
    Inventors: Martin Henk Van den Berg, Daniel Bobrow, Robert D. Cheslow, Barney D. Pell, Giovanni Lorenzo Thione, Chad Waters