Inverted Lists (epo) Patents (Class 707/E17.086)
  • Patent number: 10387568
    Abstract: An unsupervised keyword extraction process is disclosed. A single input document can be analyzed to identify multiple candidate keywords by utilizing splitting terms. A keyword score is calculated for each of the candidate keywords. The keyword score for a particular candidate keyword is determined based on the length of the candidate keywords that contain the candidate keyword and the frequency of the words appearing in the candidate keywords. One or more keywords having the highest keyword scores are selected as the extracted keywords. The extracted keywords can be used in applications, such as refining search results, providing suggested search terms, or improving the match rate of a network page at a search engine.
    Type: Grant
    Filed: September 19, 2016
    Date of Patent: August 20, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Weiwei Cheng, Amanda Dee Bottorff, Sandeep Ranganathan
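The scoring idea in this abstract can be illustrated with a rough sketch. It is not the patented method: the splitting-term list, the degree-over-frequency word score (borrowed from RAKE-style extraction), and all function names are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny list of splitting terms (stop words).
SPLITTING_TERMS = {"a", "an", "and", "the", "of", "for", "is", "are", "in", "on", "to", "with"}

def candidate_keywords(text):
    """Split one document into candidate keywords at punctuation and splitting terms."""
    candidates = []
    for fragment in re.split(r"[.,;:!?()\n]+", text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9]+", fragment):
            if word in SPLITTING_TERMS:
                if current:
                    candidates.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            candidates.append(tuple(current))
    return candidates

def score_keywords(text):
    """Score each candidate from word frequency and the lengths of candidates containing each word."""
    candidates = candidate_keywords(text)
    frequency = defaultdict(int)  # how often a word appears across candidates
    degree = defaultdict(int)     # summed length of the candidates containing the word
    for candidate in candidates:
        for word in candidate:
            frequency[word] += 1
            degree[word] += len(candidate)
    # Assumed combination: word score = degree / frequency, candidate score = sum of word scores.
    word_score = {w: degree[w] / frequency[w] for w in frequency}
    return {" ".join(c): sum(word_score[w] for w in c) for c in candidates}

def extract_keywords(text, top_n=3):
    """Return the top_n candidates by score (ties broken alphabetically)."""
    scores = score_keywords(text)
    return sorted(scores, key=lambda k: (-scores[k], k))[:top_n]
```

Under these assumptions, longer candidates built from words that co-occur with many other words rank highest, which matches the abstract's description only loosely.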
  • Patent number: 9990442
    Abstract: Systems and methods for determining search results. The method may include receiving an at least partial search term, and identifying keywords based on the at least partial search term, wherein each keyword has an associated keyword measure based on the number of times each keyword has been previously searched for within a predetermined time period. For each keyword, search results associated with the keyword may be identified, wherein each result has an associated search measure. A relevance measure may be determined for each result using the keyword measure and the search measure, and used to provide at least one of the results as a search result for the at least partial search term.
    Type: Grant
    Filed: September 13, 2016
    Date of Patent: June 5, 2018
    Assignee: S.L.I. SYSTEMS, INC.
    Inventor: Shaun William Ryan
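A minimal sketch of how a keyword measure and a search measure might be combined is given below. The abstract does not state the combination rule, so prefix matching, the product of the two measures, and taking the maximum across keywords are all assumptions, as are the names `suggest`, `keyword_counts`, and `results_by_keyword`.

```python
def suggest(partial_term, keyword_counts, results_by_keyword, limit=5):
    """Rank results for a partial search term.

    keyword_counts: keyword -> number of recent searches (the keyword measure)
    results_by_keyword: keyword -> list of (result, search_measure) pairs
    """
    relevance = {}
    for keyword, keyword_measure in keyword_counts.items():
        # Assumption: a keyword matches when it starts with the partial term.
        if not keyword.startswith(partial_term):
            continue
        for result, search_measure in results_by_keyword.get(keyword, []):
            # Assumption: relevance is the product of the two measures.
            score = keyword_measure * search_measure
            relevance[result] = max(relevance.get(result, 0.0), score)
    return sorted(relevance, key=lambda r: (-relevance[r], r))[:limit]
```

With this rule, a popular keyword lifts all of its results, while the per-result search measure orders results within a keyword.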
  • Publication number: 20130275436
    Abstract: Various embodiments promote the discoverability of data that can be contained within a database. In one or more embodiments, data within a database is organized in a structure having a schema. The structure and data can be processed in a manner that renders one or more pseudo-documents each of which constitutes a sub-structure that can be indexed. Once produced and indexed, the pseudo-documents constitute a set of searchable objects each of which relationally points back to its associated structure within the database. Searches can now be performed against the pseudo-documents which, in turn, return a set of search results. The set of search results can include multiple sub-sets of pseudo-documents, each sub-set of which is associated with a different structure.
    Type: Application
    Filed: April 11, 2012
    Publication date: October 17, 2013
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Lev Novik, John C. Platt
  • Publication number: 20130097126
    Abstract: In response to a query having a search term, an inverted index that is defined on a set of attributes of a database structure is accessed, where the inverted index associates values of the set of attributes with corresponding references to rows of the database structure. It is determined whether any of the attributes in the set is in the search term. In response to determining that any of the attributes in the set is in the search term, the inverted index is used to produce an answer to the query.
    Type: Application
    Filed: October 17, 2011
    Publication date: April 18, 2013
    Inventor: D. Blair ELZINGA
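The idea of an inverted index defined on a set of attributes can be sketched as follows. This is an illustrative reading of the abstract, not the claimed implementation: rows are modeled as dictionaries, queries as equality predicates, and the function names are hypothetical.

```python
from collections import defaultdict

def build_attribute_index(rows, indexed_attributes):
    """Map each (attribute, value) pair to the set of row positions holding it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        for attribute in indexed_attributes:
            if attribute in row:
                index[(attribute, row[attribute])].add(row_id)
    return index

def answer_query(rows, index, indexed_attributes, predicates):
    """Answer an equality query; use the index only if it covers a queried attribute."""
    covered = [a for a in predicates if a in indexed_attributes]
    if covered:
        # Intersect the row references for every indexed attribute in the query.
        candidate_ids = set.intersection(
            *(index.get((a, predicates[a]), set()) for a in covered)
        )
    else:
        # No indexed attribute appears in the query: fall back to a full scan.
        candidate_ids = set(range(len(rows)))
    return [
        rows[i]
        for i in sorted(candidate_ids)
        if all(rows[i].get(a) == v for a, v in predicates.items())
    ]
```

The final filter re-checks every predicate, so attributes outside the index are still honored after the index narrows the candidates.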
  • Publication number: 20130007004
    Abstract: A tool for generating at least one search index for a composite document, wherein the composite document comprises multiple component documents. The search index is generated by extracting characters from the document, segregating the characters into tokens of one or more characters, and determining location information of the tokens. The location information can include the page number of the component document and X, Y page coordinates for the tokens. The tool also provides a user interface that allows for searching of the composite document using at least one of the generated indexes. The user interface allows the user to enter one or more search terms and to select the criteria that will be used during the search. Results are presented to the user via a list of document names that are also hyperlinks to the document. The results documents are listed in order of relevancy, and fragments of text that contain the searched terms are also available to the user, for each document.
    Type: Application
    Filed: June 30, 2011
    Publication date: January 3, 2013
    Applicant: Landon IP, Inc.
    Inventors: Krishmin RAI, George V. SHRECK
  • Publication number: 20120150867
    Abstract: Provided are techniques for creating an inverted index for features of a set of data elements, wherein each of the data elements is represented by a vector of features, wherein the inverted index, when queried with a feature, outputs one or more data elements containing the feature. The features of the set of data elements are ranked. For each feature in the ranked list, the inverted index is queried for data elements having the feature and not having any previously selected feature and a cluster of the data elements is created based on results returned in response to the query.
    Type: Application
    Filed: December 13, 2010
    Publication date: June 14, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Danish Contractor, Thomas Hampp-Bahnmueller, Sachindra Joshi, Raghuram Krishnapuram, Kenney Ng
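The clustering loop described in this abstract can be sketched roughly as below. The abstract does not say how features are ranked, so ranking by the number of elements containing a feature is an assumption, and the function names are hypothetical.

```python
from collections import defaultdict

def build_feature_index(elements):
    """Inverted index: feature -> set of element ids whose feature vector contains it."""
    index = defaultdict(set)
    for element_id, features in elements.items():
        for feature in features:
            index[feature].add(element_id)
    return index

def cluster_by_ranked_features(elements):
    """Create one cluster per ranked feature from elements not claimed by earlier features."""
    index = build_feature_index(elements)
    # Assumed ranking: most widespread feature first, ties broken by name.
    ranked = sorted(index, key=lambda f: (-len(index[f]), f))
    clusters = {}
    excluded = set()  # elements that have some previously selected feature
    for feature in ranked:
        members = index[feature] - excluded
        if members:
            clusters[feature] = sorted(members)
        excluded |= index[feature]
    return clusters
```

Each element lands in exactly one cluster, named after the highest-ranked feature it contains.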
  • Publication number: 20120109970
    Abstract: In response to a search query having a search term received from a client, a current language locale is determined. A state machine is built based on the current language locale, where the state machine includes one or more nodes to represent variance of the search term having identical meaning of the search term. Each node of the state machine is traversed to identify one or more postings lists of an inverted index corresponding to each node of the state machine. One or more item identifiers obtained from the one or more postings lists are returned to the client, where the item identifiers identify one or more files that contain the variance of the search term represented by the state machine.
    Type: Application
    Filed: October 27, 2010
    Publication date: May 3, 2012
    Applicant: APPLE INC.
    Inventors: John M. Hörnkvist, Eric R. Koebler
  • Patent number: 8171029
    Abstract: In one embodiment, generating an ontology includes accessing an inverted index that comprises inverted index lists for words of a language. An inverted index list corresponding to a word indicates pages that include the word. A word pair comprises a first word and a second word. A first inverted index list and a second inverted index list are searched, where the first inverted index list corresponds to the first word and the second inverted index list corresponds to the second word. An affinity between the first word and the second word is calculated according to the first inverted index list and the second inverted index list. The affinity describes a quantitative relationship between the first word and the second word. The affinity is recorded in an affinity matrix, and the affinity matrix is reported.
    Type: Grant
    Filed: October 1, 2008
    Date of Patent: May 1, 2012
    Assignee: Fujitsu Limited
    Inventors: David L. Marvit, Jawahar Jain, Stergios Stergiou, Yannis Labrou
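As a rough illustration of computing an affinity from two inverted index lists, the sketch below uses the Jaccard overlap of the two page sets. The abstract does not specify the affinity formula, so this choice, the whitespace tokenizer, and the function names are assumptions.

```python
def build_inverted_index(pages):
    """Map each word to the set of page ids that include it."""
    index = {}
    for page_id, text in pages.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(page_id)
    return index

def affinity(index, first_word, second_word):
    """Assumed affinity: shared pages divided by pages containing either word."""
    first_list = index.get(first_word, set())
    second_list = index.get(second_word, set())
    union = first_list | second_list
    return len(first_list & second_list) / len(union) if union else 0.0

def affinity_matrix(index, words):
    """Record the affinity of every ordered word pair."""
    return {(w1, w2): affinity(index, w1, w2) for w1 in words for w2 in words}
```

A directional measure (for example, shared pages divided by the pages of the first word only) would fit the same structure by changing the denominator.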
  • Publication number: 20110219008
    Abstract: A method and indexing system indexes the content of a body of documents into a content index, and the metadata of the documents into a metadata index which is a parallel index to the content index. The metadata is copied into a data store that is easily accessible by the indexing system and is stored in native form. The indexing system can dynamically re-index the metadata from the native metadata in the data store to produce a new metadata index which is used to replace the original metadata index. Search queries received by a search engine associated with the indexing system are applied to both the content and metadata index and the results are merged for return.
    Type: Application
    Filed: March 8, 2010
    Publication date: September 8, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: DAVID O. BEEN, MICHAEL BUSCH, OSAMU FURUSAWA, FREDERICK S. GRENNAN, FUMIHIKO TERUI, JUSTO L. PEREZ
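The parallel content and metadata indexes can be sketched as follows. This is a simplified illustration under stated assumptions: both indexes are plain in-memory token maps, re-indexing rebuilds the whole metadata index, results are merged by set union, and the class and method names are hypothetical.

```python
from collections import defaultdict

def build_index(texts):
    """Inverted index: lowercase token -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in texts.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

class ParallelIndex:
    def __init__(self, contents, metadata):
        # Copy of the metadata in native form, kept for later re-indexing.
        self.metadata_store = dict(metadata)
        self.content_index = build_index(contents)
        self.metadata_index = build_index(self._metadata_texts())

    def _metadata_texts(self):
        return {
            doc_id: " ".join(str(value) for value in fields.values())
            for doc_id, fields in self.metadata_store.items()
        }

    def update_metadata(self, doc_id, fields):
        """Change native metadata, then replace the metadata index with a fresh one."""
        self.metadata_store[doc_id] = fields
        self.metadata_index = build_index(self._metadata_texts())

    def search(self, term):
        """Apply the query to both indexes and merge the results."""
        term = term.lower()
        hits = self.content_index.get(term, set()) | self.metadata_index.get(term, set())
        return sorted(hits)
```

Because the content index is never touched by `update_metadata`, metadata changes do not force the document bodies to be re-indexed.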
  • Publication number: 20090119257
    Abstract: Techniques for searching a hierarchical database and an unstructured database with a single search query are described herein. In one embodiment, a single search query is received that has syntax identifying an unstructured search string within a structured search query to automatically cause a search of the inverted index and use of the result to automatically search the hierarchical database. The unstructured search string is extracted from the single search query and an inverted index is searched according to the unstructured search string, wherein the inverted index includes virtual documents created from data stored in the hierarchical database, wherein each virtual document includes a unique identifier from the hierarchical database used to designate the data in the hierarchical database from which that virtual document was created, wherein a result of the inverted index search includes the unique identifiers of the virtual documents that meet the search.
    Type: Application
    Filed: November 2, 2007
    Publication date: May 7, 2009
    Inventor: Christopher Waters
  • Publication number: 20090094262
    Abstract: In one embodiment, generating an ontology includes accessing an inverted index that comprises inverted index lists for words of a language. An inverted index list corresponding to a word indicates pages that include the word. A word pair comprises a first word and a second word. A first inverted index list and a second inverted index list are searched, where the first inverted index list corresponds to the first word and the second inverted index list corresponds to the second word. An affinity between the first word and the second word is calculated according to the first inverted index list and the second inverted index list. The affinity describes a quantitative relationship between the first word and the second word. The affinity is recorded in an affinity matrix, and the affinity matrix is reported.
    Type: Application
    Filed: October 1, 2008
    Publication date: April 9, 2009
    Applicant: Fujitsu Limited
    Inventors: David L. Marvit, Jawahar Jain, Stergios Stergiou, Yannis Labrou