Based On Term Frequency Of Appearance Patents (Class 707/750)
  • Patent number: 8185537
    Abstract: The present application discloses a method for monitoring abnormal state of Internet information. The method includes obtaining frequency data for current date common words appearing on the current date web pages, combining with a hot words dictionary that Internet users focus on to determine a list of current date keywords related to the Internet information, determining a weight of each current date keyword, determining an abnormal threshold of the current date keywords, and detecting an abnormal level of the current date keywords to determine current date hot Internet information. The disclosed method further calculates an abnormal level of keywords by monitoring the change in the hot words frequency in the Internet information, and generates a warning for the abnormal level of hot words frequency change, which allows the Internet users to respond at the first moment.
    Type: Grant
    Filed: April 24, 2008
    Date of Patent: May 22, 2012
    Assignee: Peking University
    Inventors: Xun Liang, Hua Chen, Jian Yang
  • Patent number: 8185526
    Abstract: A content-based re-ranking (CBR) process may be performed on query results based on a selected keyword that is extracted from previous query results, and thereby increase a relevancy of search results. A search engine may perform the CBR process using a target image that is selected from a plurality of image search results, the CBR to identify re-ranked image search results. Keywords may be extracted from the re-ranked image search results. A portion of the keywords may be outputted as suggested keywords and made selectable by a user. Finally, a refined CBR process may be performed based on the target image and a received selection of a suggested keyword, the refined CBR to output the refined image search results.
    Type: Grant
    Filed: January 21, 2010
    Date of Patent: May 22, 2012
    Assignee: Microsoft Corporation
    Inventors: Fang Wen, Jian Sun
  • Patent number: 8180772
    Abstract: An electronic data retrieving apparatus is provided that increases the retrieval accuracy without deteriorating the retrieval efficiency by reflecting differences between the numbers of word appearances due to genres of electronic data in the setting of the retrieval words. The electronic data retrieving apparatus according to the present invention sets the retrieval words of the electronic data not only as a word appearing on a retrieval word setting table of the recorded electronic data for a predetermined number of times (e.g., three times) or more but also a word appearing on the retrieval word setting table and appearing on a retrieval word setting reference table for a predetermined number of times (e.g., three times) or more.
    Type: Grant
    Filed: February 13, 2009
    Date of Patent: May 15, 2012
    Assignee: Sharp Kabushiki Kaisha
    Inventors: Hiroshi Murakami, Yoshio Nishimoto
  • Patent number: 8180779
    Abstract: A computer system and method for validating data object classification and consolidation using external references. The external references may be web pages, product catalogs, external databases, URLs, search results provided by a search engine or subsets or combinations of any of these to validate a classification or consolidation of records. Embodiments validate a data object classification or consolidation decision by searching external data sources, such as databases, the Internet etc. for references to the transactional data object and determining a confidence level based on the original data object and the unstructured information reference, URL, or search result for example. Decisions may be verified or denied based on the comparison of the external references related to each data object. Embodiments of the invention save substantial labor in validating business data objects and make data more reliable across enterprise systems.
    Type: Grant
    Filed: December 30, 2005
    Date of Patent: May 15, 2012
    Assignee: SAP AG
    Inventors: Yoram Horowitz, Avi Malamud
  • Patent number: 8176032
    Abstract: Systems and methods are disclosed to automatically publish data items associated with a news event. In one example embodiment, a method comprises monitoring search queries associated with a search query category, detecting a change in a search request frequency associated with the search query category with respect to a baseline frequency, determining an event associated with the search query category, identifying one or more data items associated with the event, and generating a visual representation of a relationship between the one or more data items and the event. The search query category may be associated with at least one search term and a baseline frequency.
    Type: Grant
    Filed: October 22, 2009
    Date of Patent: May 8, 2012
    Assignee: eBay Inc.
    Inventors: Dan Shen, Xiaodi Zhang, Qiang Wang, Helen Hang Ye, Jin Yu Lou
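The frequency-change step in the abstract above can be illustrated with a minimal sketch. The abstract does not say how a change is measured, so the ratio test, the `ratio_threshold` and `min_count` parameters, and the function name `detect_spikes` are all assumptions made for illustration, not the patented method.

```python
def detect_spikes(current_counts, baseline_counts, ratio_threshold=3.0, min_count=10):
    """Flag query categories whose current frequency far exceeds the baseline.

    Hypothetical helper: the ratio test and both thresholds are assumptions.
    """
    spikes = []
    for category, count in current_counts.items():
        if count < min_count:
            continue  # too few queries to call it a meaningful change
        baseline = baseline_counts.get(category, 0)
        ratio = count / max(baseline, 1)  # avoid division by zero for unseen categories
        if ratio >= ratio_threshold:
            spikes.append((category, ratio))
    # Largest relative change first; ties broken alphabetically
    return sorted(spikes, key=lambda item: (-item[1], item[0]))


current = {"world cup": 900, "laptops": 110, "rare": 5}
baseline = {"world cup": 100, "laptops": 100}
spikes = detect_spikes(current, baseline)
```

A category flagged this way would then be the starting point for the later steps (determining the event and identifying related data items), which are not sketched here.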
  • Publication number: 20120109978
    Abstract: Methods, systems, and apparatus, including computer program products, operable to perform operations including receiving through a user interface with an interface language a search query having query terms; using the interface language to select one or more mappings and using the selected mappings to simplify each query term; and applying each simplified query term to a synonyms map to identify possible synonyms with which to augment the search query. In alternative embodiments, the operations include generating a synonyms map from a corpus of documents; where the synonyms map maps each of multiple keys to one or more corresponding variants, where each variant is associated with one or more of document languages. In alternative embodiments, the operations include generating a synonyms map from documents by applying document language-dependent mappings to words in the documents to generate keys for the map.
    Type: Application
    Filed: January 12, 2012
    Publication date: May 3, 2012
    Applicant: GOOGLE INC.
    Inventor: Ruchira S. Datta
  • Publication number: 20120109976
    Abstract: The present invention relates to a method for assisting a user in making a decision to compare biometric data of an individual with data from a database relating to a large number of individuals, and biometric data is acquired for an individual concerned, that this data is encoded, that the data items are compared in pairs with corresponding data from the database, that, for each comparison score the duplicate occurrence frequency/non-duplicate occurrence frequency ratio is established, that the product of all the available ratios is calculated, that this product is standardized, that the standardized ratio is compared to a pre-set threshold, that the values greater than the pre-set threshold are kept and that this result is submitted to the user for him to validate it as appropriate.
    Type: Application
    Filed: November 2, 2006
    Publication date: May 3, 2012
    Applicant: THALES
    Inventor: Jean Beaudet
  • Publication number: 20120109977
    Abstract: Example embodiments relate to keyword determination based on a weight of meaningfulness. In example embodiments, a computing device may determine a number of occurrences of a word in a particular document and may then determine a weight of meaningfulness for the word based on the number of occurrences. The computing device may then add the word to a set of keywords for the document based on the weight of meaningfulness.
    Type: Application
    Filed: November 2, 2010
    Publication date: May 3, 2012
    Inventors: Helen Balinsky, Alexander Balinsky, Steven J. Simske
  • Patent number: 8171026
    Abstract: The invention provides a document representation method and a document analysis method including extraction of important sentences from a given document and/or determination of similarity between two documents. The inventive method detects terms that occur in the input document, segments the input document into document segments, each segment being an appropriately sized chunk and generates document segment vectors, each vector including as its element values according to occurrence frequencies of the terms occurring in the document segments. The method further calculates eigenvalues and eigenvectors of a square sum matrix in which a rank of the respective document segment vector is represented by R and selects from the eigenvectors a plurality (L) of eigenvectors to be used for determining the importance.
    Type: Grant
    Filed: April 16, 2009
    Date of Patent: May 1, 2012
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Takahiko Kawatani
  • Patent number: 8171043
    Abstract: Techniques are described to increase the diversity or focus of image search results. A user submits an original query to search for images. A server generates a first results set by executing the original query using metadata associated with each image. The server selects, from the first results set, a specified number of results ranked highest and generates a list of terms from the metadata of each of the results selected. The terms may be only the tags of the results. The server generates an updated query using terms in the list that may be weighted based on the frequency of the term in the list or include only a specified number of the highest occurring terms in the list. The server generates a second results set by executing the updated query using metadata associated with each image. The second results set is then stored and displayed to the user.
    Type: Grant
    Filed: October 24, 2008
    Date of Patent: May 1, 2012
    Assignee: Yahoo! Inc.
    Inventors: Vanessa Murdock, Roelof Van Zwol, Lluis Garcia Pueyo, Georgina Ramirez Camps
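The query-update step described in the abstract above can be sketched as follows. This is an illustrative sketch only: the result layout (a dict with a `"tags"` list), the function name `expand_query`, and the weighting (fraction of selected results that carry a tag) are assumptions, not details given in the abstract.

```python
from collections import Counter


def expand_query(original_terms, ranked_results, top_k=3, max_terms=2):
    """Build a weighted updated query from the tags of the top_k ranked results."""
    selected = ranked_results[:top_k]
    counts = Counter()
    for result in selected:
        counts.update(set(result["tags"]))  # count each tag once per result

    # Most frequent tags first; ties broken alphabetically for determinism
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    new_terms = [(t, c) for t, c in ranked if t not in original_terms][:max_terms]

    weighted = {term: 1.0 for term in original_terms}
    for term, count in new_terms:
        weighted[term] = count / len(selected)  # assumed weighting scheme
    return weighted


results = [
    {"tags": ["jaguar", "cat", "zoo"]},
    {"tags": ["jaguar", "cat"]},
    {"tags": ["jaguar", "car"]},
    {"tags": ["car", "car"]},  # ranked fourth, so ignored when top_k=3
]
updated = expand_query(["jaguar"], results, top_k=3, max_terms=2)
```

Here the updated query keeps the original term at full weight and adds "cat" and "car" with weights proportional to how often they appear among the selected results.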
  • Patent number: 8171031
    Abstract: Technologies are described herein for providing a more efficient approach to ranking search results. An illustrative technology reduces an amount of ranking data analyzed at query time. In the technology, a term is selected, at index time, from a master index. The term corresponds to a number of documents greater than a threshold. A set of documents that includes the term is selected based on the master index. A rank is determined for each document in the set of documents that contains the term. Each document in the set of documents that contains the term is assigned to a top document list or a bottom document list based on the rank. Predefined values of at least part of the rank are stored in the top document list for documents in the top document list and are not stored in the bottom document list for documents in the bottom document list.
    Type: Grant
    Filed: January 19, 2010
    Date of Patent: May 1, 2012
    Assignee: Microsoft Corporation
    Inventors: Vladimir Tankovich, Dmitriy Meyerzon, Mihai Petriuc
  • Patent number: 8166050
    Abstract: A method includes processing a performance query to a dimensional data model by processing dimension coordinates that exist within the dimensional data model, wherein the dimension coordinates have a first particular grain (“finer grain”) that is finer than a second particular grain (“coarser grain”), the method to determine an evaluative score for a particular finer grain value based on performance facts for dimension coordinates associated with the particular finer grain value. Performance parameters are determined relative to a particular coarser grain value, against which to measure the performance facts associated with the finer grain value, including processing the temporal relationships of finer grain values to coarser grain values for the dimension coordinates. The evaluative score is determined for the particular finer grain value based on performance facts of dimension coordinates having the particular finer grain value, in view of the determined performance parameters.
    Type: Grant
    Filed: February 8, 2011
    Date of Patent: April 24, 2012
    Assignee: Merced Systems, Inc
    Inventor: Todd O. Dampier
  • Patent number: 8166051
    Abstract: An improved entropy-based term dominance metric is useful for characterizing a corpus of text documents and for comparing the term dominance metrics of a first corpus of documents to a second corpus having a different number of documents.
    Type: Grant
    Filed: February 3, 2009
    Date of Patent: April 24, 2012
    Assignee: Sandia Corporation
    Inventors: Travis L. Bauer, Zachary O. Benz, Stephen J. Verzi
  • Patent number: 8166049
    Abstract: Keyword frequency data for a plurality of document-derived segments is represented in a matrix form in which each segment is represented as a vector of dimensionality equal to the number of keywords. The matrix may be subdivided into a plurality of sub-matrices, each preferably corresponding to a non-overlapping portion of the plurality of keywords. When determining a similarity measurement between any pair of segments, at least a portion of the keyword frequency data for each sub-matrix's non-overlapping keywords are used to determine a sub-matrix dot product for the pair of segments. The resulting plurality of sub-matrix dot products are then summed together in order to provide the similarity measurement. Keywords that are synonyms of each other may be accommodated through the modification of keyword frequency data. Where the keyword frequency data in the matrix representation is relatively sparse, compressed views of the matrix representation may be provided.
    Type: Grant
    Filed: May 28, 2009
    Date of Patent: April 24, 2012
    Assignee: Accenture Global Services Limited
    Inventors: Jagadeesh Chandra Bose Rantham Prabhakara, Ashwin Nayak, Anitha Chandran
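The summed sub-matrix dot product in the abstract above can be illustrated with a short sketch. It assumes dense Python lists for the keyword-frequency vectors and equal-width keyword blocks; the function name `block_dot_similarity` and the `block_size` parameter are hypothetical, and the synonym handling and compressed sparse views mentioned in the abstract are not covered.

```python
def block_dot_similarity(seg_a, seg_b, block_size):
    """Sum per-block dot products of two keyword-frequency vectors.

    Each block covers a non-overlapping range of keyword indices, so the sum
    of the block dot products equals the dot product of the full vectors.
    """
    if len(seg_a) != len(seg_b):
        raise ValueError("segments must cover the same keyword set")
    total = 0
    for start in range(0, len(seg_a), block_size):
        a_block = seg_a[start:start + block_size]
        b_block = seg_b[start:start + block_size]
        total += sum(x * y for x, y in zip(a_block, b_block))
    return total


segment_a = [1, 0, 2, 3, 0]
segment_b = [2, 1, 0, 1, 4]
similarity = block_dot_similarity(segment_a, segment_b, block_size=2)
```

Because the blocks do not overlap, each block's partial result can be computed independently, which is what makes the subdivision useful for large keyword sets.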
  • Patent number: 8165997
    Abstract: A method for classifying a previously unclassified posting that includes extracting a plurality of terms from the previously unclassified posting on an application forum, calculating a term answer probability and a term comment probability for each term of the plurality of terms. The term answer probability defines a probability that the term is in an answer posting assigned to an answer class, and the term comment probability defines a probability that the term is in a comment posting assigned to a comment class. The method further includes performing a Bayesian analysis using the term answer probability and the term comment probability for each term of the plurality of terms to select a posting class for the previously unclassified posting. The posting class is either the answer class or the comment class. The posting class is assigned to the previously unclassified posting.
    Type: Grant
    Filed: July 27, 2009
    Date of Patent: April 24, 2012
    Assignee: Intuit Inc.
    Inventors: Igor A. Podgorny, Howard Chen, Floyd J. Morgan, Amit Rohatgi
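The Bayesian step in the abstract above resembles a two-class naive Bayes classifier. The sketch below is written under that assumption; the add-one smoothing, the function names `train` and `classify`, and the toy training postings are illustrative choices, not details from the abstract. It also assumes the training data contains at least one posting of each class.

```python
import math
from collections import Counter

CLASSES = ("answer", "comment")


def train(postings):
    """postings: iterable of (terms, label) pairs with label in CLASSES."""
    term_counts = {label: Counter() for label in CLASSES}
    class_counts = Counter()
    for terms, label in postings:
        class_counts[label] += 1
        term_counts[label].update(terms)
    vocab = set(term_counts["answer"]) | set(term_counts["comment"])
    return term_counts, class_counts, vocab


def classify(terms, model):
    """Pick the class with the higher log posterior for the given terms."""
    term_counts, class_counts, vocab = model
    total_postings = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in CLASSES:
        score = math.log(class_counts[label] / total_postings)  # class prior
        denom = sum(term_counts[label].values()) + len(vocab)
        for term in terms:
            # Add-one smoothing keeps unseen terms from zeroing the probability
            score += math.log((term_counts[label][term] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train([
    (["try", "reinstalling", "the", "driver"], "answer"),
    (["you", "should", "update", "the", "driver"], "answer"),
    (["thanks", "that", "worked"], "comment"),
    (["same", "problem", "here", "thanks"], "comment"),
])
label = classify(["update", "driver"], model)
```

Working in log space avoids numeric underflow when a posting contains many terms.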
  • Patent number: 8161028
    Abstract: A system, method and computer program product provides a solution to a class of categorization problems using a semi-supervised clustering approach, the method employing performing a Soft Seeded k-means algorithm, which makes effective use of the side information provided by seeds with a wide range of confidence levels, even when they do not provide complete coverage of the pre-defined categories. The semi-supervised clustering is achieved through the introductions of a seed re-assignment penalty measure and model selection measure.
    Type: Grant
    Filed: December 5, 2008
    Date of Patent: April 17, 2012
    Assignee: International Business Machines Corporation
    Inventors: Jianying Hu, Aleksandra Mojsilovic, Moninder Singh
  • Patent number: 8161041
    Abstract: One embodiment of the present invention provides a system that automatically generates synonyms for words from documents. During operation, this system determines co-occurrence frequencies for pairs of words in the documents. The system also determines closeness scores for pairs of words in the documents, wherein a closeness score indicates whether a pair of words are located so close to each other that the words are likely to occur in the same sentence or phrase. Finally, the system determines whether pairs of words are synonyms based on the determined co-occurrence frequencies and the determined closeness scores. While making this determination, the system can additionally consider correlations between words in a title or an anchor of a document and words in the document as well as word-form scores for pairs of words in the documents.
    Type: Grant
    Filed: February 10, 2011
    Date of Patent: April 17, 2012
    Assignee: Google Inc.
    Inventors: Oleksandr Grushetskyy, Steven D. Baker
  • Patent number: 8156128
    Abstract: In embodiments of the present invention improved capabilities are described for displaying mobile content in association with a website on a mobile communication facility based at least in part on receiving a website request from a mobile carrier gateway, receiving contextual information relating to the requested website, associating the received contextual information with a mobile content, and, finally, displaying the mobile content with the website on a mobile communication facility.
    Type: Grant
    Filed: June 12, 2009
    Date of Patent: April 10, 2012
    Assignee: Jumptap, Inc.
    Inventors: Jorey Ramer, Dennis Doughty, Adam Soroca
  • Patent number: 8150829
    Abstract: According to certain embodiments, facilitating display of terms includes facilitating display of a graphical user interface. One or more first input terms entered into a user entry window of the graphical user interface are received. One or more first output terms related to the first input terms are determined. Display of a first graphical cloud comprising the first output terms is facilitated. The first input terms are modified to yield one or more second input terms. One or more second output terms related to the second input terms are determined. Display of a second graphical cloud comprising the second output terms is facilitated.
    Type: Grant
    Filed: April 7, 2009
    Date of Patent: April 3, 2012
    Assignee: Fujitsu Limited
    Inventors: Yannis Labrou, Stergios Stergiou, David L. Marvit, Albert Reinhardt
  • Patent number: 8150860
    Abstract: One or more server devices may simultaneously calculate first ranking scores for a group of users and second ranking scores for a group of comments authored by the group of users. The calculating may occur during a same process. The one or more server devices may further provide one of a first ranked list that includes information identifying the group of users, the information identifying the group of users being ordered based on the first ranking scores, or a second group of comments of the group of comments, the comments in the second group of comments being ordered based on the second ranking scores.
    Type: Grant
    Filed: August 12, 2009
    Date of Patent: April 3, 2012
    Assignee: Google Inc.
    Inventors: Michal Cierniak, Na Tang
  • Patent number: 8145636
    Abstract: Systems, methods and program products for classifying text. A system classifies text into first subject matter categories. The system identifies one or more second subject matter categories in a collection of second subject matter categories, each of the second categories is a hierarchical classification of a collection of confirmed valid search results for queries, in which at least one query for each identified second category includes a term in the text. The system filters the identified categories by excluding identified categories whose ancestors are not among the first categories. The system selects categories from the filtered categories based on one or more thresholds in which a threshold specifies a degree of relatedness between a selected category and the text. The selected categories are a sufficient basis for recommending content to a user, the content being associated with one or more of the selected categories.
    Type: Grant
    Filed: March 13, 2009
    Date of Patent: March 27, 2012
    Assignee: Google Inc.
    Inventors: Glen M. Jeh, Beverly Yang
  • Patent number: 8145618
    Abstract: A system and method for scoring documents is described. One or more documents are identified responsive to a search criteria. A text match score indicating a quality of match of the identified documents is determined. A category match score is determined over categories. A document-categories score is determined indicating a quality of match between an identified document and a plurality of categories. A search criteria-categories score is determined indicating a quality of match between the search criteria and the categories. An overall score is determined based on the text match score and the category match score.
    Type: Grant
    Filed: October 11, 2010
    Date of Patent: March 27, 2012
    Assignee: Google Inc.
    Inventors: Karl Pfleger, Brian Larson
  • Patent number: 8145649
    Abstract: A system for selecting electronic advertisements from an advertisement pool to match the surrounding content is disclosed. To select advertisements, the system takes an approach to content match that takes advantage of machine translation technologies. The system of the present invention implements this goal by means of simple and efficient machine translation features that are extracted from the surrounding context to match with the pool of potential advertisements. Machine translation features are used as features for training a machine learning model. In one embodiment, a ranking SVM (Support Vector Machine) is trained to identify advertisements relevant to a particular context. The trained machine learning model can then be used to rank advertisements for a particular context by supplying the machine learning model with the machine translation feature measures for the advertisements and the surrounding context.
    Type: Grant
    Filed: December 16, 2010
    Date of Patent: March 27, 2012
    Assignee: Yahoo! Inc.
    Inventors: Vanessa Murdock, Massimiliano Ciaramita, Vassilis Plachouras
  • Publication number: 20120072434
    Abstract: An information retrieval apparatus includes an acquiring unit that acquires a numerical value defining a boundary of a numerical range; a detecting unit that detects a number of places in and a head numeral of the numerical value; an extracting unit that extracts from a bit string group, a bit string indicating whether a numerical value in a numerical value group having the number of places and the head numeral is present in files subject to retrieval; a specifying unit that specifies a file corresponding to a bit in the extracted bit string, the bit indicating the presence of a numerical value of the numerical value group; a determining unit that determines whether a numerical value in the specified file meets the boundary condition; and a designating unit that, based on a determination by the determining unit designates the specified file to have a numerical value within the numerical range.
    Type: Application
    Filed: November 30, 2011
    Publication date: March 22, 2012
    Applicant: FUJITSU LIMITED
    Inventors: Masahiro Kataoka, Hiroyuki Torii, Masahiro Kurishima, Hideo Kasai
  • Patent number: 8140511
    Abstract: Embodiments of methods and apparatuses for searching contents, including structured search are described herein. Embodiments of the present invention use tree structures (or more generally, graph structures), layout structures, and/or content category information to capture within search results relevant content that would otherwise be missed, to reduce the incidence of false positives within search results, and to improve the accuracy of rankings within search results. Embodiments of the present invention further use tree structures (or more generally, graph structures), layout structures, and/or content category information to extend search results to include sub-document constituents. Embodiments of the present invention also support the use of distribution properties as criteria for ranking search results.
    Type: Grant
    Filed: April 10, 2009
    Date of Patent: March 20, 2012
    Assignee: Zalag Corporation
    Inventor: Samuel S. Epstein
  • Patent number: 8140526
    Abstract: A system is described for assessing information in natural language contents. A user interface receives an object name as a query term and a value for a customized ranking parameter from a user. A computer storage stores an object-specific data set related to the object name, wherein the object-specific data set includes a plurality of property names and association-strength values. A computer processing system can count a first frequency of a first property name and count a second frequency of a second property name in a document containing text in a natural language, calculate a relevance score as a function of the first frequency and the second frequency, and rank the plurality of documents using their respective relevance scores, and return one or more documents to the user based on the ranking of the plurality of documents. The function is in part defined by the customized ranking parameter.
    Type: Grant
    Filed: February 3, 2010
    Date of Patent: March 20, 2012
    Inventor: Guangsheng Zhang
  • Patent number: 8140514
    Abstract: A method of automatically classifying defects. The method generally includes the steps of (A) receiving information for a current defect, (B) extracting field values from the current defect, (C) counting a number of occurrences of one or more keywords in the current defect, (D) determining one or more new keywords occurring in the current defect and storing the one or more new keywords in a database and (E) creating one or more linkages in the database between a first record corresponding to the current defect and one or more second records corresponding to previous defects based upon one or more similarities between the first and the second records.
    Type: Grant
    Filed: November 26, 2008
    Date of Patent: March 20, 2012
    Assignee: LSI Corporation
    Inventors: Khanh Nguyen, Seonmi Anderson, Michael L. Peterson
  • Patent number: 8140337
    Abstract: Disclosed is an apparatus that includes a text input device that inputs text data provided with confidence measure, as subject for mining, a language processing unit that performs language analysis of the input text data provided with the confidence measures, a confidence measure exploiting characteristic word count unit that counts the characteristic words in the input text to provide a count result and that exploits the statistical information and the confidence measures provided in the input text to correct the count result obtained, a characteristic measure calculation unit that calculates the characteristic measure of each characteristic word from the corrected count result, a mining result output device that outputs the characteristic measure of each characteristic word obtained, a user operation input device for a user to input setting for language processing of the input text and setting for a technique for calculating the characteristic measure being found, a mining process management unit that transmits
    Type: Grant
    Filed: July 18, 2007
    Date of Patent: March 20, 2012
    Assignee: NEC Corporation
    Inventors: Satoshi Nakazawa, Satoshi Morinaga
  • Patent number: 8135721
    Abstract: A system is described for discovering query intent based on search queries and concept networks. The system may construct frequency vectors from log data corresponding to a submitted query and at least one related query submitted to one or more search engines. The system may also construct a query intent vector based on the frequency vectors. The query intent vector may include frequency scores that represent the intent of the query.
    Type: Grant
    Filed: October 21, 2010
    Date of Patent: March 13, 2012
    Assignee: Yahoo! Inc.
    Inventors: Deepa B. Joshi, John J. Thrall
  • Patent number: 8135720
    Abstract: An apparatus for controlling devices for searching homology of queries in a base sequence in parallel, includes: a memory for storing a base sequence and an appearing frequency of each of first strings each having a fixed length appearing in the base sequence; and a processor for executing a process including: obtaining queries for searching homology in the base sequence; retrieving each of second strings each having a longer fixed length than that of the first strings and partially appearing in each of the queries; determining an approximate appearing frequency of each of the second strings on the basis of the appearing frequency of the first strings; evaluating for each of the query sequences a load of task for searching homology; and allocating each task for searching homology for each of the queries among the devices on the basis of the result of evaluation of the load of the each task.
    Type: Grant
    Filed: November 2, 2009
    Date of Patent: March 13, 2012
    Assignee: Fujitsu Limited
    Inventor: Akira Naruse
  • Patent number: 8131735
    Abstract: Methods and systems for rapid automatic keyword extraction for information retrieval and analysis. Embodiments can include parsing words in an individual document by delimiters, stop words, or both in order to identify candidate keywords. Word scores for each word within the candidate keywords are then calculated based on a function of co-occurrence degree, co-occurrence frequency, or both. Based on a function of the word scores for words within the candidate keyword, a keyword score is calculated for each of the candidate keywords. A portion of the candidate keywords are then extracted as keywords based, at least in part, on the candidate keywords having the highest keyword scores.
    Type: Grant
    Filed: September 9, 2009
    Date of Patent: March 6, 2012
    Assignee: Battelle Memorial Institute
    Inventors: Stuart J Rose, Wendy E Cowley, Vernon L Crow, Nicholas O Cramer
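The extraction steps in the abstract above can be sketched as follows. The abstract only says that word scores are a function of co-occurrence degree, co-occurrence frequency, or both; the degree-to-frequency ratio used here is one such function chosen for illustration. The stop-word list, the delimiter pattern, and the function name `extract_keywords` are likewise assumptions.

```python
import re
from collections import defaultdict

STOP_WORDS = {"a", "and", "is", "of", "the"}  # tiny illustrative list


def extract_keywords(text, stop_words=STOP_WORDS, top_n=2):
    """Return the top_n (candidate keyword, score) pairs for one document."""
    # 1. Split on delimiters, then on stop words, to get candidate keywords
    candidates = []
    for fragment in re.split(r"[.,;:!?()\n]+", text.lower()):
        phrase = []
        for word in fragment.split():
            if word in stop_words:
                if phrase:
                    candidates.append(tuple(phrase))
                phrase = []
            else:
                phrase.append(word)
        if phrase:
            candidates.append(tuple(phrase))

    # 2. Word scores from co-occurrence degree and frequency
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in candidates:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences within the candidate, self included
    word_score = {word: degree[word] / freq[word] for word in freq}

    # 3. Keyword score is the sum of its member word scores
    scores = {" ".join(p): sum(word_score[w] for w in p) for p in set(candidates)}
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


text = "Keyword extraction is fast. Automatic keyword extraction of documents is useful."
keywords = extract_keywords(text)
```

With the degree-to-frequency ratio, words that mostly appear inside longer candidates score higher, so multi-word candidates tend to outrank single words.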
  • Patent number: 8131716
    Abstract: Determining a relevancy ranking score is disclosed. An indication is received that a relevancy ranking score algorithm is to be tuned to a selected preference. The relevancy ranking score algorithm is updated based at least in part on the selected preference, wherein the relevancy ranking score of a search result resulting from a search query is based at least in part on one or more constraints of the search query.
    Type: Grant
    Filed: July 12, 2010
    Date of Patent: March 6, 2012
    Assignee: EMC Corporation
    Inventors: Pierre-Yves Chevalier, Bruno Roustant
  • Publication number: 20120054206
    Abstract: A computer-implemented system and process for generating a relationship network is disclosed. The system provides a set of data items to be related and generates variable length data vectors to represent the relationships between the terms within each data item. The system can be used to generate a relationship network for documents, images, or any other type of file. This relationship network can then be queried to discover the relationships between terms within the set of data items.
    Type: Application
    Filed: July 25, 2011
    Publication date: March 1, 2012
    Applicant: The Regents of the University of California
    Inventors: Kasian Franks, Cornelia A. Myers, Raf M. Podowski
  • Patent number: 8117215
    Abstract: A query-centric system and process for distributing reverse indices for a distributed content system. Relevance ranking techniques in organizing distributed system indices. Query-centric configuration subprocesses (1) analyze query data, partitioning terms for reverse index server(s) (RIS), (2) distribute each partitioned data set by generally localizing search terms for the RIS that have some query-centric correlation, and (3) generate and maintain a map for the partitioned reverse index system terms by mapping the terms for the reverse index to a plurality of different index server nodes. Indexing subprocess element builds distributed reverse indices from content host indices. Routines of the query execution use the map derived in the configuration to more efficiently return more relevant search results to the searcher.
    Type: Grant
    Filed: September 24, 2010
    Date of Patent: February 14, 2012
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: George H. Forman, Zhichen Xu
  • Patent number: 8117214
    Abstract: The present invention provides a music artist retrieval system which makes it possible for users to automatically retrieve an unknown music artist similar to the user's favorite artist while actually reproducing and confirming a piece of music of the unknown artist. A music artist similarity map storing section (13) computes a plurality of similarities for a plurality of music artists and makes a music artist similarity map for the plurality of music artists based on the plurality of similarities, then stores the music artist similarity map. Here, the similarities are computed between one of the plurality of music artists and the other music artists based on features of the respective music artists. A similar artists selecting and displaying section (17) displays on a display a plurality of indications related to one music artist and two or more music artists whose similarities are close to the one music artist, based on the music artist similarity map.
    Type: Grant
    Filed: October 5, 2007
    Date of Patent: February 14, 2012
    Assignee: National Institute of Advanced Industrial Science and Technology
    Inventors: Elias Pampalk, Masataka Goto
  • Patent number: 8108410
    Abstract: A mechanism for determining the veracity of data in a repository. Responsive to receiving a search query from a user, a semantic network is created from the documents in the repository. A determination is made as to whether data from a first document in the semantic network conflicts with data from a second document in the semantic network. If a conflict exists, a determination is made as to whether the data from the first document is obsolete in comparison to data from the second document. If the data from the first document is obsolete in comparison to data from the second document, a portion of the first document corresponding to the obsolete data is automatically annotated with the data from the second document to form an annotated first document. A search result list is then provided to the user comprising the second document and the annotated first document.
    Type: Grant
    Filed: August 6, 2008
    Date of Patent: January 31, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ann Margaret Strosaker, Michael Thomas Strosaker
  • Patent number: 8108406
    Abstract: Computer based systems, methods, software and databases are presented in which correlations between web item preferences, behaviors and pangenetic (genetic and epigenetic) attributes of individuals are used for pangenetic based user behavior prediction in which predictions of a user's online behavior can be generated based on the user's pangenetic makeup. Data masking can be used to maintain privacy of sensitive portions of the pangenetic data.
    Type: Grant
    Filed: December 30, 2008
    Date of Patent: January 31, 2012
    Assignee: Expanse Networks, Inc.
    Inventors: Andrew Alexander Kenedy, Charles Anthony Eldering
  • Patent number: 8108407
    Abstract: An information retrieval apparatus, which can present to a user only a related word matching a user search intent, includes: an associative dictionary storage unit for storing words included in plural pieces of text to be searched and relevance degrees between the words; an appearance frequency storage unit for storing an appearance frequency that is the number of pieces of text in which the words stored in the associative dictionary storage unit appear, among the plural pieces of text to be searched; and a related word obtaining unit that obtains a related word to be presented to the user, from the relevance degree between the search word entered by the user and another word among the words, the appearance frequency, and the user search intent.
    Type: Grant
    Filed: November 6, 2007
    Date of Patent: January 31, 2012
    Assignee: Panasonic Corporation
    Inventors: Takashi Tsuzuki, Kenji Mizutani, Kazutoyo Takata, Satoshi Matsuura
  • Patent number: 8108409
    Abstract: Embodiments of the present invention pertain to determining top combinations of items to present to a user. According to one embodiment, data that includes information describing a plurality of combinations of records is accessed. Each record describes a plurality of items. The data is analyzed using a branch and bound search procedure to determine top combinations of items based on a specified metric and a specified number. According to one embodiment, the metric is value enabled and the specified number determines how many combinations of items are associated with the top combinations of items.
    Type: Grant
    Filed: July 19, 2007
    Date of Patent: January 31, 2012
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Julie W. Drew, Juan Antonio R. Garay, Krishna Venkatraman
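    The branch and bound search described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented procedure: the function name `top_combinations`, the fixed combination size, and the metric (the summed non-negative value of every record containing all items of a combination) are hypothetical choices that the abstract does not specify. Under that metric, adding an item can never raise the score, so the score of a partial combination serves as an upper bound for every extension of it.

```python
import heapq


def top_combinations(records, size, n):
    """Return the n best item combinations of a given size.

    records: list of (set_of_items, value) pairs with value >= 0 (assumed).
    Metric (assumed): total value of the records containing every item
    of the combination.
    """
    items = sorted({item for record_items, _ in records for item in record_items})
    best = []  # min-heap of (metric, combination); best[0] is the weakest kept

    def metric(combo):
        return sum(value for record_items, value in records if combo <= record_items)

    def search(combo, start):
        score = metric(combo)
        # Bound: no extension of combo can score higher than combo itself,
        # so stop if it cannot beat the weakest combination kept so far.
        if len(best) == n and score <= best[0][0]:
            return
        if len(combo) == size:
            entry = (score, tuple(sorted(combo)))
            if len(best) < n:
                heapq.heappush(best, entry)
            else:
                heapq.heapreplace(best, entry)
            return
        # Branch: extend the combination with each remaining item in order.
        for index in range(start, len(items)):
            search(combo | {items[index]}, index + 1)

    search(frozenset(), 0)
    return sorted(best, key=lambda entry: (-entry[0], entry[1]))
```

    For example, with records `[({"a", "b", "c"}, 5), ({"a", "b"}, 3), ({"b", "c"}, 2), ({"a", "c"}, 1)]`, a size of 2, and n = 2, the pairs score 8 (a, b), 7 (b, c), and 6 (a, c), so the first two are returned.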
  • Patent number: 8103679
    Abstract: A system and method are described for receiving a plurality of non-standardized data sets and generating a respective plurality of standardized profiles that can be used for efficiently comparing and matching one profile against the other plurality of profiles. One application of this invention is to convert job seekers' resumes and job postings into respective profiles and then permitting either a job seeker to search for job postings that most closely match the job seeker's resume or, conversely, permitting an employer to search for job seekers whose resumes most closely match the employer's job posting.
    Type: Grant
    Filed: August 8, 2007
    Date of Patent: January 24, 2012
    Assignee: CareerBuilder, LLC
    Inventors: Andrew B. Cranfill, Jason Elliott
  • Publication number: 20120011132
    Abstract: A method of preparing data for analysis, comprising the steps of receiving an initial data set including a plurality of records, each of the plurality of records including an identifier attribute and an associative attribute that identifies a further one or more records; receiving the further one or more records identified by the associative attribute in each of the plurality of records; and associating the further one or more records with the initial data set to form a final data set.
    Type: Application
    Filed: July 8, 2011
    Publication date: January 12, 2012
    Applicant: Patent Analytics Holding Pty Ltd
    Inventor: Doris Spielthenner
  • Patent number: 8095533
    Abstract: Disclosed are methods and systems for automatically assigning index terms to electronic documents such as Web pages or sites in a manner which may be used to facilitate the retrieval of electronic documents of interest. The method involves determining co-occurrences of terms in other documents with the electronic document, and selecting terms as index terms based upon those co-occurrences. The method permits the efficient retrieval of electronic documents.
    Type: Grant
    Filed: November 9, 2004
    Date of Patent: January 10, 2012
    Assignee: Apple Inc.
    Inventor: Jay Michael Ponte
  • Patent number: 8095529
    Abstract: A method and system for ranking relevancy of metadata associated with media on a computer network, such as multimedia and streaming media, include categorizing the metadata into sets of metadata. The categories are broad categories relating to areas such as who, what, when, and where, such as artist, media type, creation date, and creation location. Weights are assigned to each set of metadata. Weights are related to technical information such as bit rate, duration, sampling rate, frequency of occurrence of a specific term, etc. A score is calculated for ranking the relevancy of each set of metadata. The score is calculated in accordance with the assigned weight and category. This score is available for search systems (e.g., search engines) and/or users to determine the relative ranking of search results.
    Type: Grant
    Filed: January 4, 2005
    Date of Patent: January 10, 2012
    Assignee: AOL Inc.
    Inventors: Theodore George Diamond, Daniel Allen Hendrick, Eric Carl Rehm, Melissa Anne Riesland
  • Patent number: 8095544
    Abstract: In a method for validating data, a text of a document is received. At least one fact is extracted from the text. At least one expert refinement is merged with the at least one fact to create at least one modified fact. The at least one modified fact is provided for a review. An expert refinement to the at least one modified fact is captured in response to the review. A superset document based on the at least one pre-existing refinement and the expert refinement is stored.
    Type: Grant
    Filed: May 30, 2003
    Date of Patent: January 10, 2012
    Assignee: Dictaphone Corporation
    Inventors: Keith W. Boone, Sunitha Chaparala, Sean Gervais, Robert G. Titemore, Harry J. Ogrinc, Jeffrey G. Hopkins, Roubik Manoukian, Cameron Fordyce
  • Patent number: 8095546
    Abstract: Methods, systems, and apparatus, including computer program products are provided for ranking distinct book content items based on implicit links to other distinct book content items. The implicit links are defined based on the identification of matching features in the distinct book content items. In some implementations, the matching features are uncommon phrases in textual content of the distinct book content items. Edges representing implicit links are generated between distinct nodes representing distinct book content items in a weighted graph. Search results for distinct book content items can be ordered based on the edges connected to the distinct nodes in the weighted graph that represent the distinct book content items.
    Type: Grant
    Filed: January 9, 2009
    Date of Patent: January 10, 2012
    Assignee: Google Inc.
    Inventors: Shumeet Baluja, Yushi Jing
  • Patent number: 8095543
    Abstract: In various embodiments, a method for determining a similarity between two data sets is disclosed, the steps of which include determining a first list of data clusters for a first hierarchically-organized data set, determining a second list of data clusters for a second hierarchically-organized data set, and determining a similarity between the first and second data sets by calculating a maximum flow between the first list of data clusters and the second list of data clusters.
    Type: Grant
    Filed: July 31, 2008
    Date of Patent: January 10, 2012
    Assignee: The United States of America as represented by the Secretary of the Navy
    Inventor: Anjum Gupta
  • Patent number: 8095545
    Abstract: Techniques for query processing in a multi-site search engine are described. During an indexing phase, each site of a multi-site search engine indexes a set of assigned web resources and each site calculates, for each term in the set of assigned web resources, a site-specific upper bound ranking score on the contribution of the term to the search engine ranking function for a query containing the term. During a propagation phase, all sites exchange their site-specific upper bound ranking scores with each other. In response to a site receiving a query, the site determines the set of locally matching resources and compares the ranking score of a locally matching resource with the site-specific upper bound ranking scores for the terms of the query that were received during the propagation phase and determines whether to communicate the query to other sites.
    Type: Grant
    Filed: October 14, 2008
    Date of Patent: January 10, 2012
    Assignee: Yahoo! Inc.
    Inventors: Luca Telloli, Flavio Junqueria, Aristides Gionis, Vassilis Plachouras, Ricardo Baeza-Yates
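    The indexing, propagation, and query phases described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the function names, the additive ranking function (a document's score is the sum of its per-term scores), and the rule of comparing against the k-th best local score are all hypothetical.

```python
def site_upper_bounds(index):
    """Indexing phase: per-term upper bound on score contribution at one site.

    index: {doc_id: {term: score}} for the resources assigned to the site.
    """
    bounds = {}
    for term_scores in index.values():
        for term, score in term_scores.items():
            bounds[term] = max(bounds.get(term, 0.0), score)
    return bounds


def sites_to_forward(query_terms, local_index, remote_bounds, k=1):
    """Query phase: pick the remote sites that could still beat the local results.

    remote_bounds: {site: {term: upper_bound}} received during propagation.
    A site is contacted only if the sum of its upper bounds over the query
    terms exceeds the k-th best local score (an assumed decision rule).
    """
    local_scores = sorted(
        (sum(scores.get(term, 0.0) for term in query_terms)
         for scores in local_index.values()),
        reverse=True,
    )
    threshold = local_scores[k - 1] if len(local_scores) >= k else 0.0
    return [
        site
        for site, bounds in sorted(remote_bounds.items())
        if sum(bounds.get(term, 0.0) for term in query_terms) > threshold
    ]
```

    With an additive ranking function, the sum of a site's per-term bounds is the highest score any of its documents could reach, so a site whose sum does not exceed the local threshold cannot change the top-k results and need not receive the query.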
  • Patent number: 8090722
    Abstract: Systems, methods, and other embodiments associated with logically expanding a document and determining the relevance of the logically expanded document to a query are described. One method embodiment includes searching an index to locate a document identifier for a document in which a query term appears. The method includes determining whether the index entry includes an expansion identifier, and, if so, producing a logically expanded document. The logically expanded document may include both a document associated with the document identifier and a document associated with the expansion identifier. The method may then determine a relevance value of the logically expanded document with respect to the query and may provide a signal corresponding to the relevance value.
    Type: Grant
    Filed: March 21, 2007
    Date of Patent: January 3, 2012
    Assignee: Oracle International Corporation
    Inventors: Muralidhar Krishnaprasad, Meeten Bhavsar
  • Patent number: 8090724
    Abstract: A term analyzer receives an ordered collection of text-based terms. The ordered collection can contain terms from a document that have been filtered to remove “noise” such as stopwords. The term analyzer analyzes groupings of consecutive text-based terms in the ordered collection to identify occurrences of different combinations of text-based terms in the ordered collection. In addition, the term analyzer maintains frequency information representing the occurrences of the different combinations of text-based terms in the collection. The frequency information can then be used to determine relatively significant keywords and/or keyword phrases in the document. In an example configuration, the term analyzer creates a tree in which a first term in a given grouping of the groupings is defined as a parent node in the tree and a second term in the given grouping is defined as a child node of the parent node in the tree.
    Type: Grant
    Filed: November 28, 2007
    Date of Patent: January 3, 2012
    Assignee: Adobe Systems Incorporated
    Inventors: Michael J. Welch, Walter Chang
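    The term tree described in the abstract above can be sketched as follows. This is a minimal illustration, assuming groupings of exactly two consecutive terms and a small hypothetical stopword list; the names `STOPWORDS`, `filter_terms`, `build_term_tree`, and `top_phrases` are not taken from the patent.

```python
from collections import defaultdict

# Hypothetical stopword list; the abstract only says that "noise" such as
# stopwords is filtered out before analysis.
STOPWORDS = {"the", "and", "a", "of"}


def filter_terms(text):
    """Produce the ordered collection of terms with stopwords removed."""
    return [word for word in text.lower().split() if word not in STOPWORDS]


def build_term_tree(terms):
    """Count each pair of consecutive terms.

    The first term of a pair is the parent node, the second is its child,
    and the stored integer is how often that pair occurred.
    """
    tree = defaultdict(lambda: defaultdict(int))
    for parent, child in zip(terms, terms[1:]):
        tree[parent][child] += 1
    return tree


def top_phrases(tree, n=3):
    """Return the n most frequent two-term phrases, ties broken alphabetically."""
    pairs = [
        ((parent, child), count)
        for parent, children in tree.items()
        for child, count in children.items()
    ]
    pairs.sort(key=lambda pair: (-pair[1], pair[0]))
    return pairs[:n]
```

    For the text "the quick fox and the quick dog saw the quick fox", the filtered terms are quick, fox, quick, dog, saw, quick, fox, and the pair (quick, fox) is counted twice, making it the most significant two-term phrase.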
  • Patent number: 8090725
    Abstract: A system and method are described for receiving a plurality of non-standardized data sets and generating a respective plurality of standardized profiles that can be used for efficiently comparing and matching one profile against the other plurality of profiles. One application of this invention is to convert job seekers' resumes and job postings into respective profiles and then permitting either a job seeker to search for job postings that most closely match the job seeker's resume or, conversely, permitting an employer to search for job seekers whose resumes most closely match the employer's job posting.
    Type: Grant
    Filed: April 16, 2010
    Date of Patent: January 3, 2012
    Assignee: CareerBuilder, LLC
    Inventor: Andrew B. Cranfill