Based On Term Frequency Of Appearance Patents (Class 707/750)
  • Publication number: 20130007021
    Abstract: A linkage information output apparatus includes: a linkage information retrieval unit for acquiring, upon receiving source information, destination information linked with the source information, a frequency of occurrence of the source information, a frequency of occurrence of each linked destination information, and a frequency of occurrence of a link between the source information and each destination information from a linkage information accumulation unit; a recognition degree calculation unit calculating, based on each acquired frequency of occurrence, a recognition degree of the source information, a recognition degree of each acquired destination information, and a recognition degree of each link; and a high interest information narrowing unit selecting destination information to output from among each destination information based on a combination of two or more among a recognition degree of the source information, a recognition degree of the destination information, and a recognition degree
    Type: Application
    Filed: December 28, 2010
    Publication date: January 3, 2013
    Applicant: NEC CORPORATION
    Inventors: Hironori Mizuguchi, Yukitaka Kusumura, Yusuke Muraoka, Dai Kusui
  • Patent number: 8346629
    Abstract: A method for improving media search capability includes providing a user with access to an interface that allows the user to provide one or more inputs relating to an item of media (such as an audio or video recording of a song or a cover song), performing a media search in response to the one or more inputs, and presenting search results via an interactive display generated depending upon media ratings, wherein one or more of the media ratings is determined from media ratings inputs depending upon one or more metrics associated with sources or providers of the media ratings inputs.
    Type: Grant
    Filed: January 3, 2012
    Date of Patent: January 1, 2013
    Inventors: Joshua Beroukhim, Joseph Michael
  • Patent number: 8346759
    Abstract: Provided are a system and article of manufacture for searching documents for ranges of numeric values. A number of posting lists is generated, wherein each posting list is associated with a range of consecutive values within the set of values and includes document identifiers for documents including at least one value within the range of consecutive values associated with the posting list, and wherein each document identifier is associated with one value in the set of values included in the document identified by the document identifier. The generated posting lists are stored. A query on a query range of values within the set of values is received and a determination is made of a minimum number of posting lists associated with consecutive values that together include the query range of values. The determined posting lists are merged.
    Type: Grant
    Filed: August 6, 2008
    Date of Patent: January 1, 2013
    Assignee: International Business Machines Corporation
    Inventors: Marcus Felipe Fontoura, Ronny Lempel, Runping Qi, Jason Yeong Zien
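The range-query idea in the abstract above can be illustrated with a minimal sketch. It assumes integer values in [0, 2**bits) and power-of-two aligned ranges as the posting-list ranges; neither detail comes from the abstract, and the function names are hypothetical. Each posting list here holds only document identifiers, which simplifies the per-identifier value association the abstract mentions.

```python
from collections import defaultdict


def build_posting_lists(docs, bits):
    """Map (level, index) -> set of doc ids.

    The key (level, index) stands for the consecutive value range
    [index << level, (index + 1) << level). `docs` maps a document id
    to the integer values it contains (assumed to lie in [0, 2**bits)).
    """
    lists = defaultdict(set)
    for doc_id, values in docs.items():
        for value in values:
            for level in range(bits + 1):
                lists[(level, value >> level)].add(doc_id)
    return lists


def cover(lo, hi):
    """Return aligned ranges whose union is exactly [lo, hi] inclusive."""
    ranges = []
    level = 0
    hi += 1  # work with a half-open interval [lo, hi)
    while lo < hi:
        if lo & 1:
            ranges.append((level, lo))
            lo += 1
        if hi & 1:
            hi -= 1
            ranges.append((level, hi))
        lo >>= 1
        hi >>= 1
        level += 1
    return ranges


def range_query(lists, lo, hi):
    """Merge the posting lists that together cover the query range."""
    result = set()
    for key in cover(lo, hi):
        result |= lists.get(key, set())
    return sorted(result)
```

With this layout a query touches at most about two posting lists per level, at the cost of storing each document identifier once per level.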
  • Publication number: 20120330938
    Abstract: A method and document separation system for separating a set of related documents is described. In one aspect, the method comprises: determining, on a document selection system, quality scores for a plurality of the documents in the set of related documents; obtaining a similarity score for a plurality of pairs of documents in the set of related documents; and on a document selection system, obtaining a first subset of related documents which solves an optimization problem, the first subset of related documents including a portion of the documents in the set of related documents, the optimization problem being a function of one or more quality scores of the documents assigned to the first subset of related documents and one or more similarity scores of pairs of documents assigned to the first subset of related documents.
    Type: Application
    Filed: October 5, 2011
    Publication date: December 27, 2012
    Applicant: ROGERS COMMUNICATIONS INC.
    Inventors: Hyun Chul LEE, Darius BRAZIUNAS, Michael CVET
  • Patent number: 8335787
    Abstract: A method of, and system for, extracting topic words from a collection of documents across multiple and potentially very large number of domains. Documents are selected and ranked based on similarity with at least one seed word, which defines a topic. Seed words may be entered directly by a user or provided by another application. Keywords are extracted from documents determined to be a sufficiently good match to the topic and may be displayed to the user or used as input into word prediction or word analysis and display software. Documents are determined to be a sufficiently good match to the topic using an iterative algorithm starting with the best match and selecting documents containing keywords sufficiently similar to the previously selected documents.
    Type: Grant
    Filed: November 7, 2008
    Date of Patent: December 18, 2012
    Assignee: Quillsoft Ltd.
    Inventors: Fraser Shein, Tom C Nantais, Dan Li
  • Publication number: 20120317126
    Abstract: A system, method and computer program product for identifying near and exact-duplicate documents in a document collection, including for each document in the collection, reading textual content from the document; filtering the textual content based on user settings; determining N most frequent words from the filtered textual content of the document; performing a quorum search of the N most frequent words in the document with a threshold M; and sorting results from the quorum search based on relevancy. Based on the values of N and M near and exact-duplicate documents are identified in the document collection.
    Type: Application
    Filed: August 16, 2012
    Publication date: December 13, 2012
    Applicant: MSC INTELLECTUAL PROPERTIES B.V.
    Inventors: Johannes C. Scholtes, Siebe Bloembergen
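As a rough illustration of the N-most-frequent-words and quorum-search steps described in the abstract above, the following sketch uses a simple regex tokenizer, an optional stopword filter standing in for the "user settings", and match count as the relevancy measure. All of these are assumptions, and the function names are hypothetical.

```python
import re
from collections import Counter

TOKEN = re.compile(r"[a-z0-9]+")


def top_words(text, n, stopwords=frozenset()):
    """Return the n most frequent words of a document after filtering."""
    words = [w for w in TOKEN.findall(text.lower()) if w not in stopwords]
    counts = Counter(words)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:n]]


def quorum_search(query_words, docs, m):
    """Return (doc_id, matches) for documents containing at least m query words."""
    hits = []
    for doc_id, text in docs.items():
        vocabulary = set(TOKEN.findall(text.lower()))
        matches = sum(1 for word in query_words if word in vocabulary)
        if matches >= m:
            hits.append((doc_id, matches))
    # Sort by relevancy (here: number of matching words), best first.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))
```

Setting M equal to N keeps only documents that share every one of the top words, while lowering M admits looser near-duplicates.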
  • Patent number: 8332379
    Abstract: A method and system for identifying nodes with similar content. In one aspect, the method comprises determining a structure of a network of nodes, said structure defined by incoming links and outgoing links between nodes within said network, grouping said nodes within said network into a first set of modules, calculating a first modularity value between each of the modules within the first set, said modularity value indicating a degree of similar content within each module, calculating a topical relevance value for each of the modules, selecting those modules whose topical relevance value exceeds a threshold value and calculating an authority score for the selected modules.
    Type: Grant
    Filed: June 11, 2010
    Date of Patent: December 11, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ning Duan, Pei-Yun S. Hsueh, Yan Liu
  • Patent number: 8332399
    Abstract: A system identifies a set of documents from a corpus of documents that are relevant to a word, phrase or sentence and that were published at approximately a same time period, where each document of the set of documents includes news content and has an associated headline. The system extracts headlines from the set of documents and derives a score for each headline of the extracted headlines based on how many times selected words in each headline occur among all of the extracted headlines.
    Type: Grant
    Filed: May 4, 2010
    Date of Patent: December 11, 2012
    Assignee: Google Inc.
    Inventor: Douwe Osinga
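The headline-scoring step in the abstract above might look something like the sketch below. It assumes whitespace tokenization and treats every non-stopword as a "selected word"; the abstract does not say how words are selected, and the function name is hypothetical.

```python
from collections import Counter


def score_headlines(headlines, stopwords=frozenset()):
    """Score each headline by how often its words occur across all headlines."""
    tokenized = [
        [word for word in headline.lower().split() if word not in stopwords]
        for headline in headlines
    ]
    # Count every occurrence of every selected word over the whole headline set.
    counts = Counter(word for words in tokenized for word in words)
    # A headline's score is the summed corpus count of its distinct words.
    return {
        headline: sum(counts[word] for word in set(words))
        for headline, words in zip(headlines, tokenized)
    }
```

Headlines that share vocabulary with many other headlines on the same topic score higher than outliers.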
  • Patent number: 8332412
    Abstract: A system that incorporates teachings of the present disclosure may include, for example, a network device having a controller to receive multiple streams of content for portions of a multimedia work (MMW), perform a high level analysis for features in each of the streams for the MMW, perform a specialized analysis on the portion having a detected general feature to generate a content analysis output, correlate the content analysis output with other content analysis of the MMW, and output a weighted content description based on the correlation function. Other embodiments are disclosed.
    Type: Grant
    Filed: October 21, 2009
    Date of Patent: December 11, 2012
    Assignee: AT&T Intellectual Property I, LP
    Inventors: Andrea Basso, Gustavo De Los Reyes
  • Patent number: 8332409
    Abstract: A content device may select associated content, such as adverts, for a user selected content item based on textual characterizing data for the associated content and the user selected content item. A term set characterizing the user selected content item is expanded using semantic graphs and similarity values between the expanded term set and term sets describing associated content is calculated. A specific associated content item is then selected based on the similarity values. The semantic graph based term set expansion may allow improved accuracy in selecting appropriate associated content while providing a process that is suitable for resource constrained scenarios. In particular, communication resource, memory resource, and computational resource usage may be kept low.
    Type: Grant
    Filed: August 24, 2009
    Date of Patent: December 11, 2012
    Assignee: Motorola Mobility LLC
    Inventors: Simon Waddington, Ben M. Bratu, Ioannis Kompatsiaris, Fotis Menemenis, Symeon Papadopoulos
  • Patent number: 8326830
    Abstract: Described herein are methods and systems for pattern recognition in web search engine result pages. The input data is a result page from a web search engine as well as an integer number for the results on the page. The output is a regular expression that matches all the results on the page, capturing each result and its individual fields.
    Type: Grant
    Filed: October 6, 2009
    Date of Patent: December 4, 2012
    Assignee: Business Objects Software Limited
    Inventor: Daniel Hollingsworth
  • Patent number: 8325974
    Abstract: Named entity recognition is applied to identify text strings corresponding to character identities in a written work. The textual strings are grouped according to character identity and, from each group, a primary name is selected. A significance is calculated for each of the character identities. The character identities including the primary names are presented in a catalog based on the calculated significance. In some embodiments, character identity identification results are refined by allowing users to vote regarding the significance of the character identities and by granting more weight to the votes of users with a close relationship to the written work.
    Type: Grant
    Filed: March 31, 2009
    Date of Patent: December 4, 2012
    Assignee: Amazon Technologies Inc.
    Inventors: Tom Killalea, Janna S. Hamaker, Eugene Kalenkovich
  • Publication number: 20120303637
    Abstract: Method, system, and computer program product for automatic generation of a word-cloud for a content item are provided. The method includes: extracting terms from a content item using statistical selection criteria; weighting a term by a probability that the term is used as a tag; and generating a visual representation of terms with enhanced representation of terms according to the weighting. Weighting a term by a probability that the term is used as a tag may include determining the relative frequency of the term in a folksonomy of tag terms for a domain.
    Type: Application
    Filed: May 23, 2011
    Publication date: November 29, 2012
    Applicant: International Business Machines Corporation
    Inventors: David Carmel, Ido Guy, Yosi Mass, Haggai Roitman, Erel Uziel
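A minimal sketch of the weighting step in the abstract above, assuming a plain occurrence threshold as the statistical selection criterion and a dictionary of tag usage counts as the folksonomy. Both are illustrative assumptions, as is the function name.

```python
from collections import Counter


def word_cloud_weights(text, tag_counts, min_count=1):
    """Weight extracted terms by their relative frequency as tags.

    `tag_counts` maps a tag term to how often it is used in the
    folksonomy for the domain (a hypothetical input format).
    """
    term_counts = Counter(text.lower().split())
    total_tags = sum(tag_counts.values())
    weights = {}
    for term, count in term_counts.items():
        if count >= min_count and term in tag_counts:
            # Estimated probability that the term is used as a tag.
            tag_probability = tag_counts[term] / total_tags
            weights[term] = count * tag_probability
    return weights
```

A renderer could then map each weight to a font size, so terms that are both frequent in the content and popular as tags appear largest.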
  • Patent number: 8321410
    Abstract: A search engine for searching a corpus improves the relevancy of the results by classifying multiple terms in a search query as a single semantic unit. A semantic unit locator of the search engine generates a subset of documents that are generally relevant to the query based on the individual terms within the query. Combinations of search terms that define potential semantic units from the query are then evaluated against the subset of documents to determine which combinations of search terms should be classified as a semantic unit. The resultant semantic units are used to refine the results of the search.
    Type: Grant
    Filed: June 18, 2007
    Date of Patent: November 27, 2012
    Assignee: Google Inc.
    Inventors: Krishna Bharat, Sanjay Ghemawat, Urs Hoelzle
  • Patent number: 8321204
    Abstract: A system for generating a lexicon of words, organized into weighted categories, from a user defined set of example documents for detecting suspicious e-mails from a mail archive is provided. The system uses a set of example documents and e-mails given by the user to probabilistically find possible lists of critical words. The obtained list is now applied on an archive of e-mails. The system generates an inverted index on the mails from the archive to facilitate search for the key phrases. User feedback is taken on the results obtained and corrections in the lexicon made if necessary. Thus, the mails are scanned based on user feedback, user defined words and automatically generated word list. These lists constantly adapt as e-mails in the archive change. The system then combines all these to present the user with several possible sets of keywords and their relative importance that can be used as a policy for a desired level of accuracy. The system also shows the user any change if the set is modified.
    Type: Grant
    Filed: March 16, 2009
    Date of Patent: November 27, 2012
    Inventors: Malathi Kalyan, Abhi Dattasharma, Ram Lokendar Singh, Sanjay Subhash Panakkal, Dinesh Rama Hedge, Vinay Manjunath Naik
  • Patent number: 8321425
    Abstract: To improve traditional keyword based search engines, the present inventors devised, among other things, systems, methods, and software that use word co-occurrence probabilities not only to identify documents conceptually related to user queries, but also to score and rank search results. One exemplary system combines inverse-document-frequency searching with concept searching based on word co-occurrence probabilities to facilitate finding of documents that would otherwise go unfound using a given query. The exemplary system also allows ranking of search results based on both keyword matching and concept presence, promoting more efficient organization and review of search results.
    Type: Grant
    Filed: August 22, 2008
    Date of Patent: November 27, 2012
    Assignee: Thomson Reuters Global Resources
    Inventors: Tonya Custis, Khalid Al-Kofahi
  • Patent number: 8316016
    Abstract: A system facilitates a search by a user. The system detects selection of one or more words in a document currently accessed by the user, generates a search query using the selected word(s), and retrieves a document based on the search query. When the document includes one or more links corresponding to a linked document, the system analyzes each of the links, prefetches the linked documents corresponding to a number of the links, and presents the document to the user. The system receives selection of one of the links and retrieves the linked document corresponding to the selected link. The system identifies one or more pieces of information in the retrieved document, determines a link to a related document for each of the identified pieces of information, and provides the determined links with the related document to the user.
    Type: Grant
    Filed: September 26, 2011
    Date of Patent: November 20, 2012
    Assignee: Google Inc.
    Inventors: Urs Hoelzle, Monika H Henzinger, Lawrence E Page
  • Publication number: 20120290407
    Abstract: A process is described for assessing the suitability of particular keyword phrases for use in serving contextually relevant content for display on pages of network-accessible sites. In one embodiment, the process involves scoring the key phrases based in part on collected user behavioral data, such as view counts of associated social media content items. A process is also disclosed in which selected keyword phrases on a page are transformed into links that can be selected by a user to view bundled content that is related to such keyword phrases.
    Type: Application
    Filed: May 22, 2012
    Publication date: November 15, 2012
    Inventors: Sid JA. Hubbard, Robin Stevens
  • Publication number: 20120271828
    Abstract: In one implementation, a method includes receiving a request for translation of one or more first keywords from a source language to a target language; and translating, using a machine translation process, the first keywords from the source language into a plurality of second keywords in the target language. The method can also include determining, by a computer system, frequencies with which each of the second keywords occur in a corpus associated with the target language. The method can further include selecting, by the computer system, a subset of the second keywords to use in the target language based on the determined frequencies of occurrence.
    Type: Application
    Filed: April 21, 2011
    Publication date: October 25, 2012
    Applicant: Google Inc.
    Inventor: Mandayam Thondanur Raghunath
  • Patent number: 8290927
    Abstract: Generally, a method and apparatus provides for rating user generated content (UGC) with respect to search engine results. The method and apparatus includes recognizing a UGC data field collected from a web document located at a web location. The method and apparatus calculates: a document goodness factor for the web document; an author rank for an author of the UGC data field; and a location rank for the web location. The method and apparatus thereby generates a rating factor for the UGC data field based on the document goodness factor, the author rank and the location rank. The method and apparatus also outputs a search result that includes the UGC data field positioned in the search results based on the rating factor.
    Type: Grant
    Filed: April 19, 2011
    Date of Patent: October 16, 2012
    Assignee: Yahoo! Inc.
    Inventors: Jaya Kawale, Aditya Pal
  • Patent number: 8290975
    Abstract: A keyword may be expanded into related words, such as for use in information retrieval. The terms comprising words and/or phrases of a large number of documents (e.g., web pages) are processed into a graph data structure, in which the terms are represented as nodes and edges represent the relationships between the nodes, with weights for each edge representing the relevance of the relationship. The graph may be built by selecting each term of a document and considering the terms within a certain number of words to be associated with the selected term; for each such association the weight indicative of the relevance is increased. When the graph is accessed with a keyword, the edges from that keyword's node and their respective weights indicate which other nodes are most relevant to the keyword, thereby providing the corresponding expanded terms.
    Type: Grant
    Filed: March 12, 2008
    Date of Patent: October 16, 2012
    Assignee: Microsoft Corporation
    Inventors: Chi Gao, Mingyu Wang, Weibin Zhu
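The graph construction and lookup described in the abstract above can be sketched as follows. The sketch assumes single-word terms, whitespace tokenization, and a weight that is simply incremented by one per association; the abstract also allows phrases and does not fix the increment, and the function names are hypothetical.

```python
from collections import defaultdict


def build_term_graph(docs, window):
    """Build a weighted term graph from word proximity.

    graph[a][b] counts how often term b appears within `window`
    words of term a across all documents.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for text in docs:
        terms = text.lower().split()
        for i, term in enumerate(terms):
            start = max(0, i - window)
            stop = min(len(terms), i + window + 1)
            for j in range(start, stop):
                if j != i and terms[j] != term:
                    graph[term][terms[j]] += 1
    return graph


def expand_keyword(graph, keyword, k):
    """Return the k terms most strongly linked to the keyword."""
    neighbours = graph.get(keyword, {})
    # Highest edge weight first, alphabetical order to break ties.
    return sorted(neighbours, key=lambda term: (-neighbours[term], term))[:k]
```

A larger window links more distant terms, which tends to raise recall of related words at the cost of noisier edges.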
  • Patent number: 8290946
    Abstract: Two methods for measuring keyword-document relevance are described. The methods receive a keyword and a document as input and output a probability value for the keyword. The first method is a similarity-based approach which uses techniques for measuring similarity between two short-text segments to measure relevance between the keyword and the document. The second method is a regression-based approach based on an assumption that if an out-of-document phrase (the keyword) is semantically similar to an in-document phrase, then relevance scores of the in and out-of document phrases should be close to each other.
    Type: Grant
    Filed: June 24, 2008
    Date of Patent: October 16, 2012
    Assignee: Microsoft Corporation
    Inventors: Wen-tau Yih, Christopher A. Meek
  • Patent number: 8290963
    Abstract: Methods and systems for identification of paraphrases from an index of information items and associated sentence fragments are described. One method described comprises identifying a pair of sentence fragments each having a same associated information item from an index, wherein the index comprises a plurality of information items and associated sentence fragments, and identifying a paraphrase pair from the pair of sentence fragments.
    Type: Grant
    Filed: May 2, 2011
    Date of Patent: October 16, 2012
    Assignee: Google Inc.
    Inventors: Alexandru Marius Pasca, Peter Szabolcs Dienes
  • Patent number: 8285737
    Abstract: Among other disclosed subject matter, a computer-implemented method relating to selecting content for publication includes receiving a term to be used in selecting content for publication. The method includes obtaining information from a record using the received term, the information reflecting a correspondence between contents in a repository and the received term. The method includes determining, using at least the obtained information, a query to be performed on the repository for selecting at least part of the content.
    Type: Grant
    Filed: April 10, 2008
    Date of Patent: October 9, 2012
    Assignee: Google Inc.
    Inventors: Nicholas Lynn, Alexander P. Carobus
  • Publication number: 20120253944
    Abstract: A computing device receives, over a network, information regarding word phrases (e.g., search terms) and determines longevity values associated with content built around the word phrases. The computing device selects, based on the longevity values, a first phrase from the word phrases. Content is built or created around the first phrase, and the built or created content is presented or published over a network such as the Internet.
    Type: Application
    Filed: May 24, 2012
    Publication date: October 4, 2012
    Applicant: DEMAND MEDIA, INC.
    Inventor: Byron William Reese
  • Patent number: 8280893
    Abstract: Methods and systems for identification of paraphrases from an index of information items and associated sentence fragments are described. One method described comprises identifying a pair of sentence fragments each having a same associated information item from an index, wherein the index comprises a plurality of information items and associated sentence fragments, and identifying a paraphrase pair from the pair of sentence fragments.
    Type: Grant
    Filed: May 2, 2011
    Date of Patent: October 2, 2012
    Assignee: Google Inc.
    Inventors: Alexandru Marius Pasca, Peter Szabolcs Dienes
  • Publication number: 20120246177
    Abstract: A content item, e.g., an icon or advertisement content, is selected for placement in a display environment (e.g., on a map or adjacent to a map) in response to a request for the display environment based on a probability that the content item is relevant to a user that is requesting the display environment. The selection is facilitated by content targeting data (e.g., feature selection and query submission) that can be received from user devices while the map space is presented.
    Type: Application
    Filed: June 11, 2012
    Publication date: September 27, 2012
    Applicant: GOOGLE INC.
    Inventors: Michael Perrow, James Robert Macgill, Dana Zhang, Nicholas Verne, David Symonds
  • Patent number: 8271499
    Abstract: In embodiments of the disclosed technology, indexes, such as inverted indexes, are updated only as necessary to guarantee answer precision within predefined thresholds which are determined with little cost in comparison to the updates of the indexes themselves. With the present technology, a batch of daily updates can be processed in a matter of minutes, rather than a few hours for rebuilding an index, and a query may be answered with assurances that the results are accurate or within a threshold of accuracy.
    Type: Grant
    Filed: June 10, 2009
    Date of Patent: September 18, 2012
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Marios Hadjieleftheriou, Nick Koudas, Divesh Srivastava
  • Publication number: 20120226696
    Abstract: In various embodiments, a transcript that represents a media file is created. Keyword candidates that may represent topics and/or content associated with the media content are then extracted from the transcript. Furthermore, a keyword set may be generated for the media content utilizing a mutual information criterion. In other embodiments, one or more queries may be generated based at least in part on the transcript, and a plurality of web documents may be retrieved based at least in part on the one or more queries. Additional keyword candidates may be extracted from each web document and then ranked. A subset of the keyword candidates may then be selected to form a keyword set associated with the media content.
    Type: Application
    Filed: March 4, 2011
    Publication date: September 6, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Albert Joseph Kishan Thambiratnam, Sha Meng, Gang Li, Frank Torsten Bernd Seide
  • Publication number: 20120221548
    Abstract: A system and method are disclosed for determining the geographic range of a search query. A search query may include local intent which influences the results and advertisements that are displayed in response to the search query. The geographic range associated with the local intent may vary depending on the search query. The geographic range may be determined using probabilistic models that analyze historical searches to determine the geographic range of search queries.
    Type: Application
    Filed: April 30, 2012
    Publication date: August 30, 2012
    Applicant: YAHOO! INC.
    Inventors: Jim W. Delli Santi, Ramazan Demir
  • Publication number: 20120215796
    Abstract: Systems and methods facilitate a search and identify documents and associated metadata reflecting content of the documents. In one implementation, a method receives a query comprising a set of search terms, identifies a stored document in response to the query, and determines a score value for the retrieved document based on a similarity between one or more of the query search terms and metadata associated with the identified document. The method locates the identified document in a citation network of baseline query results, the citation network comprising a first set of documents that cite to the identified document and a second set of documents cited to by the identified document. The method further determines a new score value of the identified document as a function of the score value and a quantity and a quality of documents within the first and second set of documents.
    Type: Application
    Filed: February 23, 2012
    Publication date: August 23, 2012
    Inventors: Ling Qin Zhang, Harry R. Silver
  • Patent number: 8250079
    Abstract: A system, method and computer program product for identifying near and exact-duplicate documents in a document collection, including for each document in the collection, reading textual content from the document; filtering the textual content based on user settings; determining N most frequent words from the filtered textual content of the document; performing a quorum search of the N most frequent words in the document with a threshold M; and sorting results from the quorum search based on relevancy. Based on the values of N and M near and exact-duplicate documents are identified in the document collection.
    Type: Grant
    Filed: March 30, 2011
    Date of Patent: August 21, 2012
    Assignee: MSC Intellectual Properties B.V.
    Inventors: Johannes C. Scholtes, Siebe Bloembergen
  • Publication number: 20120209861
    Abstract: A method of operation of a navigation system includes: generating a point of interest term from an uncategorized point of interest; applying a statistical rule to the point of interest term to generate a category score for the point of interest term; determining a normalized category score based on the category score and on matching the point of interest term and the uncategorized point of interest; and generating a category identifier for the uncategorized point of interest based on the normalized category score being highly ranked for displaying on a device.
    Type: Application
    Filed: February 15, 2011
    Publication date: August 16, 2012
    Applicant: TELENAV, INC.
    Inventors: Pramod Lakshmi Narasimha, Aliasgar Mumtaz Husain, Thu-Phuong Tuong Do
  • Patent number: 8239382
    Abstract: A method for creating an index of network data for a set of message data, the index being arranged for searching the set of message data. A method in accordance with an embodiment of the invention includes: creating a set of dialogue records, where each dialogue record is the set of messages corresponding to a dialogue between a sender and recipient pair in a message corpus; logging each of the set of messages in each corresponding dialogue record; and creating an index of terms from the set of messages, the index being arranged to index each term to each dialogue record in which the message comprising the respective term is logged.
    Type: Grant
    Filed: June 24, 2008
    Date of Patent: August 7, 2012
    Assignee: International Business Machines Corporation
    Inventor: Stephen A. Davies
  • Publication number: 20120197910
    Abstract: A system and method for efficiently and accurately identifying relevant document classifications is contemplated. The document analysis system receives classified reference documents along with a relevancy indicator for each document and generates sensory indicators that assist a researcher in identifying relevant classifications that have not been previously researched. In one aspect, the document analysis system generates a table of classifications, the classifications being determined by scoring of each classification cited within each relevant document. The system then determines a sensory indicator (e.g. a color) for each classification that indicates the extent to which the classification has been previously searched. The classification analysis window thus allows the researcher to quickly determine (e.g. by visual inspection) which classification codes have been cited most frequently as well as which classification codes require further search.
    Type: Application
    Filed: October 12, 2010
    Publication date: August 2, 2012
    Inventor: Patrick Sander Walsh
  • Patent number: 8234584
    Abstract: Provided is a computer system including an information providing server and a computer which is coupled to the information providing server, and which collects information, the computer being configured to: record status histories including a history of an operation to a screen which shows a status of the computer, and which is displayed on the computer in chronological order to constitute a set of the status histories; and divide, in a case where a history of an operation of switching the screen is detected from the set of the status histories, based on the history of the operation of switching the screen, the set of the status histories. Accordingly, when a user collects information, navigation information is provided by taking the fact that the user has actually reached useful information into consideration.
    Type: Grant
    Filed: February 18, 2009
    Date of Patent: July 31, 2012
    Assignee: Hitachi, Ltd.
    Inventors: Masahiro Motobayashi, Toshio Okochi, Michiko Sakai, Maki Hayashi, Akio Azuma
  • Patent number: 8229734
    Abstract: An intelligent query system for processing voice-based queries is disclosed, which uses semantic based processing to identify the question posed by the user by understanding the meaning of the user's utterance. Based on identifying the meaning of the utterance, the system selects a single answer that best matches the user's query. The answer that is paired to this single question is then retrieved and presented to the user. The system, as implemented, accepts environmental variables selected by the user and is scalable to provide answers to a variety and quantity of user-initiated queries.
    Type: Grant
    Filed: June 23, 2008
    Date of Patent: July 24, 2012
    Assignee: Phoenix Solutions, Inc.
    Inventor: Ian Bennett
  • Publication number: 20120179696
    Abstract: A system and process for tagging electronic documents or other electronic content with concepts mentioned, contained, or otherwise described in that content. Once tagged, the content may be searchable, indexable, and retrievable in order to provide that content to an end user or another recipient. The system may be configured to handle a considerable number of asset files and a large number of users, workflows, and access applications simultaneously. The system may auto-tag the content and also may include a user interface for confirming and updating those tags and for manually creating new or additional tags. Content may include documents such as medical documents relating to procedures, diagnoses, medications or other domains. Alternatively, the content may include information about various care providers, in order to allow a user to locate a physician meeting one or more desired criteria.
    Type: Application
    Filed: January 11, 2011
    Publication date: July 12, 2012
    Applicant: INTELLIGENT MEDICAL OBJECTS, INC.
    Inventors: Regis CHARLOT, Frank NAEYMI-RAD, Alina OGANESOVA, Andre YOUNG, Andrei NAEYMI-RAD, Aziz BODAL, David HAINES, Jose MALDONADO, Masayo KOBASHI, Stephanie SCHAEFER
  • Publication number: 20120173550
    Abstract: The present invention is a system and method to improve the impact of marketing messages broadcast to various web communities. Predefined marketing communication keywords are matched against tags set by private and public users' tagging communities. Semantic analysis is applied to the keywords and the tags, and the resulting associations allow the relevance of the marketing keywords to be determined. Matches indicate where marketing people have met their goals, while matching gaps indicate marketing messages that have not been perceived by the companies or the market. Valuable feedback is thus obtained to help reinforce the initial messages that were not received or to replace the message wording with the wording perceived from the identified market tags.
    Type: Application
    Filed: June 1, 2010
    Publication date: July 5, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Didier Boullery, Hisham E. El-Shishiny
  • Patent number: 8214375
    Abstract: A user data engine records profile data supplied by a user and usage data that is based on interactions between the user and a software application. A group data engine receives a set of user data comprising the profile data and the usage data for each user of a plurality of users. The group data engine determines a similarity value between each pair of users of the software application. The group data engine identifies groups of similar users based on the similarity values by executing one or more clustering algorithms. A user may then search for other users and groups of users of the software application and may then receive information from the users and/or groups of users that is related to use of the software application.
    Type: Grant
    Filed: March 6, 2009
    Date of Patent: July 3, 2012
    Assignee: Autodesk, Inc.
    Inventors: George Fitzmaurice, Tovi Grossman, Justin Frank Matejka, Wei Li
  • Patent number: 8209336
    Abstract: The invention relates to a device and a method for processing information of a database. The device consists of means for pre-defining at least one related area in a representation space comprising positions that can receive elements that are representative of the data, said space including at least one complementary area having no data representation. It also comprises means for specifying at least one data bootstrapping element for each related area, positioning the bootstrapping elements at bootstrapping positions in the related areas, successively determining new elements from elements already positioned, in accordance with at least one proximity order relation based on contents of the data, and successively positioning the new elements at positions neighboring the positions occupied by the data elements already positioned.
    Type: Grant
    Filed: November 26, 2004
    Date of Patent: June 26, 2012
    Assignee: Thomson Licensing
    Inventors: Nadine Patry, David Bihanic, Thierry Viellard
  • Publication number: 20120158747
    Abstract: Systems and methods for performing authority based content searching are disclosed. In some embodiments, a method comprises receiving user queries containing authority keywords and relevancy keywords and ranking a set of search results on the basis of the authority of the authors of entries within the search results. The authority of each author is expressed in an authority quotient which is calculated by determining an authority keyword score, a name score, a domain name score and a credential score based on the authority keyword provided by the user.
    Type: Application
    Filed: December 16, 2011
    Publication date: June 21, 2012
    Inventors: Michael Satow, Jack Mitchel Widman
  • Patent number: 8200678
    Abstract: A computing device receives, over a network, information regarding word phrases (e.g., search terms) and determines longevity values associated with content built around the word phrases. The computing device selects, based on the longevity values, a first phrase from the word phrases. Content is built or created around the first phrase, and the built or created content is presented or published over a network such as the Internet.
    Type: Grant
    Filed: July 20, 2011
    Date of Patent: June 12, 2012
    Assignee: Demand Media, Inc.
    Inventor: Byron William Reese
  • Publication number: 20120143881
    Abstract: Relay of information from technical documentation by contact center workers to assist clients is limited by industry standard storage formats and query mechanisms. A method is disclosed for processing technical documents and tagging them against a Telecom Hardware domain ontology. The method comprises classical ontological Natural Language Processing (NLP) approaches to extract information from both text segments and tables, identifying text segments, named entities and relations between named entities described by an existing T-Box. A method for scoring candidate object property assertions derived from text before populating the Telecom Hardware ontology is also disclosed.
    Type: Application
    Filed: December 5, 2011
    Publication date: June 7, 2012
    Applicant: INNOVATIA INC.
    Inventors: Christopher Baker, Alexander Kouznetsov
  • Publication number: 20120143860
    Abstract: The present invention extends to methods, systems, and computer program products for identifying key phrases within documents. Embodiments of the invention include using a tag index to determine what a document primarily relates to. For example, an integrated data flow and extract-transform-load pipeline crawls, parses, and word-breaks large corpora of documents in database tables. Documents can be broken into tuples. The tuples can be sent to a heuristically based algorithm that uses statistical language models and weight+cross-entropy threshold functions to summarize the document into its “top N” most statistically significant phrases. Accordingly, embodiments of the invention scale efficiently (e.g., linearly), and (potentially large numbers of) documents can be characterized by salient and relevant key phrases (tags).
    Type: Application
    Filed: December 3, 2010
    Publication date: June 7, 2012
    Applicant: Microsoft Corporation
    Inventors: Sorin Gherman, Kunal Mukerjee
  • Patent number: 8195670
    Abstract: Disclosed are systems for, and methods of, automatically detecting and treating field values of a particular field as null field values in records of a database. The system and method provide automatic treatment of these field values as null field values by calculating a critical frequency for the field. Based on the critical frequency of the field, the system and method treats field values that occur more than the critical frequency of the field as null field values and treats field values that occur less than the critical frequency as non-null field values.
    Type: Grant
    Filed: April 24, 2009
    Date of Patent: June 5, 2012
    Assignee: LexisNexis Risk & Information Analytics Group Inc.
    Inventor: David Alan Bayliss
  • Patent number: 8195660
    Abstract: Various embodiments described herein provide systems, methods, and software to automatically reorder search results presented to users based on information specific to the user or the computing environment of the user. Some embodiments include a data store holding user or environment specific data that is used to identify search results that are more likely to be relevant to the user. These and other embodiments are described in greater detail herein.
    Type: Grant
    Filed: June 29, 2007
    Date of Patent: June 5, 2012
    Assignee: Intel Corporation
    Inventors: Barbara Rosario, William Noah Schilit
  • Patent number: 8189963
    Abstract: Systems, methods, and computer-readable media for matching a visual media object to an advertisement are provided. Embodiments of the present invention include receiving un-categorized visual media objects, automatically categorizing received visual media objects into subject-matter categories using image recognition technology, and retrieving advertisements assigned to the same subject-matter category for presentation in association therewith.
    Type: Grant
    Filed: November 13, 2007
    Date of Patent: May 29, 2012
    Assignee: Microsoft Corporation
    Inventors: Li Li, Brian Burdick, Sachin Dhawan, Ewa Dominowska
  • Publication number: 20120131021
    Abstract: Disclosed herein are a method, a system, and a computer product for generating a snippet for an entity, wherein each snippet comprises a plurality of sentiments about the entity. One or more textual reviews associated with the entity are selected. A plurality of sentiment phrases are identified based on the one or more textual reviews, wherein each sentiment phrase comprises a sentiment about the entity. One or more sentiment phrases from the plurality of sentiment phrases are selected to generate a snippet.
    Type: Application
    Filed: June 24, 2011
    Publication date: May 24, 2012
    Inventors: Sasha Blair-Goldensohn, Kerry Hannan, Ryan McDonald, Tyler Neylon, Jeffrey C. Reynar
  • Publication number: 20120130978
    Abstract: Methods, systems, and apparatus, including computer program products, for presenting search query suggestions. In an aspect, query triggers in a resource are identified at a client device. For each query trigger identified in the resource, a rank score for the query trigger based on query trigger attributes is calculated at the client device. The query triggers are ranked at the client device based on the rank scores. Search query suggestions are generated at the client device from the query triggers identified in the resource. The search query suggestions include terms of the query triggers, expansion terms of the query triggers, and search query suggestions generated from templates applied to the terms of the query triggers and expansion terms of the query triggers. The search query suggestions are presented at the client device according to the rank of the corresponding query triggers.
    Type: Application
    Filed: August 4, 2009
    Publication date: May 24, 2012
    Applicant: Google Inc.
    Inventors: Youlin Li, Goang-Tay Hsu, Linda Lin Lin