Based On Term Frequency Of Appearance Patents (Class 707/750)

User interface methods and systems for selecting and presenting content based on user navigation and selection actions associated with the content

Patent number: 8086602

Abstract: A user-interface method of selecting and presenting a collection of content items based on user navigation and selection actions associated with the content is provided. The method includes associating a relevance weight on a per user basis with content items to indicate a relative measure of likelihood that the user desires the content item. The method includes receiving a user's navigation and selection actions for identifying desired content items, and in response, adjusting the associated relevance weight of the selected content item and group of content items containing the selected item. The method includes, in response to subsequent user input, selecting and presenting a subset of content items and content groups to the user ordered by the adjusted associated relevance weights assigned to the content items and content groups.

Type: Grant

Filed: February 24, 2011

Date of Patent: December 27, 2011

Assignee: Veveo Inc.

Inventors: Murali Aravamudan, Kajamalai G. Ramakrishnan, Rakesh Barve, Sashikumar Venkataraman, Ajit Rajasekharan
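
As a rough illustration of the abstract above (patent 8086602), the sketch below keeps a per-user weight for each content item and for each content group, raises both when the user selects an item, and orders later results by the adjusted weights. The class name, the boost sizes, and the rule of adding item and group weights together are assumptions made for this example; the abstract does not specify them.

```python
from collections import defaultdict


class RelevanceModel:
    """Per-user relevance weights for content items and their groups (illustrative)."""

    def __init__(self, item_to_group, item_boost=1.0, group_boost=0.25):
        self.item_to_group = item_to_group  # item -> group that contains it
        self.item_boost = item_boost        # hypothetical increment for a selected item
        self.group_boost = group_boost      # hypothetical increment for its group
        # user -> item/group -> weight, all starting at 0.0
        self.item_weight = defaultdict(lambda: defaultdict(float))
        self.group_weight = defaultdict(lambda: defaultdict(float))

    def record_selection(self, user, item):
        """Raise the weight of the selected item and of the group containing it."""
        self.item_weight[user][item] += self.item_boost
        self.group_weight[user][self.item_to_group[item]] += self.group_boost

    def ranked_items(self, user, candidates):
        """Order candidates by item weight plus group weight, ties by name."""
        def score(item):
            group = self.item_to_group[item]
            return self.item_weight[user][item] + self.group_weight[user][group]

        return sorted(candidates, key=lambda item: (-score(item), item))


model = RelevanceModel({"news": "tv", "sports": "tv", "jazz": "music"})
model.record_selection("u1", "sports")
print(model.ranked_items("u1", ["jazz", "news", "sports"]))
```

After one selection of "sports", the other item in the same group ("news") also moves ahead of unrelated items for that user, while a different user still sees the default order.
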
Tag suggestions based on item metadata

Patent number: 8086504

Abstract: Tag suggestions enable a hosting entity such as a website to determine one or more tags to suggest to a user for association with a particular item within an electronic catalog. After this determination, the hosting entity may suggest the determined tags to the user. To determine these tags, the hosting entity may employ techniques to determine items related to the particular item. The hosting entity then suggests some or all of the tags associated with the related items. Additionally or alternatively, the hosting entity may determine certain metadata associated with the particular item. The entity then may suggest this metadata, or some related phrase or tag, to the user for association with the particular item. However the tag suggestions are determined, the hosting entity may rank the tag suggestions to determine which tags to present to the user or to determine an order in which to present the tags.

Type: Grant

Filed: September 6, 2007

Date of Patent: December 27, 2011

Assignee: Amazon Technologies, Inc.

Inventors: Russell A. Dicker, Waqas Ahmed, Aaron D. Wilson, Scott Allen Mongrain, Florin V. Manolache, Valentin Radu Munteanu, Val Dan Dar Ion I. Rosca, Corneliu Gabriel Alexandru Rudeanu
SYSTEMS AND METHODS FOR ANALYZING PATENT RELATED DOCUMENTS

Publication number: 20110307499

Abstract: Methods and systems are disclosed that analyze patent-related documents having at least one property type. In one implementation, a method involves displaying, in a first graphical element, identifiers of the patent-related documents. The method also involves analyzing the patent-related documents to determine at least one property value for the property type. The property value includes a string of one or more words describing subject matter associated with the patent-related documents and occurring in a subset of the patent-related documents. The method also displays a second graphical element associated with the property type. The second graphical element includes the property value. The method receives, at the second graphical element, a user selection of the property value. The method displays, in the first graphical element, identifiers of the subset of the patent-related documents in which the property value occurs.

Type: Application

Filed: June 11, 2010

Publication date: December 15, 2011

Inventors: Brian K. Elias, Matthew C. Morrise
Detecting spam documents in a phrase based information retrieval system

Patent number: 8078629

Abstract: An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are identified that predict the presence of other phrases in documents. Documents are then indexed according to their included phrases. A spam document is identified based on the number of related phrases included in a document.

Type: Grant

Filed: October 13, 2009

Date of Patent: December 13, 2011

Assignee: Google Inc.

Inventor: Anna Lynn Patterson
Methods and systems for improving text segmentation

Patent number: 8078633

Abstract: Methods and systems for improving text segmentation are disclosed. In one embodiment, at least a first segmented result and a second segmented result are determined from a string of characters, a first frequency of occurrence for the first segmented result and a second frequency of occurrence for the second segmented result are determined, and an operable segmented result is identified from the first segmented result and the second segmented result based at least in part on the first frequency of occurrence and the second frequency of occurrence.

Type: Grant

Filed: March 15, 2010

Date of Patent: December 13, 2011

Assignee: Google Inc.

Inventors: Gilad Israel Elbaz, Jacob L. Mandelson
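
To make the idea in the abstract above (patent 8078633) concrete, the following sketch enumerates candidate segmentations of a character string and picks one based on frequency of occurrence. Scoring a segmentation by the product of its word frequencies is an assumption of this example, as are the function names and the toy frequency table; the abstract only says the choice is based at least in part on the two frequencies.

```python
from math import prod


def segmentations(text, vocabulary):
    """Yield every split of `text` into words that appear in `vocabulary`."""
    if not text:
        yield []
        return
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in vocabulary:
            for rest in segmentations(text[end:], vocabulary):
                yield [head] + rest


def best_segmentation(text, frequency):
    """Return the candidate with the highest score, or None if none exists.

    Assumed scoring rule: product of the relative frequencies of the words.
    """
    candidates = list(segmentations(text, frequency))
    if not candidates:
        return None
    return max(candidates, key=lambda words: prod(frequency[w] for w in words))


# Hypothetical relative frequencies, for illustration only.
frequency = {"the": 0.05, "rapist": 0.0001, "therapist": 0.002}
print(best_segmentation("therapist", frequency))
```

With these made-up numbers, the single-word reading scores 0.002 and the two-word reading scores 0.05 × 0.0001, so the single word is kept.
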
DOCUMENT RANKING SYSTEM AND METHOD BASED ON CONTRIBUTION SCORING

Publication number: 20110302176

Abstract: Disclosed are a document ranking system and method based on contribution scoring. The document ranking system includes: a content score calculating unit for calculating content scores for documents with respect to at least one word contained in the documents, with regard to each such word; a contribution score calculating unit for calculating contribution scores for the documents with respect to jointly occurring words; and a ranking unit for ranking the documents with respect to the at least one word, with regard to each such word, by using the content scores and the contribution scores.

Type: Application

Filed: December 15, 2009

Publication date: December 8, 2011

Applicant: NHN CORPORATION

Inventors: Dong Jin Kim, Sang-Wook Kim
Systems and methods of building and using custom word lists

Patent number: 8073835

Abstract: Standard word lists that are often used for such operations as predictive text, spell checking, and word completion are based on general linguistic data that might not accurately reflect actual text usage patterns of particular users. Systems and methods of building and using a custom word list for use in text operations on an electronic device are provided. A collection of text items associated with a user of the electronic device is scanned to identify words in the text items. A weighting is then assigned to each identified word, and the words and corresponding weightings are stored.

Type: Grant

Filed: January 4, 2010

Date of Patent: December 6, 2011

Assignee: Research In Motion Limited

Inventors: Robert J. Lowles, Jason T. Griffin, Michael S. Brown
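
The steps in the abstract above (patent 8073835) can be sketched as scanning a user's text items, identifying words, and storing a weighting for each. Using relative frequency as the weighting, the simple word pattern, and the `complete` helper for word completion are all assumptions of this example rather than details from the patent.

```python
import re
from collections import Counter


def build_custom_word_list(text_items):
    """Scan text items and return {word: weight}, weight = relative frequency."""
    counts = Counter()
    for text in text_items:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


def complete(prefix, word_list, limit=3):
    """Suggest stored words that start with `prefix`, heaviest first."""
    matches = [w for w in word_list if w.startswith(prefix)]
    return sorted(matches, key=lambda w: (-word_list[w], w))[:limit]


items = ["Meeting moved to Monday", "monday meeting notes", "see you Monday"]
word_list = build_custom_word_list(items)
print(complete("m", word_list))
```

Because the weights come from the user's own text, a word this user types often ("monday") is suggested ahead of words a general dictionary might rank higher.
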
Grouping event notifications in a database system

Patent number: 8065365

Abstract: Techniques for grouping events in a computing system are provided. A registrant sends, to a database server, a request to register to receive a single notification based on the occurrence of multiple events that satisfy certain criteria, referred to as grouping attributes. Such registrations are referred to as grouping registrations. An eventing mechanism in the database server receives and maintains grouping registrations. When an event is received, the eventing mechanism determines whether the event has been registered for in an active grouping registration, i.e., one whose start time has passed but whose completion criteria are not yet satisfied. If so, then the eventing mechanism updates grouping data associated with the grouping registration. When the completion criteria of a grouping registration are satisfied, the eventing mechanism sends a notification to the registrant and/or other intended recipient(s).

Type: Grant

Filed: May 2, 2007

Date of Patent: November 22, 2011

Assignee: Oracle International Corporation

Inventors: Abhishek Saxena, Neerja Bhatt
Parsing, analysis and scoring of document content

Patent number: 8065307

Abstract: The present invention may be used to analyze subject content, search and analyze reference content, compare the subject and reference content for similarity, and output comparison reports between the subject and reference content. The present invention may incorporate and utilize text from intrinsic and/or extrinsic subject documents. The analysis may employ a variety of metrics, including scores generated from a natural language processing system, scores based on classification similarity, scores based on proximity similarity, and in the case of analysis of patent documents, scores based on measurement of claims.

Type: Grant

Filed: December 20, 2006

Date of Patent: November 22, 2011

Assignee: Microsoft Corporation

Inventors: Brian Dean Haslam, Patrick Wayne John Evans, Arul Menezes, Patrick Santos
Information providing system and information providing method

Patent number: 8065289

Abstract: According to an aspect of an embodiment, a method comprises editing information related to a part according to a user operation, extracting characteristic information representing a characteristic of the part from information of an object to be edited when an operation to select the part is performed, searching a database for information similar to the characteristic information, searching the database for knowledge information related to the characteristic information, and displaying the knowledge information on a display unit.

Type: Grant

Filed: March 4, 2008

Date of Patent: November 22, 2011

Assignee: Fujitsu Limited

Inventors: Yukihiko Furumoto, Osamu Takizawa
Compressing Short Text Messages

Publication number: 20110276576

Abstract: A method of compressing short text messages, comprising: generating an index code comprising an association of keywords in the text messages with indices, the index code is logically divided into segments of variable size, each segment comprising at least one bucket, being a constant range of indices; adjusting the index code according to a natural keyword frequency distribution and to statistical analysis of the text messages; associating short indices with frequent keywords in the text messages; converting the text messages into compressed text messages in which at least some of the keywords are replaced by the associated indices; and updating the association between the indices and the keywords, updating the segments, and updating the updating frequency in respect to a usage keyword frequency distribution and temporal changes thereof.

Type: Application

Filed: May 5, 2010

Publication date: November 10, 2011

Inventor: Mimran David
SYSTEM AND METHOD FOR MODELLING AND PROFILING IN MULTIPLE LANGUAGES

Publication number: 20110276577

Abstract: A system and method for generating feature vectors of documents in different languages are provided. The feature vectors provide scores associated with keywords defined in a base language for use by a profiler for generating or updating a user profile. The system and method use a plurality of keyword sets comprising: a base language keyword set comprising a plurality of base language keywords each associated with a respective identifier (ID); and a second language keyword set comprising a plurality of second language keywords each corresponding in meaning to a respective one of the base language keywords and associated with the ID of the corresponding base language keyword. One of a plurality of tokenizers is selected to parse a document based on the language of the document and to generate the feature vector using the keyword set of the corresponding language.

Type: Application

Filed: July 23, 2010

Publication date: November 10, 2011

Applicant: KINDSIGHT, INC.

Inventors: Hong Yao, Wu Wang, Mei Marker, Kelvin Edmison, Wei Wang
Systems and methods for measuring behavior characteristics

Patent number: 8055663

Abstract: Systems and methods for measuring behavior characteristics. For at least one specific user, a first concern score for respective key terms is calculated according to use frequency of respective key terms of network content corresponding to the specific user and all users. A first relation matrix for at least one specific key term is calculated according to at least two users corresponding to respective interaction behaviors between the key terms and a type weighting corresponding to respective interaction behaviors. A first interaction score for the specific user regarding the specific key term is calculated according to the first relation matrix. A first characteristic score for the specific user regarding the specific key term is calculated according to the first concern score and the first interaction score.

Type: Grant

Filed: December 20, 2006

Date of Patent: November 8, 2011

Assignee: Institute for Information Industry

Inventors: Tse-Ming Tsai, Chia-Chun Shih
ESTABLISHING SEARCH RESULTS AND DEEPLINKS USING TRAILS

Publication number: 20110264673

Abstract: Search and browse trails are temporally-ordered sequences of web pages visited by a user during post-search query navigation beginning with a page associated with one of the search results. The trails can provide useful information for a number of search-related purposes. For example, these trails can be used to leverage the post-query behavior of other users to help the current user search more effectively and allow them to make more informed search interaction decisions. The trails can also be used to establish search results and refine search result rankings, select and evaluate deeplinks, and recommend multi-step trails as an alternative to or enhancement for existing search result presentation techniques.

Type: Application

Filed: April 27, 2010

Publication date: October 27, 2011

Applicant: Microsoft Corporation

Inventors: Ryen W. White, Peter Bailey, Nikhil Dandekar, Adish Singla, Jeff Huang
Duplicate entry detection system and method

Patent number: 8046372

Abstract: A computer system and method for determining whether the subject matter described in a received document is substantially similar to the subject matter of other documents in a document corpus, such that the received document can be considered a duplicate document. After receiving a first document, a set of tokens for the first document is generated. A non-fielded relevance search on a token index is executed. The relevance search returns a set of candidate duplicate documents with scores corresponding to each candidate document. For each candidate document with a score above a threshold, filtering is performed on each candidate document to determine whether each candidate document is a true duplicate of the first document. A set of candidate documents with a score above the threshold that were not disqualified as candidate documents is then provided.

Type: Grant

Filed: May 25, 2007

Date of Patent: October 25, 2011

Assignee: Amazon Technologies, Inc.

Inventors: Srikanth Thirumalai, Aswath Manoharan, Mark J. Tomko, Grant M. Emery, Vijai Mohan, Egidio Terra
METHOD AND SYSTEM OF CONTENT RECOMMENDATION

Publication number: 20110258196

Abstract: A method of content recommendation, includes: generating a first digital mathematical representation of contents to associate the contents with a first plurality of words describing the contents; generating a second digital mathematical representation of text documents different from the contents to associate the documents with a second plurality of words; processing the first and second pluralities of words to determine a common plurality of words; processing the first and second digital mathematical representations to generate a common digital mathematical representation of the contents and the text documents based on the common plurality of words; and providing content recommendation by processing the common digital mathematical representation.

Type: Application

Filed: December 30, 2008

Publication date: October 20, 2011

Inventors: Skjalg Lepsoy, Gianluca Francini, Fabrizio Antonelli
Content item retrieval based on a free text entry

Patent number: 8041700

Abstract: A method and apparatus for textual searching of a database is provided herein. During operation a user will input a letter into a search engine. The search engine will score words based on the letter and display results of the highest-scored words. Another letter will again be received and the process repeated. In situations where titles are returned to the user, additional steps of associating the words with a title and scoring the title take place. The highest-scored titles are provided to the user as the displayed results.

Type: Grant

Filed: April 7, 2009

Date of Patent: October 18, 2011

Assignee: Motorola Mobility, Inc.

Inventor: Changxue Ma
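
A minimal sketch of the letter-by-letter search in the abstract above (patent 8041700): each new letter extends the typed prefix, words are scored against it, and titles are scored through their associated words. The scoring rule used here (the fraction of a word covered by the typed prefix, with a title taking the best score among its words) is a hypothetical stand-in, since the abstract does not say how scores are computed.

```python
def score_titles(typed, titles, top_n=3):
    """Return up to `top_n` titles, best-scoring first, for the letters typed so far."""
    typed = typed.lower()
    scored = []
    for title in titles:
        # Assumed rule: a word scores len(prefix) / len(word) when it starts
        # with the prefix; the title takes the best score among its words.
        best = max(
            (len(typed) / len(word)
             for word in title.lower().split()
             if word.startswith(typed)),
            default=0.0,
        )
        if best > 0:
            scored.append((best, title))
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in ranked[:top_n]]


titles = ["Star Wars", "Stargate", "Toy Story"]
for typed in ("s", "st", "sta"):
    print(typed, score_titles(typed, titles))
```

Calling the function again after every keystroke reproduces the repeat-per-letter loop the abstract describes, with the result list narrowing as the prefix grows.
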
LARGE SCALE CONCEPT DISCOVERY FOR WEBPAGE AUGMENTATION USING SEARCH ENGINE INDEXERS

Publication number: 20110252045

Abstract: Disclosed is a method and system for retrieving data; extracting information from the data; learning to disambiguate the extracted information such that a particular sense of each phrase within the extracted information is determined; generating a disambiguation classifier from the learning to disambiguate step, the disambiguation classifier configured to determine a sense of a phrase within a document; learning to select a portion of the information as being relevant to a theme of the data; generating a selection classifier from the learning to select step, the selection classifier configured to select a topic in a document that is relevant to a theme of the document; and using the disambiguation classifier and the selection classifier by an indexing computer to determine a set of topics from a web document retrieved by the indexing computer.

Type: Application

Filed: April 7, 2010

Publication date: October 13, 2011

Applicant: Yahoo! Inc.

Inventors: Priyank Shanker Garg, Rohan Monga, Hemanth Sambrani, Sudharsan Vasudevan
Web site search and selection method

Patent number: 8037048

Abstract: According to the web site search and selection method, in response to a search query a relevance score is assigned to each page of the web sites addressed by the search engine. Then, for each web site addressed by the search engine, the relevance scores of the individual pages are added together, after weighting them by a correction factor indicative at least of the number of pages of the site itself. In this manner, in response to the search query an overall relevance value for the sites addressed by the search engine is obtained.

Type: Grant

Filed: February 11, 2008

Date of Patent: October 11, 2011

Assignee: Web Lion S.A.S. di Panarese Marco & Co.

Inventor: Marco Panarese
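
The aggregation in the abstract above (patent 8037048) amounts to summing per-page relevance scores for each site after weighting by a correction factor that depends on the number of pages. The sketch below assumes a correction factor of 1/sqrt(n) purely for illustration; the abstract does not give the actual factor, and the function names and site names are hypothetical.

```python
import math


def site_relevance(page_scores, correction=lambda n: 1 / math.sqrt(n)):
    """Sum page scores, weighted by a correction factor based on page count.

    The default 1/sqrt(n) factor is an assumption made for this example.
    """
    n = len(page_scores)
    if n == 0:
        return 0.0
    return correction(n) * sum(page_scores)


def rank_sites(sites):
    """sites: {site: [page relevance scores]} -> site names, most relevant first."""
    return sorted(sites, key=lambda s: (-site_relevance(sites[s]), s))


sites = {
    "a.example": [0.9],        # one highly relevant page
    "b.example": [0.2] * 9,    # many weakly relevant pages
}
print(rank_sites(sites))
```

Without a correction factor the nine-page site would win on raw sum alone; with it, a site cannot outrank a focused one simply by having more pages.
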
Methods and Systems for Extracting Domain Phrases

Publication number: 20110246486

Abstract: Methods and systems for extracting domain phrases are provided. First, a domain phrase database including a plurality of domain phrases is provided. For a candidate phrase, it is determined whether the candidate phrase is a domain phrase according to an occurrence condition of at least one part of the candidate phrase in the domain phrases of the domain phrase database and the occurrence condition of the at least one part of the candidate phrase at different relative positions in respective domain phrases.

Type: Application

Filed: October 7, 2010

Publication date: October 6, 2011

Applicant: INSTITUTE FOR INFORMATION INDUSTRY

Inventors: Ting-Chun Peng, Chia-Chun Shih, Wen-Tai Hsieh
Using message sampling to determine the most frequent words in a user mailbox

Patent number: 8032537

Abstract: A method is presented for generating a list of frequently used words for an email application on a server computer. When a request is received for a word frequency list for emails stored in a user's mailbox, a word frequency list is returned if one exists. If the word frequency list does not exist, an asynchronous process is started on the server computer to generate a word frequency list. If the word frequency list exists but it is older than an aging limit, an asynchronous process is started on the server computer to regenerate the word frequency list. The word frequency list is stored in the user's mailbox along with a timestamp indicating the date and time that the list was created or updated.

Type: Grant

Filed: December 10, 2008

Date of Patent: October 4, 2011

Assignee: Microsoft Corporation

Inventors: Ashish Consul, Suryanarayana M. Gorti, Michael Geoffrey Andrew Wilson, James C. Kleewein
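
The request handling in the abstract above (patent 8032537) can be sketched as a cache with an aging limit: return the stored list if there is one, and start a background job when the list is missing or too old. The `Mailbox` class, the thread-pool executor, and the parameter names are assumptions of this example; the message sampling mentioned in the title is not described in the abstract and is left out.

```python
import time
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class Mailbox:
    """Hypothetical mailbox: messages plus an optional stored word list."""

    def __init__(self, messages):
        self.messages = messages
        self.word_list = None  # (timestamp, [(word, count), ...]) once generated


_executor = ThreadPoolExecutor(max_workers=1)


def _generate(mailbox, top_n):
    """Count words across all messages and store the list with a timestamp."""
    counts = Counter(w for m in mailbox.messages for w in m.lower().split())
    mailbox.word_list = (time.time(), counts.most_common(top_n))


def request_word_list(mailbox, aging_limit_s=3600.0, top_n=10):
    """Return (existing list or None, Future of any generation job started)."""
    cached = mailbox.word_list
    stale = cached is not None and time.time() - cached[0] > aging_limit_s
    job = None
    if cached is None or stale:
        # Generation runs asynchronously; the caller is not blocked on it.
        job = _executor.submit(_generate, mailbox, top_n)
    return (cached[1] if cached is not None else None), job


mailbox = Mailbox(["hello world", "hello again"])
first, job = request_word_list(mailbox)   # nothing stored yet, job started
job.result()                              # wait only for demonstration
second, _ = request_word_list(mailbox)    # stored list returned immediately
print(first, second)
```

A stale list is still returned to the caller while regeneration runs, which matches the abstract's wording that the list is returned if one exists.
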
Vector space method for secure information sharing

Patent number: 8024344

Abstract: Presented are systems and methods for securely sharing confidential information. In such a method, term vectors corresponding to ones of a plurality of confidential terms included in a plurality of confidential documents are received. Each of the received term vectors is mapped into a vector space. Non-confidential documents are mapped into the vector space to generate a document vector corresponding to each non-confidential document, wherein the generation of each document vector is based on a subset of the received term vectors. At least one of the non-confidential documents is identified in response to a query mapped into the vector space.

Type: Grant

Filed: June 5, 2008

Date of Patent: September 20, 2011

Assignee: Content Analyst Company, LLC

Inventor: Roger Bradford
Systems and methods for determining a tag match ratio

Patent number: 8024342

Abstract: The present invention is directed towards systems and methods for determining a tag match ratio. The method according to one embodiment of the present invention comprises selecting a content item, identifying one or more tags that are associated with the content item and determining a weight for each of the one or more tags associated with the content item. The method further comprises extracting one or more keywords from the content item. A tag match ratio for the one or more tags associated with the content item is then calculated and stored.

Type: Grant

Filed: July 31, 2008

Date of Patent: September 20, 2011

Assignee: Yahoo! Inc.

Inventors: Xin Li, Lei Guo, Eric Zhao
SYSTEM AND METHOD FOR GUIDING ENTITY-BASED SEARCHING

Publication number: 20110225155

Abstract: A system and method are provided for refining a user's query. An entity index, generated from a corpus of text documents, is provided. The entity index includes a set of entity structures, each including a plurality of terms. Each of the terms of an entity structure is a feature of the same entity. Entity structures can be retrieved from the entity index which match at least a portion of the user's query. Clusters of the retrieved entity structures are identified which have at least one of their terms in common. A cluster hierarchy is generated from the identified clusters in which nodes of the hierarchy are defined by one or more of the terms of the retrieved entity structures. At least a portion of the cluster hierarchy is presented to the user for facilitating refinement of the user's query through user selection of a node which, when formulated as a search, retrieves one or more responsive documents from the corpus of documents.

Type: Application

Filed: March 10, 2010

Publication date: September 15, 2011

Applicant: Xerox Corporation

Inventors: Frederic Roulland, Stefania Castellani, Antonietta Grasso, Caroline Brun
MEDIA VALUE ENGINE

Publication number: 20110225174

Abstract: Exemplary embodiments are directed to determining a media value associated with mentions of an entity in one or more documents based on a sentiment attributed to the mentions of the entity and/or a frequency with which the entity is mentioned. Exemplary embodiments can include a media value engine that can identify mentions of an entity in documents, attribute sentiment to the mentions of the entity, determine a polarity of the sentiment, and calculate a media value attributed to the entity based on the sentiment.

Type: Application

Filed: March 14, 2011

Publication date: September 15, 2011

Inventors: Greg Artzt, Mark Fasciano, Steve Skiena, Levon Lloyd
DETECTING DUPLICATES IN A SHARED KNOWLEDGE BASE

Publication number: 20110219013

Abstract: Methods and systems supporting curation of items in a searchable knowledge base are provided. The methods and systems include mining one or more search queries of the searchable knowledge base, where each of the search queries includes a plurality of the items. The method further includes determining one or more pairs of items using a processor, where each of the pairs of items includes a correlation value exceeding a threshold. The correlation values for the pairs of items are based upon the frequency the items of the pairs of items co-occur within the search queries. The method further includes providing the pairs of items to a curator, where the curator reviews the pairs of items.

Type: Application

Filed: March 5, 2010

Publication date: September 8, 2011

Applicant: PALO ALTO RESEARCH CENTER INCORPORATED

Inventor: John T. Maxwell, III
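
As an illustration of the abstract above (publication 20110219013), the sketch below mines search queries for pairs of items that co-occur often and keeps the pairs whose correlation value exceeds a threshold, ready to hand to a curator. Using the Jaccard coefficient as the correlation value is an assumption of this example; the abstract only says the value is based on co-occurrence frequency. The item names are invented.

```python
from collections import Counter
from itertools import combinations


def candidate_duplicates(queries, threshold=0.5):
    """Return (item_a, item_b, correlation) for pairs above `threshold`.

    queries: iterable of item lists, one list per search query.
    Correlation here is the Jaccard coefficient (an assumed choice).
    """
    item_count = Counter()
    pair_count = Counter()
    for query in queries:
        items = sorted(set(query))
        item_count.update(items)
        pair_count.update(combinations(items, 2))

    results = []
    for (a, b), together in pair_count.items():
        jaccard = together / (item_count[a] + item_count[b] - together)
        if jaccard > threshold:
            results.append((a, b, jaccard))
    return sorted(results, key=lambda r: (-r[2], r[0], r[1]))


queries = [
    ["paper jam", "jammed paper", "tray 2"],
    ["paper jam", "jammed paper"],
    ["tray 2", "toner"],
]
for pair in candidate_duplicates(queries):
    print(pair)
```

Only the pair that always appears together clears the threshold here; the curator, not the code, makes the final call on whether the two items are duplicates.
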
SEARCH APPARATUS, SEARCH METHOD, AND RECORDING MEDIUM STORING PROGRAM

Publication number: 20110219000

Abstract: Provided is a search apparatus, a search method, and a program that can improve search speed for a document set even when an object to be searched is a large-scale document set.

Type: Application

Filed: November 6, 2009

Publication date: September 8, 2011

Inventor: Yukitaka Kusumura
KEYWORD AUTOMATION OF VIDEO CONTENT

Publication number: 20110218994

Abstract: A system and associated method for automatically processing keywords for video content. The video content contains image frames and an audio stream. An image pattern table for image patterns from the image frames and a word pattern table for word patterns from the audio stream are generated by use of respective pattern names provided by pattern recognition tools. Each pattern is associated with a respective count indicating a number of appearances of each pattern. A respective weight of each pattern is calculated as a relative frequency of each pattern. The image pattern table and the word pattern table are merged to generate a keyword list. A predefined number of the most frequently appearing patterns are selected by examining the respective weight of each pattern, and metadata associated with the video content are updated to utilize pattern names of the selected patterns as keywords for web searches.

Type: Application

Filed: March 5, 2010

Publication date: September 8, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Christopher E. Holladay, William P. Shaouy
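
The pipeline in the abstract above (publication 20110218994) can be sketched in a few lines: count pattern appearances, turn counts into relative-frequency weights per table, merge the image and word tables, and keep the top few names as keywords. Pattern recognition itself is out of scope here, so the inputs are plain lists of pattern names, and summing the weights of names that appear in both tables is an assumed merge rule.

```python
from collections import Counter


def relative_weights(pattern_names):
    """Map each pattern name to its relative frequency within one table."""
    counts = Counter(pattern_names)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def select_keywords(image_patterns, word_patterns, top_n=3):
    """Merge both weight tables and return the `top_n` heaviest pattern names."""
    merged = Counter()
    merged.update(relative_weights(image_patterns))  # adds weights per name
    merged.update(relative_weights(word_patterns))   # shared names accumulate
    ranked = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical recognizer output for one video.
image_patterns = ["car", "car", "road", "tree"]
word_patterns = ["car", "engine", "engine", "road"]
print(select_keywords(image_patterns, word_patterns))
```

Normalizing each table before merging keeps a long audio track from drowning out the image patterns, or the reverse.
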
Method and system for accessing a file system

Patent number: 8015193

Abstract: A method for accessing a file system including computing a first numerical similarity score for a first stored document and a second numerical similarity score for a second stored document by comparing a plurality of weighted active terms with a plurality of weighted indexed terms, determining a document order of the first stored document followed by the second stored document based on the first numerical similarity score exceeding the second numerical similarity score, generating a list of similar documents including the first stored document followed by the second stored document based on the document order, and displaying, in a file system interface and on the computer display, the list of similar documents while an active document is open in an active document interface.

Type: Grant

Filed: December 14, 2010

Date of Patent: September 6, 2011

Assignee: Oracle America, Inc.

Inventors: Stephen J. Green, Jeffrey L. Alexander, Paul B. Lamere
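
One way to picture the comparison in the abstract above (patent 8015193) is to treat the weighted active terms and each document's weighted indexed terms as sparse vectors, compute a numerical similarity score for each stored document, and list documents in descending score order. Cosine similarity is an assumed choice for this sketch; the abstract does not name the scoring function, and the file names and weights are invented.

```python
import math


def cosine(a, b):
    """Cosine similarity between two {term: weight} mappings."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_documents(active_terms, index):
    """Return stored document names ordered by similarity to the active document."""
    scored = [(cosine(active_terms, terms), name) for name, terms in index.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ranked if score > 0]


active = {"budget": 2.0, "forecast": 1.0}
index = {
    "q3.txt": {"budget": 1.0, "forecast": 1.0},
    "notes.txt": {"budget": 1.0, "lunch": 3.0},
    "recipes.txt": {"lunch": 1.0},
}
print(similar_documents(active, index))
```

A file system interface could refresh this list whenever the active document's term weights change, so the similar files stay visible while the user edits.
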
System and method for thematically grouping documents into clusters

Patent number: 8015188

Abstract: A system and method for thematically grouping documents into clusters is provided. Concepts are extracted from a plurality of documents. The concepts include nouns or noun phrases. A number of occurrences for each concept are determined within each document. A bounded range is applied to the concepts and a subset of the concepts is selected by removing the concepts that fall outside the bounded range. The bounded range includes upper edge conditions and lower edge conditions. Themes are generated from the subset of concepts by identifying two or more concepts with common semantic meaning. Clusters of the documents are generated based on the themes.

Type: Grant

Filed: October 4, 2010

Date of Patent: September 6, 2011

Assignee: FTI Technology LLC

Inventors: Dan Gallivan, Kenji Kawai
System and method for determining a relationship between available content and current interests to identify a need for content

Patent number: 8010529

Abstract: A system and method for comparing search queries provided by a user with content items available in an index. Search queries are received and stored in a database query log. Content items are located on a network and stored in an index. A value is generated for concepts and categories in the query log and the index. The value for different categories and concepts in the query log is compared with the value for different categories and concepts in the index. A need for content is determined for a given concept and category, which may be communicated to content providers, e.g., web site operators.

Type: Grant

Filed: October 23, 2006

Date of Patent: August 30, 2011

Assignee: Yahoo! Inc.

Inventor: Shyam Kapur
Phrase based snippet generation

Patent number: 8010539

Abstract: Disclosed herein is a method, a system and a computer product for generating a snippet for an entity, wherein each snippet comprises a plurality of sentiments about the entity. One or more textual reviews associated with the entity are selected. A plurality of sentiment phrases are identified based on the one or more textual reviews, wherein each sentiment phrase comprises a sentiment about the entity. One or more sentiment phrases from the plurality of sentiment phrases are selected to generate a snippet.

Type: Grant

Filed: January 25, 2008

Date of Patent: August 30, 2011

Assignee: Google Inc.

Inventors: Sasha Blair-Goldensohn, Kerry Hannan, Ryan McDonald, Tyler Neylon, Jeffrey C. Reynar
Organization of Data Within a Database

Publication number: 20110208754

Abstract: A computer implemented method is provided for processing data representing a data entity having sub entities. The method includes analyzing queries to the data entity for deriving information about sets of the sub entities frequently queried together, and grouping the sub entities to a number of banks, each bank having a maximum width, based on the information about sets of sub entities frequently queried together, in order to reduce an average number of banks to be accessed for data retrieval.

Type: Application

Filed: November 15, 2010

Publication date: August 25, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Tianchao Li, Peter Bendel, Oliver Draese, Namik Hrle
Web object retrieval based on a language model

Patent number: 8001130

Abstract: A method and system are provided for determining relevance of an object to a term based on a language model. The relevance system provides records extracted from web pages that relate to the object. To determine the relevance of the object to a term, the relevance system first determines, for each record of the object, a probability of generating that term using a language model of the record of that object. The relevance system then calculates the relevance of the object to the term by combining the probabilities. The relevance system may also weight the probabilities based on the accuracy or reliability of the extracted information for each data source.

Type: Grant

Filed: July 25, 2006

Date of Patent: August 16, 2011

Assignee: Microsoft Corporation

Inventors: Ji-Rong Wen, Shuming Shi, Wei-Ying Ma, Yunxiao Ma, Zaiqing Nie
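
The two steps in the abstract above (patent 8001130) can be sketched as follows: estimate, for each record of an object, the probability that the record's language model generates the term, then combine the per-record probabilities with weights reflecting each source's accuracy. The unigram model with linear smoothing against a fixed collection probability, the accuracy-weighted average, and all numeric values are assumptions of this example.

```python
def term_probability(term, record_text, collection_prob, smoothing=0.5):
    """P(term | record) under a unigram model, smoothed with a collection prior."""
    words = record_text.lower().split()
    maximum_likelihood = words.count(term) / len(words) if words else 0.0
    return (1 - smoothing) * maximum_likelihood + smoothing * collection_prob


def object_relevance(term, records, collection_prob=0.001):
    """Combine per-record probabilities, weighted by source accuracy.

    records: list of (record_text, source_accuracy) pairs for one object.
    """
    total_accuracy = sum(accuracy for _, accuracy in records)
    if total_accuracy == 0:
        return 0.0
    weighted = sum(
        accuracy * term_probability(term, text, collection_prob)
        for text, accuracy in records
    )
    return weighted / total_accuracy


# Two records extracted for the same (hypothetical) product object.
records = [("canon eos camera", 0.9), ("camera bag", 0.1)]
print(object_relevance("camera", records))
print(object_relevance("tripod", records))
```

Weighting by source accuracy means a term found only in a record from an unreliable extractor contributes less to the object's relevance than the same term found in a trusted record.
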
Method and system for ranking of keywords for profitability

Patent number: 8001131

Abstract: A physical computing device receives information regarding a total number of people who are searching on the search term. Information is received regarding an amount advertisers pay for the search term. Information is received regarding a click through rate of the search term. A traffic estimate of the search term is determined. Longevity of the search term is determined.

Type: Grant

Filed: December 17, 2008

Date of Patent: August 16, 2011

Assignee: Demand Media Inc.

Inventor: Byron William Reese
Word pluralization handling in query for web search

Patent number: 7996410

Abstract: Techniques for determining when and how to transform a word in a query to its plural or non-plural form in order to provide the most relevant search results while minimizing computational overhead are provided. A dictionary is generated based upon the words used in a specified number of previous most frequent search queries and comprises lists of transformations from plural to singular and singular to plural. Unnecessary transformations are removed from the dictionary based upon language modeling. The word to transform is determined by finding the last non-stop re-writable word of the query. The context of the transformed word is confirmed in the search documents and a version of the query is executed using both the original form of the word and the transformation of the word.

Type: Grant

Filed: February 1, 2007

Date of Patent: August 9, 2011

Assignee: Yahoo! Inc.

Inventors: Fuchun Peng, Nawaaz Ahmed, Xin Li, Yumao Lu
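
The word-selection step in the abstract of patent 7996410 (finding the last non-stop re-writable word of the query) can be sketched roughly as below. The stop-word list, the tiny transformation dictionary, and the function names are hypothetical, and the sketch reads "re-writable" as "present in the transformation dictionary". The language-model pruning and context confirmation steps are not modeled.

```python
# Hypothetical stop words and plural/singular dictionary, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "in", "for", "to", "and"}
TRANSFORMS = {
    "hotel": "hotels",
    "hotels": "hotel",
    "flight": "flights",
    "flights": "flight",
}


def last_rewritable_index(tokens):
    """Scan from the end of the query for a non-stop word that has a transformation."""
    for i in range(len(tokens) - 1, -1, -1):
        word = tokens[i]
        if word not in STOP_WORDS and word in TRANSFORMS:
            return i
    return None


def query_variants(query):
    """Return the original query plus, if possible, one rewritten version."""
    tokens = query.lower().split()
    idx = last_rewritable_index(tokens)
    if idx is None:
        return [" ".join(tokens)]
    rewritten = tokens[:idx] + [TRANSFORMS[tokens[idx]]] + tokens[idx + 1:]
    return [" ".join(tokens), " ".join(rewritten)]


print(query_variants("cheap hotel in paris"))
```

A search front end built this way would run both variants, which matches the abstract's description of executing the query with the original and the transformed word.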
METHOD FOR MONITORING ABNORMAL STATE OF INTERNET INFORMATION

Publication number: 20110191355

Abstract: A method for monitoring abnormal state of Internet information by monitoring the change of hot words frequency in the Internet information. The method includes the following steps: 1) obtaining the current date word frequency data of common words appearing in the current date web pages; 2) combining with the hot words dictionary that the user focuses on to determine the current date keywords set of the Internet information; 3) determining the weight of each current date keyword; 4) determining the abnormal threshold of the current date keywords; 5) detecting the abnormal level of the current date keywords to determine the current date hot Internet information. The present invention calculates the abnormal level of keywords by monitoring the change of hot words frequency in the Internet information, predicts and gives alarm for the abnormal level of hot words frequency change, which makes the Internet information user react at the first moment.

Type: Application

Filed: April 24, 2008

Publication date: August 4, 2011

Applicant: PEKING UNIVERSITY

Inventors: Xun Liang, Hua Chen, Jian Yang
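
The abstract of publication 20110191355 lists the monitoring steps but gives no formulas. As a rough sketch of steps 4 and 5 only, the code below flags a watched keyword when its count for the current date exceeds a threshold derived from its own history. The mean-plus-`k`-standard-deviations threshold, the parameter `k`, and the function name `abnormal_keywords` are assumptions. The keyword weighting of step 3 is left out.

```python
from statistics import mean, pstdev


def abnormal_keywords(history, today, hot_words, k=2.0):
    """Flag hot words whose count today exceeds an assumed abnormal threshold.

    history: {word: [past daily counts]}
    today:   {word: count for the current date}
    Returns {word: amount by which the count exceeds the threshold}.
    """
    flagged = {}
    for word in hot_words:
        past = history.get(word, [])
        if len(past) < 2:
            continue  # not enough history to estimate a threshold
        threshold = mean(past) + k * pstdev(past)
        count = today.get(word, 0)
        if count > threshold:
            flagged[word] = count - threshold
    return flagged


history = {"earthquake": [2, 3, 2, 3], "weather": [10, 12, 11, 9]}
today = {"earthquake": 40, "weather": 11}
print(abnormal_keywords(history, today, hot_words=["earthquake", "weather"]))
```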
Global checkpoint SCN

Patent number: 7991775

Abstract: Described herein are techniques for generating a global checkpoint system change number and computing a snapshot query using the global checkpoint system change number without a need to acquire global locks. In many cases, the need to acquire global locks is eliminated, thereby saving the overhead attendant to processing global locks.

Type: Grant

Filed: October 2, 2008

Date of Patent: August 2, 2011

Assignee: Oracle International Corporation

Inventors: Neil MacNaughton, Tirthankar Lahiri, Varun Malhotra
Methods and apparatuses for searching content

Patent number: 7987169

Abstract: Embodiments of methods and apparatuses for searching contents, including structured search are described herein. Embodiments of the present invention use tree structures (or more generally, graph structures), layout structures, and/or content category information to capture within search results relevant content that would otherwise be missed, to reduce the incidence of false positives within search results, and to improve the accuracy of rankings within search results. Embodiments of the present invention further use tree structures (or more generally, graph structures), layout structures, and/or content category information to extend search results to include sub-document constituents. Embodiments of the present invention also support the use of distribution properties as criteria for ranking search results.

Type: Grant

Filed: June 12, 2007

Date of Patent: July 26, 2011

Assignee: Zalag Corporation

Inventor: Samuel S. Epstein
System and method for generating a relationship network

Patent number: 7987191

Abstract: A computer-implemented system and process for generating a relationship network is disclosed. The system provides a set of data items to be related and generates variable length data vectors to represent the relationships between the terms within each data item. The system can be used to generate a relationship network for documents, images, or any other type of file. This relationship network can then be queried to discover the relationships between terms within the set of data items.

Type: Grant

Filed: November 27, 2007

Date of Patent: July 26, 2011

Assignee: The Regents of The University of California

Inventors: Kasian Franks, Cornelia A. Myers, Raf M. Podowski
Scalable model-based product matching

Patent number: 7979459

Abstract: Aspects of the subject matter described herein relate to matching product information to products. In aspects, a product matching component receives product information. The product matching component normalizes the product information and obtains keywords from the product information. By querying a database of recognized products, the keywords are used to obtain a list of products that potentially match the product information. A confidence level is assigned to each of the potential matches in the list. A match may be returned for the highest matched product or for a selectable number of products whose confidence level(s) exceed a selectable threshold.

Type: Grant

Filed: June 15, 2007

Date of Patent: July 12, 2011

Assignee: Microsoft Corporation

Inventors: Kai Wu, Daniel Takacs, Tong Yao, Jiyu Zhang, Hua Yang, Ji-Rong Wen, Jonathan R M Hart, Eric Anthony Reel
Automatic generation of embedded signatures for duplicate detection on a public network

Patent number: 7979413

Abstract: In accordance with an aspect of the invention, a method and system are disclosed for constructing an embedded signature in order to facilitate post-facto detection of leakage of sensitive data. The leakage detection mechanism involves: 1) identifying at least one set of words in an electronic document containing sensitive data, the set of words having a low frequency of occurrence in a first collection of electronic documents; and, 2) transmitting a query to search a second collection of electronic documents for any electronic document that contains the set of words having a low frequency of occurrence.

Type: Grant

Filed: May 30, 2008

Date of Patent: July 12, 2011

Assignees: AT&T Intellectual Property I, L.P., New York University

Inventors: Balachander Krishnamurthy, Saurabh Kumar, Lakshminarayanan Subramanian
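
The two steps in the abstract of patent 7979413 (pick a set of words that are rare in a first collection, then search a second collection for documents containing that set) might be sketched as follows. Document frequency as the rarity measure, the signature size, whitespace tokenization, and the function names are assumptions for illustration.

```python
from collections import Counter


def document_frequencies(collection):
    """Count, for each word, how many documents in the collection contain it."""
    df = Counter()
    for doc in collection:
        df.update(set(doc.lower().split()))
    return df


def rare_word_signature(document, collection, size=3):
    """Pick the `size` words of the document that are rarest in the collection."""
    df = document_frequencies(collection)
    words = sorted(set(document.lower().split()))  # alphabetical order breaks ties
    words.sort(key=lambda w: df[w])  # stable sort keeps the tie order
    return words[:size]


def leaked_documents(signature, second_collection):
    """Return documents in the second collection that contain every signature word."""
    sig = set(signature)
    return [doc for doc in second_collection if sig <= set(doc.lower().split())]


first = ["the quarterly report is ready", "the report is late", "the team is ready"]
sensitive = "the zephyr quarterly report mentions kumquat"
signature = rare_word_signature(sensitive, first)
print(signature)
print(leaked_documents(signature, ["someone posted zephyr kumquat mentions online", "nothing here"]))
```

In practice the second step would be a query against a search engine rather than a scan of an in-memory list.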
Method, system, and storage medium for providing web-based electronic research and presentation functions via a document creation application

Patent number: 7970774

Abstract: An exemplary embodiment of the invention relates to a method, system, and storage medium for providing web-based electronic research and presentation functions via a document creation application. The method includes scanning an active document on a computer to identify relevant keywords, searching a database for reference materials relating to the relevant keywords, and displaying relevant reference materials on the computer. The method further includes deploying process software for providing the web-based electronic research and presentation functions via a document creation application. The deployment includes installing the process software on a server, identifying server addresses for users accessing the process software on the server, sending the process software to the server and copying the process software to a file system of the server. The deployment also includes sending the process software to a client computer and executing the process software on the client computer.

Type: Grant

Filed: April 18, 2008

Date of Patent: June 28, 2011

Assignee: International Business Machines Corporation

Inventors: Edward E. Kelley, Tijs Y. Wilbrink, Ellis Zijlstra
Method, system, and storage medium for providing web-based electronic research and presentation functions via a document creation application

Patent number: 7970775

Abstract: An exemplary embodiment of the invention relates to a method and storage medium for providing web-based electronic research and presentation functions via a document creation application. The method includes scanning an active document on a computer to identify relevant keywords, searching a database for reference materials relating to the relevant keywords, and displaying relevant reference materials on the computer. The method further includes on-demand sharing of process software for providing the electronic research and presentation functions via the document creation application. The on-demand sharing includes creating a transaction containing unique customer identification, requested service type, and service parameters; sending the transaction to a server; querying the server about processing capacity associated with the server to help ensure availability of adequate resources for processing the transaction; and allocating additional processing capacity when needed to process the transaction.

Type: Grant

Filed: April 18, 2008

Date of Patent: June 28, 2011

Assignee: International Business Machines Corporation

Inventors: Edward E. Kelley, Tijs Y. Wilbrink, Ellis Zijlstra
SYSTEM AND METHOD FOR PROVIDING VECTOR TERMS RELATED TO INSTANT MESSAGING CONVERSATIONS

Publication number: 20110153585

Abstract: The method according to one embodiment of the present invention comprises retrieving one or more terms or phrases comprising an instant messaging conversation in which one or more users are participating. One or more term vectors comprising one or more vector terms associated with the one or more retrieved terms or phrases comprising the instant messaging conversation are generated and one or more vector terms are selected from said term vectors. The one or more selected vector terms are displayed to the one or more users participating in the instant messaging conversation. An indication of a user selection of a given displayed vector term is received and one or more content items responsive to the selected vector term are identified.

Type: Application

Filed: February 24, 2011

Publication date: June 23, 2011

Applicant: YAHOO! INC.

Inventor: Shiv Ramamurthi
Using a weighted tree to determine document relevance

Patent number: 7962480

Abstract: The relevance of documents is automatically determined based upon a weighted tree. Terms considered to be relevant are assigned to the leaf nodes of a tree data structure. A location can also be specified in a leaf node, indicating where in a document the term must appear to be considered relevant. Internal nodes of the tree are assigned operators (e.g., add, maximum or minimum). The connections between nodes are assigned weights. A relevance value for a given document is calculated as a function of occurrence in the document of terms assigned to leaves, operators assigned to internal nodes, and weights assigned to the associated node connections. Weighted trees can be used to process search queries. Documents with high relevance scores calculated against the tree can be returned to a user as the results to a query.

Type: Grant

Filed: July 31, 2007

Date of Patent: June 14, 2011

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Li Zhang, Yuhong Xiong, Shicong Feng, Yong Zhao
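
The weighted-tree calculation described in the abstract of patent 7962480 can be illustrated with a short sketch: leaves hold terms, internal nodes hold an operator (add, maximum or minimum), and each connection carries a weight. The class names, the use of raw term counts as leaf scores, and the sample tree are assumptions. The optional per-leaf location constraint mentioned in the abstract is omitted.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Leaf:
    term: str  # term whose occurrences contribute to relevance


@dataclass
class Node:
    op: str  # one of "add", "max", "min"
    # Each child is paired with the weight on its connection to this node.
    children: List[Tuple[float, Union["Node", Leaf]]] = field(default_factory=list)


OPS = {"add": sum, "max": max, "min": min}


def relevance(tree, tokens):
    """Score a tokenized document against the weighted tree, bottom-up."""
    if isinstance(tree, Leaf):
        return float(tokens.count(tree.term))  # assumed leaf score: raw term count
    scores = [weight * relevance(child, tokens) for weight, child in tree.children]
    return OPS[tree.op](scores) if scores else 0.0


tree = Node("add", [
    (2.0, Leaf("python")),
    (1.0, Node("max", [(1.0, Leaf("search")), (0.5, Leaf("index"))])),
])
tokens = "python search index index index python".split()
print(relevance(tree, tokens))  # 2.0*2 + max(1.0*1, 0.5*3) = 5.5
```

Ranking documents by this score and returning the highest-scoring ones corresponds to the query-processing use the abstract mentions.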
SYSTEMS AND METHODS FOR DETECTING SENTIMENT-BASED TOPICS

Publication number: 20110137906

Abstract: A method for analyzing sentiment comprising: collecting objects from an external content repository, the collected objects forming a content database; extracting a snippet related to the subject from the content database; calculating a sentiment score for the snippet; classifying the snippet into a sentiment category; creating a sentiment taxonomy using the sentiment categories, the sentiment taxonomy classifying the snippets as positive, negative or neutral; identifying topic words within the sentiment taxonomy; classifying the topic words as sentiment topic word candidates or non-sentiment topic word candidates; filtering the non-sentiment topic word candidates; identifying the frequency of the non-sentiment topic words in each of the sentiment categories; identifying the importance of the non-sentiment topic word for each of the sentiment categories; and ranking the topic word, wherein the rank is calculated by combining the frequency of the topic words in each of the categories with its importance.

Type: Application

Filed: December 9, 2009

Publication date: June 9, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES, INC.

Inventors: Keke Cai, Ying Chen, W. Scott Spangler, Li Zhang
Clustering aggregator for RSS feeds

Patent number: 7958125

Abstract: A method for merging really simple syndication (RSS) feeds. Stories containing one or more terms may be merged into one or more clusters based on one or more links between the stories. A cluster frequency with which the terms occur in each cluster may be determined. A diameter for each cluster may be determined. A cluster that is most similar to one of the clusters may be determined based on the cluster frequency. The most similar cluster with the one of the clusters may be determined based on each diameter, and each cluster frequency.

Type: Grant

Filed: June 26, 2008

Date of Patent: June 7, 2011

Assignee: Microsoft Corporation

Inventors: Jun Yan, Ning Liu, Lei Ji, Zheng Chen, Jian Wang
Enhanced project predictor

Patent number: 7949663

Abstract: A computer implemented system for project prediction is provided. The system includes a data manager to obtain historical project data. The system also includes an analyzer to analyze the historical project data and an analysis cycle time to generate models for a proposed project cycle time. Additionally, the system includes a user interface to select one model for the proposed project cycle time, wherein the selected model includes linear sub-models corresponding to a historical data range, and apply proposed project data and analysis cycle time to one linear sub-model corresponding to a proposed data range to predict the proposed project cycle time. Furthermore, the system captures proposed project data and obtains additional project data to update the selected model. The models provide for the accurate prediction of cycle times, or project costs, in an enterprise development environment.

Type: Grant

Filed: February 8, 2007

Date of Patent: May 24, 2011

Assignee: Sprint Communications Company L.P.

Inventors: Deandra T. Cassone, Joseph E. Dudley, George R. Kather, Paul R. Sapenaro, Jason N. Ward
Method and apparatus for rating user generated content in search results

Patent number: 7949643

Abstract: Generally, a method and apparatus provides for rating user generated content (UGC) with respect to search engine results. The method and apparatus includes recognizing a UGC data field collected from a web document located at a web location. The method and apparatus calculates: a document goodness factor for the web document; an author rank for an author of the UGC data field; and a location rank for the web location. The method and apparatus thereby generates a rating factor for the UGC field based on the document goodness factor, the author rank and the location rank. The method and apparatus also outputs a search result that includes the UGC data field positioned in the search results based on the rating factor.

Type: Grant

Filed: April 29, 2008

Date of Patent: May 24, 2011

Assignee: Yahoo! Inc.

Inventors: Jaya Kawale, Aditya Pal
