Based On Term Frequency Of Appearance Patents (Class 707/750)
  • Patent number: 8086602
    Abstract: A user-interface method of selecting and presenting a collection of content items based on user navigation and selection actions associated with the content is provided. The method includes associating a relevance weight on a per user basis with content items to indicate a relative measure of likelihood that the user desires the content item. The method includes receiving a user's navigation and selection actions for identifying desired content items, and in response, adjusting the associated relevance weight of the selected content item and group of content items containing the selected item. The method includes, in response to subsequent user input, selecting and presenting a subset of content items and content groups to the user ordered by the adjusted associated relevance weights assigned to the content items and content groups.
    Type: Grant
    Filed: February 24, 2011
    Date of Patent: December 27, 2011
    Assignee: Veveo Inc.
    Inventors: Murali Aravamudan, Kajamalai G. Ramakrishnan, Rakesh Barve, Sashikumar Venkataraman, Ajit Rajasekharan
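    The per-user weighting described in this abstract can be illustrated with a minimal sketch. The class name, the boost sizes, and the rule of ranking an item by the sum of its own weight and its group's weight are assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict


class UserRelevance:
    """Per-user relevance weights for content items and their groups (hypothetical)."""

    def __init__(self, item_groups):
        # item_groups maps each content item to the group that contains it.
        self.item_groups = dict(item_groups)
        self.item_weight = defaultdict(float)
        self.group_weight = defaultdict(float)

    def record_selection(self, item, item_boost=1.0, group_boost=0.5):
        # Assumed boost sizes: raise the selected item and its containing group.
        self.item_weight[item] += item_boost
        self.group_weight[self.item_groups[item]] += group_boost

    def ranked_items(self, limit):
        # Assumed scoring rule: item weight plus the weight of its group.
        def score(item):
            return self.item_weight[item] + self.group_weight[self.item_groups[item]]

        # Sort by descending score, breaking ties alphabetically.
        return sorted(self.item_groups, key=lambda i: (-score(i), i))[:limit]


user = UserRelevance({"Movie A": "movies", "Movie B": "movies", "News C": "news"})
user.record_selection("Movie B")
user.record_selection("Movie B")
print(user.ranked_items(2))  # → ['Movie B', 'Movie A']
```

    Selecting "Movie B" also lifts "Movie A" above "News C" because both movies share a group, which mirrors the group-level adjustment the abstract mentions.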
  • Patent number: 8086504
    Abstract: Tag suggestions enable a hosting entity such as a website to determine one or more tags to suggest to a user for association with a particular item within an electronic catalog. After this determination, the hosting entity may suggest the determined tags to the user. To determine these tags, the hosting entity may employ techniques to determine items related to the particular item. The hosting entity then suggests some or all of the tags associated with the related items. Additionally or alternatively, the hosting entity may determine certain metadata associated with the particular item. The entity then may suggest this metadata, or some related phrase or tag, to the user for association with the particular item. However the tag suggestions are determined, the hosting entity may rank the tag suggestions to determine which tags to present to the user or to determine an order in which to present the tags.
    Type: Grant
    Filed: September 6, 2007
    Date of Patent: December 27, 2011
    Assignee: Amazon Technologies, Inc.
    Inventors: Russell A. Dicker, Waqas Ahmed, Aaron D. Wilson, Scott Allen Mongrain, Florin V. Manolache, Valentin Radu Munteanu, Val Dan Dar Ion I. Rosca, Corneliu Gabriel Alexandru Rudeanu
  • Publication number: 20110307499
    Abstract: Methods and systems are disclosed that analyze patent-related documents having at least one property type. In one implementation, a method involves displaying, in a first graphical element, identifiers of the patent-related documents. The method also involves analyzing the patent-related documents to determine at least one property value for the property type. The property value includes a string of one or more words describing subject matter associated with the patent-related documents and occurring in a subset of the patent-related documents. The method also displays a second graphical element associated with the property type. The second graphical element includes the property value. The method receives, at the second graphical element, a user selection of the property value. The method displays, in the first graphical element, identifiers of the subset of the patent-related documents in which the property value occurs.
    Type: Application
    Filed: June 11, 2010
    Publication date: December 15, 2011
    Inventors: Brian K. Elias, Matthew C. Morrise
  • Patent number: 8078629
    Abstract: An information retrieval system uses phrases to index, retrieve, organize and describe documents. Phrases are identified that predict the presence of other phrases in documents. Documents are then indexed according to their included phrases. A spam document is identified based on the number of related phrases included in a document.
    Type: Grant
    Filed: October 13, 2009
    Date of Patent: December 13, 2011
    Assignee: Google Inc.
    Inventor: Anna Lynn Patterson
  • Patent number: 8078633
    Abstract: Methods and systems for improving text segmentation are disclosed. In one embodiment, at least a first segmented result and a second segmented result are determined from a string of characters, a first frequency of occurrence for the first segmented result and a second frequency of occurrence for the second segmented result are determined, and an operable segmented result is identified from the first segmented result and the second segmented result based at least in part on the first frequency of occurrence and the second frequency of occurrence.
    Type: Grant
    Filed: March 15, 2010
    Date of Patent: December 13, 2011
    Assignee: Google Inc.
    Inventors: Gilad Israel Elbaz, Jacob L. Mandelson
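    As a rough illustration of frequency-based segmentation, the sketch below enumerates candidate splits of a character string and keeps the one whose phrase has the highest frequency of occurrence. The vocabulary, the counts, and both function names are hypothetical; the abstract does not say how candidates are generated or where frequencies come from.

```python
def candidate_segmentations(text, vocabulary):
    """Return every way to split text into words drawn from vocabulary."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in vocabulary:
            for tail in candidate_segmentations(text[end:], vocabulary):
                results.append([head] + tail)
    return results


def pick_segmentation(text, phrase_counts, vocabulary):
    """Choose the candidate whose space-joined phrase occurs most often."""
    candidates = candidate_segmentations(text, vocabulary)
    if not candidates:
        return None
    return max(candidates, key=lambda words: phrase_counts.get(" ".join(words), 0))


# Made-up vocabulary and occurrence counts, for illustration only.
vocabulary = {"no", "now", "here", "where", "nowhere"}
phrase_counts = {"no where": 3, "now here": 40, "nowhere": 25}
print(pick_segmentation("nowhere", phrase_counts, vocabulary))  # → ['now', 'here']
```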
  • Publication number: 20110302176
    Abstract: Disclosed are a document ranking system and method based on contribution scoring. The document ranking system includes: a content score calculating unit for calculating content scores for documents with respect to at least one word contained in the documents, with regard to each such word; a contribution score calculating unit for calculating contribution scores for the documents with respect to jointly occurring words; and a ranking unit for ranking the documents with respect to the at least one word, with regard to each such word, by using the content scores and the contribution scores.
    Type: Application
    Filed: December 15, 2009
    Publication date: December 8, 2011
    Applicant: NHN CORPORATION
    Inventors: Dong Jin Kim, Sang-Wook Kim
  • Patent number: 8073835
    Abstract: Standard word lists that are often used for such operations as predictive text, spell checking, and word completion are based on general linguistic data that might not accurately reflect actual text usage patterns of particular users. Systems and methods of building and using a custom word list for use in text operations on an electronic device are provided. A collection of text items associated with a user of the electronic device is scanned to identify words in the text items. A weighting is then assigned to each identified word, and the words and corresponding weightings are stored.
    Type: Grant
    Filed: January 4, 2010
    Date of Patent: December 6, 2011
    Assignee: Research In Motion Limited
    Inventors: Robert J. Lowles, Jason T. Griffin, Michael S. Brown
  • Patent number: 8065365
    Abstract: Techniques for grouping events in a computing system are provided. A registrant sends, to a database server, a request to register to receive a single notification based on the occurrence of multiple events that satisfy certain criteria, referred to as grouping attributes. Such registrations are referred to as grouping registrations. An eventing mechanism in the database server receives and maintains grouping registrations. When an event is received, the eventing mechanism determines whether the event has been registered for in an active grouping registration, i.e., one whose start time has passed but whose completion criteria are not yet satisfied. If so, then the eventing mechanism updates grouping data associated with the grouping registration. When the completion criteria of a grouping registration are satisfied, the eventing mechanism sends a notification to the registrant and/or other intended recipient(s).
    Type: Grant
    Filed: May 2, 2007
    Date of Patent: November 22, 2011
    Assignee: Oracle International Corporation
    Inventors: Abhishek Saxena, Neerja Bhatt
  • Patent number: 8065307
    Abstract: The present invention may be used to analyze subject content, search and analyze reference content, compare the subject and reference content for similarity, and output comparison reports between the subject and reference content. The present invention may incorporate and utilize text from intrinsic and/or extrinsic subject documents. The analysis may employ a variety of metrics, including scores generated from a natural language processing system, scores based on classification similarity, scores based on proximity similarity, and in the case of analysis of patent documents, scores based on measurement of claims.
    Type: Grant
    Filed: December 20, 2006
    Date of Patent: November 22, 2011
    Assignee: Microsoft Corporation
    Inventors: Brian Dean Haslam, Patrick Wayne John Evans, Arul Menezes, Patrick Santos
  • Patent number: 8065289
    Abstract: According to an aspect of an embodiment, a method comprises editing information related to a part according to a user operation, extracting characteristic information representing a characteristic of the part from information of an object to be edited when an operation to select the part is performed, searching a database for information similar to the characteristic information, searching the database for knowledge information related to the characteristic information, and displaying the knowledge information on a display unit.
    Type: Grant
    Filed: March 4, 2008
    Date of Patent: November 22, 2011
    Assignee: Fujitsu Limited
    Inventors: Yukihiko Furumoto, Osamu Takizawa
  • Publication number: 20110276576
    Abstract: A method of compressing short text messages, comprising: generating an index code comprising an association of keywords in the text messages with indices, the index code is logically divided into segments of variable size, each segment comprising at least one bucket, being a constant range of indices; adjusting the index code according to a natural keyword frequency distribution and to statistical analysis of the text messages; associating short indices with frequent keywords in the text messages; converting the text messages into compressed text messages in which at least some of the keywords are replaced by the associated indices; and updating the association between the indices and the keywords, updating the segments, and updating the updating frequency in respect to a usage keyword frequency distribution and temporal changes thereof.
    Type: Application
    Filed: May 5, 2010
    Publication date: November 10, 2011
    Inventor: Mimran David
  • Publication number: 20110276577
    Abstract: A system and method for generating feature vectors of documents in different languages are provided. The feature vectors provide scores associated with keywords defined in a base language for use by a profiler for generating or updating a user profile. The system and method use a plurality of keyword sets comprising: a base language keyword set comprising a plurality of base language keywords each associated with a respective identifier (ID); and a second language keyword set comprising a plurality of second language keywords each corresponding in meaning to a respective one of the base language keywords and associated with the ID of the corresponding base language keyword. One of a plurality of tokenizers is selected to parse a document based on the language of the document and to generate the feature vector using the keyword set of the corresponding language.
    Type: Application
    Filed: July 23, 2010
    Publication date: November 10, 2011
    Applicant: KINDSIGHT, INC.
    Inventors: Hong Yao, Wu Wang, Mei Marker, Kelvin Edmison, Wei Wang
  • Patent number: 8055663
    Abstract: Systems and methods for measuring behavior characteristics. For at least one specific user, a first concern score for respective key terms is calculated according to use frequency of respective key terms of network content corresponding to the specific user and all users. A first relation matrix for at least one specific key term is calculated according to at least two users corresponding to respective interaction behaviors between the key terms and a type weighting corresponding to respective interaction behaviors. A first interaction score for the specific user regarding the specific key term is calculated according to the first relation matrix. A first characteristic score for the specific user regarding the specific key term is calculated according to the first concern score and the first interaction score.
    Type: Grant
    Filed: December 20, 2006
    Date of Patent: November 8, 2011
    Assignee: Institute for Information Industry
    Inventors: Tse-Ming Tsai, Chia-Chun Shih
  • Publication number: 20110264673
    Abstract: Search and browse trails are temporally-ordered sequences of web pages visited by a user during post-search query navigation beginning with a page associated with one of the search results. The trails can provide useful information for a number of search-related purposes. For example, these trails can be used to leverage the post-query behavior of other users to help the current user search more effectively and allow them to make more informed search interaction decisions. The trails can also be used to establish search results and refine search result rankings, select and evaluate deeplinks, and recommend multi-step trails as an alternative to or enhancement for existing search result presentation techniques.
    Type: Application
    Filed: April 27, 2010
    Publication date: October 27, 2011
    Applicant: Microsoft Corporation
    Inventors: Ryen W. White, Peter Bailey, Nikhil Dandekar, Adish Singla, Jeff Huang
  • Patent number: 8046372
    Abstract: A computer system and method for determining whether the subject matter described in a received document is substantially similar to the subject matter of other documents in a document corpus, such that the received document can be considered a duplicate document. After receiving a first document, a set of tokens for the first document is generated. A non-fielded relevance search on a token index is executed. The relevance search returns a set of candidate duplicate documents with scores corresponding to each candidate document. For each candidate document with a score above a threshold, filtering is performed on each candidate document to determine whether each candidate document is a true duplicate of the first document. A set of candidate documents with a score above the threshold that were not disqualified as candidate documents is then provided.
    Type: Grant
    Filed: May 25, 2007
    Date of Patent: October 25, 2011
    Assignee: Amazon Technologies, Inc.
    Inventors: Srikanth Thirumalai, Aswath Manoharan, Mark J. Tomko, Grant M. Emery, Vijai Mohan, Egidio Terra
  • Publication number: 20110258196
    Abstract: A method of content recommendation, includes: generating a first digital mathematical representation of contents to associate the contents with a first plurality of words describing the contents; generating a second digital mathematical representation of text documents different from the contents to associate the documents with a second plurality of words; processing the first and second pluralities of words to determine a common plurality of words; processing the first and second digital mathematical representations to generate a common digital mathematical representation of the contents and the text documents based on the common plurality of words; and providing content recommendation by processing the common digital mathematical representation.
    Type: Application
    Filed: December 30, 2008
    Publication date: October 20, 2011
    Inventors: Skjalg Lepsoy, Gianluca Francini, Fabrizio Antonelli
  • Patent number: 8041700
    Abstract: A method and apparatus for textual searching of a database is provided herein. During operation a user will input a letter into a search engine. The search engine will score words based on the letter and display results of the highest-scored words. Another letter will again be received and the process repeated. In situations where titles are returned to the user, additional steps of associating the words with a title and scoring the title take place. The highest-scored titles are provided to the user as the displayed results.
    Type: Grant
    Filed: April 7, 2009
    Date of Patent: October 18, 2011
    Assignee: Motorola Mobility, Inc.
    Inventor: Changxue Ma
  • Publication number: 20110252045
    Abstract: Disclosed is a method and system for retrieving data; extracting information from the data; learning to disambiguate the extracted information such that a particular sense of each phrase within the extracted information is determined; generating a disambiguation classifier from the learning to disambiguate step, the disambiguation classifier configured to determine a sense of a phrase within a document; learning to select a portion of the information as being relevant to a theme of the data; generating a selection classifier from the learning to select step, the selection classifier configured to select a topic in a document that is relevant to a theme of the document; and using the disambiguation classifier and the selection classifier by an indexing computer to determine a set of topics from a web document retrieved by the indexing computer.
    Type: Application
    Filed: April 7, 2010
    Publication date: October 13, 2011
    Applicant: Yahoo! Inc.
    Inventors: Priyank Shanker Garg, Rohan Monga, Hemanth Sambrani, Sudharsan Vasudevan
  • Patent number: 8037048
    Abstract: According to the web site search and selection method, in response to a search query a relevance score is assigned to each page of the web sites addressed by the search engine. Then, for each web site addressed by the search engine, the relevance scores of the individual pages are added together, after weighting them by a correction factor indicative at least of the number of pages of the site itself. In this manner, in response to the search query an overall relevance value for the sites addressed by the search engine is obtained.
    Type: Grant
    Filed: February 11, 2008
    Date of Patent: October 11, 2011
    Assignee: Web Lion S.A.S. di Panarese Marco & Co.
    Inventor: Marco Panarese
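    The site-level aggregation in this abstract can be sketched as follows. The abstract only says the correction factor reflects at least the number of pages of the site; the 1/sqrt(n) form used here is an assumption chosen for illustration, and the function name is hypothetical.

```python
import math


def site_relevance(page_scores, correction=lambda n: 1.0 / math.sqrt(n)):
    """Sum per-page relevance scores after weighting by a page-count factor.

    The default correction (1 / sqrt(number of pages)) is an assumed example;
    the abstract does not define the actual factor.
    """
    if not page_scores:
        return 0.0
    factor = correction(len(page_scores))
    return sum(factor * score for score in page_scores)


# Two sites for the same query: a small focused site and a large diffuse one.
focused = site_relevance([0.5, 0.5, 0.5, 0.5])
diffuse = site_relevance([0.1] * 16)
print(focused, diffuse)
```

    Without a correction factor, a site could outrank a more relevant one simply by having many weakly matching pages; the factor dampens that effect.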
  • Publication number: 20110246486
    Abstract: Methods and systems for extracting domain phrases are provided. First, a domain phrase database including a plurality of domain phrases is provided. For a candidate phrase, it is determined whether the candidate phrase is a domain phrase according to an occurrence condition of at least one part of the candidate phrase in the domain phrases of the domain phrase database and the occurrence condition of the at least one part of the candidate phrase at different relative positions in respective domain phrases.
    Type: Application
    Filed: October 7, 2010
    Publication date: October 6, 2011
    Applicant: INSTITUTE FOR INFORMATION INDUSTRY
    Inventors: Ting-Chun Peng, Chia-Chun Shih, Wen-Tai Hsieh
  • Patent number: 8032537
    Abstract: A method is presented for generating a list of frequently used words for an email application on a server computer. When a request is received for a word frequency list for emails stored in a user's mailbox, a word frequency list is returned if one exists. If the word frequency list does not exist, an asynchronous process is started on the server computer to generate a word frequency list. If the word frequency list exists but it is older than an aging limit, an asynchronous process is started on the server computer to regenerate the word frequency list. The word frequency list is stored in the user's mailbox along with a timestamp indicating the date and time that the list was created or updated.
    Type: Grant
    Filed: December 10, 2008
    Date of Patent: October 4, 2011
    Assignee: Microsoft Corporation
    Inventors: Ashish Consul, Suryanarayana M. Gorti, Michael Geoffrey Andrew Wilson, James C. Kleewein
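    The request handling in this abstract (return the list if it exists, regenerate if missing or older than an aging limit, store it with a timestamp) can be sketched as below. The mailbox is modeled as a plain dictionary, the 24-hour aging limit and the tokenizing regular expression are assumptions, and the asynchronous process is reduced to a flag telling the caller to start one.

```python
import re
import time
from collections import Counter

AGING_LIMIT_SECONDS = 24 * 3600  # assumed value; the abstract gives no number


def build_word_frequency_list(emails, top_n=100):
    """Count words across email bodies and keep the most frequent ones."""
    counts = Counter()
    for body in emails:
        counts.update(re.findall(r"[a-z']+", body.lower()))
    return counts.most_common(top_n)


def handle_request(mailbox, now=None):
    """Return (stored_list_or_None, regenerate_needed) for a list request."""
    now = time.time() if now is None else now
    cached = mailbox.get("word_frequency")
    if cached is None:
        return None, True  # nothing stored yet: caller starts generation
    stale = now - cached["timestamp"] > AGING_LIMIT_SECONDS
    return cached["words"], stale  # a stale list is still returned


def store_word_frequency_list(mailbox, now=None):
    """Body of the (assumed) background job: store the list with a timestamp."""
    now = time.time() if now is None else now
    mailbox["word_frequency"] = {
        "words": build_word_frequency_list(mailbox["emails"]),
        "timestamp": now,
    }


mailbox = {"emails": ["Hello world hello", "world of mail"]}
print(handle_request(mailbox, now=0))  # → (None, True)
store_word_frequency_list(mailbox, now=0)
print(handle_request(mailbox, now=10))
```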
  • Patent number: 8024344
    Abstract: Presented are systems and methods for securely sharing confidential information. In such a method, term vectors corresponding to ones of a plurality of confidential terms included in a plurality of confidential documents are received. Each of the received term vectors is mapped into a vector space. Non-confidential documents are mapped into the vector space to generate a document vector corresponding to each non-confidential document, wherein the generation of each document vector is based on a subset of the received term vectors. At least one of the non-confidential documents is identified in response to a query mapped into the vector space.
    Type: Grant
    Filed: June 5, 2008
    Date of Patent: September 20, 2011
    Assignee: Content Analyst Company, LLC
    Inventor: Roger Bradford
  • Patent number: 8024342
    Abstract: The present invention is directed towards systems and methods for determining a tag match ratio. The method according to one embodiment of the present invention comprises selecting a content item, identifying one or more tags that are associated with the content item and determining a weight for each of the one or more tags associated with the content item. The method further comprises extracting one or more keywords from the content item. A tag match ratio for the one or more tags associated with the content item is then calculated and stored.
    Type: Grant
    Filed: July 31, 2008
    Date of Patent: September 20, 2011
    Assignee: Yahoo! Inc.
    Inventors: Xin Li, Lei Guo, Eric Zhao
  • Publication number: 20110225155
    Abstract: A system and method are provided for refining a user's query. An entity index, generated from a corpus of text documents, is provided. The entity index includes a set of entity structures, each including a plurality of terms. Each of the terms of an entity structure is a feature of the same entity. Entity structures can be retrieved from the entity index which match at least a portion of the user's query. Clusters of the retrieved entity structures are identified which have at least one of their terms in common. A cluster hierarchy is generated from the identified clusters in which nodes of the hierarchy are defined by one or more of the terms of the retrieved entity structures. At least a portion of the cluster hierarchy is presented to the user for facilitating refinement of the user's query through user selection of a node which, when formulated as a search, retrieves one or more responsive documents from the corpus of documents.
    Type: Application
    Filed: March 10, 2010
    Publication date: September 15, 2011
    Applicant: Xerox Corporation
    Inventors: Frederic Roulland, Stefania Castellani, Antonietta Grasso, Caroline Brun
  • Publication number: 20110225174
    Abstract: Exemplary embodiments are directed to determining a media value associated with mentions of an entity in one or more documents based on a sentiment attributed to the mentions of the entity and/or a frequency with which the entity is mentioned. Exemplary embodiments can include a media value engine that can identify mentions of an entity in documents, attribute sentiment to the mentions of the entity, determine a polarity of the sentiment, and calculate a media value attributed to the entity based on the sentiment.
    Type: Application
    Filed: March 14, 2011
    Publication date: September 15, 2011
    Inventors: Greg Artzt, Mark Fasciano, Steve Skiena, Levon Lloyd
  • Publication number: 20110219013
    Abstract: Methods and systems supporting curation of items in a searchable knowledge base are provided. The methods and systems include mining one or more search queries of the searchable knowledge base, where each of the search queries includes a plurality of the items. The method further includes determining one or more pairs of items using a processor, where each of the pairs of items includes a correlation value exceeding a threshold. The correlation values for the pairs of items are based upon the frequency the items of the pairs of items co-occur within the search queries. The method further includes providing the pairs of items to a curator, where the curator reviews the pairs of items.
    Type: Application
    Filed: March 5, 2010
    Publication date: September 8, 2011
    Applicant: PALO ALTO RESEARCH CENTER INCORPORATED
    Inventor: John T. Maxwell, III
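    A minimal sketch of mining co-occurring item pairs from search queries follows. The abstract does not define the correlation value; the Jaccard ratio used here (queries containing both items divided by queries containing either) is an assumed stand-in, and the function name and sample queries are hypothetical.

```python
from collections import Counter
from itertools import combinations


def correlated_pairs(queries, threshold):
    """Return {(item_a, item_b): score} for pairs scoring above threshold."""
    item_counts = Counter()
    pair_counts = Counter()
    for query in queries:
        items = sorted(set(query))  # ignore duplicates within one query
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    result = {}
    for (a, b), both in pair_counts.items():
        # Assumed correlation measure: Jaccard ratio of query co-occurrence.
        score = both / (item_counts[a] + item_counts[b] - both)
        if score > threshold:
            result[(a, b)] = score
    return result


queries = [
    ["aspirin", "headache"],
    ["aspirin", "headache", "fever"],
    ["fever", "flu"],
    ["aspirin", "headache"],
]
# Pairs above the threshold would be handed to a curator for review.
print(correlated_pairs(queries, threshold=0.4))
# → {('aspirin', 'headache'): 1.0, ('fever', 'flu'): 0.5}
```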
  • Publication number: 20110219000
    Abstract: Provided is a search apparatus, a search method, and a program that can improve search speed for a document set even when an object to be searched is a large-scale document set.
    Type: Application
    Filed: November 6, 2009
    Publication date: September 8, 2011
    Inventor: Yukitaka Kusumura
  • Publication number: 20110218994
    Abstract: A system and associated method for automatically processing keywords for video content. The video content contains image frames and an audio stream. An image pattern table for image patterns from the image frames and a word pattern table for word patterns from the audio stream are generated by use of respective pattern names provided by pattern recognition tools. Each pattern is associated with a respective count indicating a number of appearances of each pattern. A respective weight of each pattern is calculated as a relative frequency of each pattern. The image pattern table and the word pattern table are merged to generate a keyword list. A predefined number of the most frequently appearing patterns are selected by examining the respective weight of each pattern, and metadata associated with the video content are updated to utilize pattern names of the selected patterns as keywords for web searches.
    Type: Application
    Filed: March 5, 2010
    Publication date: September 8, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Christopher E. Holladay, William P. Shaouy
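    The weighting and merging steps in this abstract can be sketched as below, assuming the pattern recognition tools have already produced name-to-count tables. How a pattern name that appears in both tables is merged is not stated in the abstract; summing the two weights is an assumption, as are the function names and sample counts.

```python
def relative_weights(counts):
    """Weight each pattern by its relative frequency within its own table."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


def select_keywords(image_counts, word_counts, limit):
    """Merge both pattern tables and return the `limit` heaviest pattern names."""
    merged = {}
    for table in (relative_weights(image_counts), relative_weights(word_counts)):
        for name, weight in table.items():
            # Assumption: a name found in both tables gets the sum of its weights.
            merged[name] = merged.get(name, 0.0) + weight
    ranked = sorted(merged.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]


image_patterns = {"car": 6, "tree": 2, "dog": 2}      # from image frames
word_patterns = {"car": 3, "engine": 5, "road": 2}    # from the audio stream
print(select_keywords(image_patterns, word_patterns, limit=2))  # → ['car', 'engine']
```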
  • Patent number: 8015193
    Abstract: A method for accessing a file system including computing a first numerical similarity score for a first stored document and a second numerical similarity score for a second stored document by comparing a plurality of weighted active terms with a plurality of weighted indexed terms, determining a document order of the first stored document followed by the second stored document based on the first numerical similarity score exceeding the second numerical similarity score, generating a list of similar documents including the first stored document followed by the second stored document based on the document order, and displaying, in a file system interface and on the computer display, the list of similar documents while an active document is open in an active document interface.
    Type: Grant
    Filed: December 14, 2010
    Date of Patent: September 6, 2011
    Assignee: Oracle America, Inc.
    Inventors: Stephen J. Green, Jeffrey L. Alexander, Paul B. Lamere
  • Patent number: 8015188
    Abstract: A system and method for thematically grouping documents into clusters is provided. Concepts are extracted from a plurality of documents. The concepts include nouns or noun phrases. A number of occurrences for each concept are determined within each document. A bounded range is applied to the concepts and a subset of the concepts is selected by removing the concepts that fall outside the bounded range. The bounded range includes upper edge conditions and lower edge conditions. Themes are generated from the subset of concepts by identifying two or more concepts with common semantic meaning. Clusters of the documents are generated based on the themes.
    Type: Grant
    Filed: October 4, 2010
    Date of Patent: September 6, 2011
    Assignee: FTI Technology LLC
    Inventors: Dan Gallivan, Kenji Kawai
  • Patent number: 8010529
    Abstract: A system and method for comparing search queries provided by a user with content items available in an index. Search queries are received and stored in a database query log. Content items are located on a network and stored in an index. A value is generated for concepts and categories in the query log and the index. The value for different categories and concepts in the query log is compared with the value for different categories and concepts in the index. A need for content is determined for a given concept and category, which may be communicated to content providers, e.g., web site operators.
    Type: Grant
    Filed: October 23, 2006
    Date of Patent: August 30, 2011
    Assignee: Yahoo! Inc.
    Inventor: Shyam Kapur
  • Patent number: 8010539
    Abstract: Disclosed herein is a method, a system and a computer product for generating a snippet for an entity, wherein each snippet comprises a plurality of sentiments about the entity. One or more textual reviews associated with the entity is selected. A plurality of sentiment phrases are identified based on the one or more textual reviews, wherein each sentiment phrase comprises a sentiment about the entity. One or more sentiment phrases from the plurality of sentiment phrases are selected to generate a snippet.
    Type: Grant
    Filed: January 25, 2008
    Date of Patent: August 30, 2011
    Assignee: Google Inc.
    Inventors: Sasha Blair-Goldensohn, Kerry Hannan, Ryan McDonald, Tyler Neylon, Jeffrey C. Reynar
  • Publication number: 20110208754
    Abstract: A computer implemented method is provided for processing data representing a data entity having sub entities. The method includes analyzing queries to the data entity for deriving information about sets of the sub entities frequently queried together, and grouping the sub entities to a number of banks, each bank having a maximum width, based on the information about sets of sub entities frequently queried together, in order to reduce an average number of banks to be accessed for data retrieval.
    Type: Application
    Filed: November 15, 2010
    Publication date: August 25, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Tianchao Li, Peter Bendel, Oliver Draese, Namik Hrle
  • Patent number: 8001130
    Abstract: A method and system is provided for determining relevance of an object to a term based on a language model. The relevance system provides records extracted from web pages that relate to the object. To determine the relevance of the object to a term, the relevance system first determines, for each record of the object, a probability of generating that term using a language model of the record of that object. The relevance system then calculates the relevance of the object to the term by combining the probabilities. The relevance system may also weight the probabilities based on the accuracy or reliability of the extracted information for each data source.
    Type: Grant
    Filed: July 25, 2006
    Date of Patent: August 16, 2011
    Assignee: Microsoft Corporation
    Inventors: Ji-Rong Wen, Shuming Shi, Wei-Ying Ma, Yunxiao Ma, Zaiqing Nie
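    One way to picture the combination step in this abstract: score each extracted record with a simple language model, then combine the per-record probabilities using source reliability weights. The unsmoothed unigram model and the weighted average below are assumptions for illustration only; the abstract names neither the model nor the combining function.

```python
def term_probability(record_text, term):
    """Maximum-likelihood unigram probability of term in one record (assumed model)."""
    words = record_text.lower().split()
    if not words:
        return 0.0
    return words.count(term.lower()) / len(words)


def object_relevance(records, term):
    """Combine per-record probabilities, weighted by source reliability.

    records is a list of (record_text, source_weight) pairs; the weighted
    average is an assumed combining rule.
    """
    total_weight = sum(weight for _, weight in records)
    if total_weight == 0:
        return 0.0
    weighted = sum(weight * term_probability(text, term) for text, weight in records)
    return weighted / total_weight


# Two records about the same object, extracted from sources of differing reliability.
records = [("canon digital camera", 0.9), ("camera camera lens bag", 0.1)]
print(object_relevance(records, "camera"))
```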
  • Patent number: 8001131
    Abstract: A physical computing device receives information regarding a total number of people who are searching on the search term. Information is received regarding an amount advertisers pay for the search term. Information is received regarding a click through rate of the search term. A traffic estimate of the search term is determined. Longevity of the search term is determined.
    Type: Grant
    Filed: December 17, 2008
    Date of Patent: August 16, 2011
    Assignee: Demand Media Inc.
    Inventor: Byron William Reese
  • Patent number: 7996410
    Abstract: Techniques for determining when and how to transform words in a query to their plural or non-plural form in order to provide the most relevant search results while minimizing computational overhead are provided. A dictionary is generated based upon the words used in a specified number of previous most frequent search queries and comprises lists of transformations from plural to singular and singular to plural. Unnecessary transformations are removed from the dictionary based upon language modeling. The word to transform is determined by finding the last non-stop re-writable word of the query. The context of the transformed word is confirmed in the search documents and a version of the query is executed using both the original form of the word and the transformation of the word.
    Type: Grant
    Filed: February 1, 2007
    Date of Patent: August 9, 2011
    Assignee: Yahoo! Inc.
    Inventors: Fuchun Peng, Nawaaz Ahmed, Xin Li, Yumao Lu
  • Publication number: 20110191355
    Abstract: A method for monitoring an abnormal state of Internet information by monitoring the change of hot words frequency in the Internet information. The method includes the following steps: 1) obtaining the current date word frequency data of common words appearing in the current date web pages; 2) combining with the hot words dictionary that the user focuses on to determine the current date keywords set of the Internet information; 3) determining the weight of each current date keyword; 4) determining the abnormal threshold of the current date keywords; 5) detecting the abnormal level of the current date keywords to determine the current date hot Internet information. The present invention calculates the abnormal level of keywords by monitoring the change of hot words frequency in the Internet information, and predicts and raises an alarm for abnormal levels of hot words frequency change, allowing the Internet information user to react promptly.
    Type: Application
    Filed: April 24, 2008
    Publication date: August 4, 2011
    Applicant: PEKING UNIVERSITY
    Inventors: Xun Liang, Hua Chen, Jian Yang
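Steps 4 and 5 of this abstract (threshold, then abnormal level) can be sketched as follows. The abstract does not say how the threshold is computed; this example assumes it is the historical mean plus `k` standard deviations, and reports the abnormal level as the number of standard deviations above the mean. The function name and parameters are hypothetical.

```python
from statistics import mean, pstdev


def abnormal_keywords(history, today, k=2.0):
    """Flag keywords whose count today exceeds an assumed threshold.

    history: dict mapping keyword -> list of past daily counts
    today:   dict mapping keyword -> count on the current date
    Returns a dict mapping each flagged keyword to its abnormal level.
    """
    flagged = {}
    for word, past in history.items():
        mu = mean(past)
        sigma = pstdev(past)
        threshold = mu + k * sigma  # assumed rule, not taken from the patent
        count = today.get(word, 0)
        if count > threshold:
            # Guard against a flat history, where sigma is zero.
            flagged[word] = (count - mu) / sigma if sigma > 0 else float("inf")
    return flagged
```

With a history of `[2, 4, 2, 4]` for a keyword, a count of 9 on the current date lies six standard deviations above the mean and would be flagged at the default `k`.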
  • Patent number: 7991775
    Abstract: Described herein are techniques for generating a global checkpoint system change number and computing a snapshot query using the global checkpoint system change number without a need to acquire global locks. In many cases, the need to acquire global locks is eliminated, thereby saving the overhead attendant to processing global locks.
    Type: Grant
    Filed: October 2, 2008
    Date of Patent: August 2, 2011
    Assignee: Oracle International Corporation
    Inventors: Neil MacNaughton, Tirthankar Lahiri, Varun Malhotra
  • Patent number: 7987169
    Abstract: Embodiments of methods and apparatuses for searching contents, including structured search are described herein. Embodiments of the present invention use tree structures (or more generally, graph structures), layout structures, and/or content category information to capture within search results relevant content that would otherwise be missed, to reduce the incidence of false positives within search results, and to improve the accuracy of rankings within search results. Embodiments of the present invention further use tree structures (or more generally, graph structures), layout structures, and/or content category information to extend search results to include sub-document constituents. Embodiments of the present invention also support the use of distribution properties as criteria for ranking search results.
    Type: Grant
    Filed: June 12, 2007
    Date of Patent: July 26, 2011
    Assignee: Zalag Corporation
    Inventor: Samuel S. Epstein
  • Patent number: 7987191
    Abstract: A computer-implemented system and process for generating a relationship network is disclosed. The system provides a set of data items to be related and generates variable length data vectors to represent the relationships between the terms within each data item. The system can be used to generate a relationship network for documents, images, or any other type of file. This relationship network can then be queried to discover the relationships between terms within the set of data items.
    Type: Grant
    Filed: November 27, 2007
    Date of Patent: July 26, 2011
    Assignee: The Regents of The University of California
    Inventors: Kasian Franks, Cornelia A. Myers, Raf M. Podowski
  • Patent number: 7979459
    Abstract: Aspects of the subject matter described herein relate to matching product information to products. In aspects, a product matching component receives product information. The product matching component normalizes the product information and obtains keywords from the product information. By querying a database of recognized products, the keywords are used to obtain a list of products that potentially match the product information. A confidence level is assigned to each of the potential matches in the list. A match may be returned for the highest matched product or for a selectable number of products whose confidence level(s) exceed a selectable threshold.
    Type: Grant
    Filed: June 15, 2007
    Date of Patent: July 12, 2011
    Assignee: Microsoft Corporation
    Inventors: Kai Wu, Daniel Takacs, Tong Yao, Jiyu Zhang, Hua Yang, Ji-Rong Wen, Jonathan R M Hart, Eric Anthony Reel
  • Patent number: 7979413
    Abstract: In accordance with an aspect of the invention, a method and system are disclosed for constructing an embedded signature in order to facilitate post-facto detection of leakage of sensitive data. The leakage detection mechanism involves: 1) identifying at least one set of words in an electronic document containing sensitive data, the set of words having a low frequency of occurrence in a first collection of electronic documents; and, 2) transmitting a query to search a second collection of electronic documents for any electronic document that contains the set of words having a low frequency of occurrence.
    Type: Grant
    Filed: May 30, 2008
    Date of Patent: July 12, 2011
    Assignees: AT&T Intellectual Property I, L.P., New York University
    Inventors: Balachander Krishnamurthy, Saurabh Kumar, Lakshminarayanan Subramanian
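The two-step leakage detection mechanism in this abstract can be illustrated with a small sketch. It assumes whitespace tokenization, document frequency as the measure of rarity, and a fixed signature size; none of these details come from the abstract, and the function names are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (a simplifying assumption)."""
    return text.lower().split()


def rare_word_signature(document, reference_docs, size=3):
    """Pick the `size` words of `document` that are rarest in the reference set.

    Rarity is measured by document frequency in the first collection;
    ties are broken alphabetically so the result is deterministic.
    """
    doc_freq = Counter()
    for ref in reference_docs:
        doc_freq.update(set(tokenize(ref)))
    words = sorted(set(tokenize(document)), key=lambda w: (doc_freq[w], w))
    return set(words[:size])


def find_leaks(signature, candidate_docs):
    """Return indices of documents in the second collection that contain
    every word of the signature."""
    return [
        i for i, doc in enumerate(candidate_docs)
        if signature <= set(tokenize(doc))
    ]
```

In practice the second step would be a query against a search index rather than a linear scan; the scan is used here only to keep the example self-contained.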
  • Patent number: 7970774
    Abstract: An exemplary embodiment of the invention relates to a method, system, and storage medium for providing web-based electronic research and presentation functions via a document creation application. The method includes scanning an active document on a computer to identify relevant keywords, searching a database for reference materials relating to the relevant keywords, and displaying relevant reference materials on the computer. The method further includes deploying process software for providing the web-based electronic research and presentation functions via a document creation application. The deployment includes installing the process software on a server, identifying server addresses for users accessing the process software on the server, sending the process software to the server and copying the process software to a file system of the server. The deployment also includes sending the process software to a client computer and executing the process software on the client computer.
    Type: Grant
    Filed: April 18, 2008
    Date of Patent: June 28, 2011
    Assignee: International Business Machines Corporation
    Inventors: Edward E. Kelley, Tijs Y. Wilbrink, Ellis Zijlstra
  • Patent number: 7970775
    Abstract: An exemplary embodiment of the invention relates to a method and storage medium for providing web-based electronic research and presentation functions via a document creation application. The method includes scanning an active document on a computer to identify relevant keywords, searching a database for reference materials relating to the relevant keywords, and displaying relevant reference materials on the computer. The method further includes on-demand sharing of process software for providing the electronic research and presentation functions via the document creation application. The on-demand sharing includes creating a transaction containing unique customer identification, requested service type, and service parameters; sending the transaction to a server; querying the server about processing capacity associated with the server to help ensure availability of adequate resources for processing the transaction; and allocating additional processing capacity when needed to process the transaction.
    Type: Grant
    Filed: April 18, 2008
    Date of Patent: June 28, 2011
    Assignee: International Business Machines Corporation
    Inventors: Edward E. Kelley, Tijs Y. Wilbrink, Ellis Zijlstra
  • Publication number: 20110153585
    Abstract: The method according to one embodiment of the present invention comprises retrieving one or more terms or phrases comprising an instant messaging conversation in which one or more users are participating. One or more term vectors comprising one or more vector terms associated with the one or more retrieved terms or phrases comprising the instant messaging conversation are generated and one or more vector terms are selected from said term vectors. The one or more selected vector terms are displayed to the one or more users participating in the instant messaging conversation. An indication of a user selection of a given displayed vector term is received and one or more content items responsive to the selected vector term are identified.
    Type: Application
    Filed: February 24, 2011
    Publication date: June 23, 2011
    Applicant: YAHOO! INC.
    Inventor: Shiv Ramamurthi
  • Patent number: 7962480
    Abstract: The relevance of documents is automatically determined based upon a weighted tree. Terms considered to be relevant are assigned to the leaf nodes of a tree data structure. A location can also be specified in a leaf node, indicating where in a document the term must appear to be considered relevant. Internal nodes of the tree are assigned operators (e.g., add, maximum or minimum). The connections between nodes are assigned weights. A relevance value for a given document is calculated as a function of occurrence in the document of terms assigned to leaves, operators assigned to internal nodes, and weights assigned to the associated node connections. Weighted trees can be used to process search queries. Documents with high relevance scores calculated against the tree can be returned to a user as the results to a query.
    Type: Grant
    Filed: July 31, 2007
    Date of Patent: June 14, 2011
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Li Zhang, Yuhong Xiong, Shicong Feng, Yong Zhao
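The weighted-tree scoring in this abstract can be sketched as a recursive evaluation. The example assumes a leaf contributes the raw occurrence count of its term and that each edge weight multiplies the child's value before the parent's operator is applied; the optional location constraint on leaves is omitted. The class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Leaf:
    term: str  # term whose occurrences in the document are counted


@dataclass
class Node:
    op: str  # one of "add", "max", "min"
    children: list = field(default_factory=list)  # list of (weight, child)


OPS = {"add": sum, "max": max, "min": min}


def score(node, tokens):
    """Relevance of a tokenized document against the weighted tree."""
    if isinstance(node, Leaf):
        return float(tokens.count(node.term))
    # Weight each child's value by its connection, then apply the operator.
    values = [weight * score(child, tokens) for weight, child in node.children]
    return OPS[node.op](values)
```

Documents can then be ranked for a query by building one tree for the query and sorting documents by their `score` in descending order.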
  • Publication number: 20110137906
    Abstract: A method for analyzing sentiment comprising: collecting an object from an external content repository, the collected objects forming a content database; extracting a snippet related to the subject from the content database; calculating a sentiment score for the snippet; classifying the snippet into a sentiment category; creating a sentiment taxonomy using the sentiment categories, the sentiment taxonomy classifying the snippets as positive, negative or neutral; identifying topic words within the sentiment taxonomy; classifying the topic words as sentiment topic word candidates or non-sentiment topic word candidates; filtering the non-sentiment topic word candidates; identifying the frequency of the non-sentiment topic words in each of the sentiment categories; identifying the importance of the non-sentiment topic word for each of the sentiment categories; and ranking the topic word, wherein the rank is calculated by combining the frequency of the topic words in each of the categories with its importance.
    Type: Application
    Filed: December 9, 2009
    Publication date: June 9, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES, INC.
    Inventors: Keke Cai, Ying Chen, W. Scott Spangler, Li Zhang
  • Patent number: 7958125
    Abstract: A method for merging really simple syndication (RSS) feeds. Stories containing one or more terms may be merged into one or more clusters based on one or more links between the stories. A cluster frequency with which the terms occur in each cluster may be determined. A diameter for each cluster may be determined. A cluster that is most similar to one of the clusters may be determined based on the cluster frequency. The most similar cluster with the one of the clusters may be determined based on each diameter, and each cluster frequency.
    Type: Grant
    Filed: June 26, 2008
    Date of Patent: June 7, 2011
    Assignee: Microsoft Corporation
    Inventors: Jun Yan, Ning Liu, Lei Ji, Zheng Chen, Jian Wang
  • Patent number: 7949663
    Abstract: A computer implemented system for project prediction is provided. The system includes a data manager to obtain historical project data. The system also includes an analyzer to analyze the historical project data and an analysis cycle time to generate models for a proposed project cycle time. Additionally, the system includes a user interface to select one model for the proposed project cycle time, wherein the selected model includes linear sub-models corresponding to a historical data range, and apply proposed project data and analysis cycle time to one linear sub-model corresponding to a proposed data range to predict the proposed project cycle time. Furthermore, the system captures proposed project data and obtains additional project data to update the selected model. The models provide for the accurate prediction of cycle times, or project costs, in an enterprise development environment.
    Type: Grant
    Filed: February 8, 2007
    Date of Patent: May 24, 2011
    Assignee: Sprint Communications Company L.P.
    Inventors: Deandra T. Cassone, Joseph E. Dudley, George R. Kather, Paul R. Sapenaro, Jason N. Ward
  • Patent number: 7949643
    Abstract: Generally, a method and apparatus provide for rating user generated content (UGC) with respect to search engine results. The method and apparatus include recognizing a UGC data field collected from a web document located at a web location. The method and apparatus calculate: a document goodness factor for the web document; an author rank for an author of the UGC data field; and a location rank for the web location. The method and apparatus thereby generate a rating factor for the UGC data field based on the document goodness factor, the author rank and the location rank. The method and apparatus also output a search result that includes the UGC data field positioned in the search results based on the rating factor.
    Type: Grant
    Filed: April 29, 2008
    Date of Patent: May 24, 2011
    Assignee: Yahoo! Inc.
    Inventors: Jaya Kawale, Aditya Pal