Based On Term Frequency Of Appearance Patents (Class 707/750)
-
Patent number: 8566323Abstract: Methods and apparatus teach a digital spectrum of a file. The digital spectrum is used to map a file's position. This position relative to another file's position reveals closest neighbors. When multiple such neighbors are arranged, first “patterns” of data are created that further define digital spectrums of new files. It is within this sorted new data that emergent relationships or second “patterns” are examined, according to the techniques for its underlying files, or “patterns of patterns.” Representatively, original files are stored on computing devices. If encoded, they have pluralities of symbols representing an underlying data stream of original bits of data. The original files are examined for relationships between each of the files. The original relationships are converted to new files. The new files are representatively encoded and examined for other relationships.Type: GrantFiled: December 29, 2009Date of Patent: October 22, 2013Assignee: Novell, Inc.Inventors: Scott A. Isaacson, Craig N. Teerlink, Nadeem A. Nazeer
-
Publication number: 20130275436Abstract: Various embodiments promote the discoverability of data that can be contained within a database. In one or more embodiments, data within a database is organized in a structure having a schema. The structure and data can be processed in a manner that renders one or more pseudo-documents each of which constitutes a sub-structure that can be indexed. Once produced and indexed, the pseudo-documents constitute a set of searchable objects each of which relationally points back to its associated structure within the database. Searches can then be performed against the pseudo-documents, which in turn return a set of search results. The set of search results can include multiple sub-sets of pseudo-documents, each sub-set of which is associated with a different structure.Type: ApplicationFiled: April 11, 2012Publication date: October 17, 2013Applicant: Microsoft CorporationInventors: Surajit Chaudhuri, Lev Novik, John C. Platt
-
Patent number: 8559724Abstract: An apparatus and method for generating additional information about moving picture content, including: comparing image feature information about each image frame in moving picture content with image feature information about each image frame in web information, searching for an image frame in the moving picture content, the image frame matching the image frame in the web information, determining location information about the found image frame in the moving picture content, and generating additional information by use of the determined location information and the web information.Type: GrantFiled: February 24, 2010Date of Patent: October 15, 2013Assignee: Samsung Electronics Co., Ltd.Inventors: Yoon-hee Choi, Il-hwan Choi, Hee-seon Park
-
Publication number: 20130268482Abstract: Systems, methods, and computer-readable media for determining the Internet search popularity of an entity are provided. Embodiments of the present invention include receiving a group of Internet search records and assigning a popularity ranking based on the number of times an entity descriptor associated with an entity occurs within the group of Internet search records created over a designated time period. An entity descriptor is one or more terms commonly used to identify an entity. The trend in an entity's popularity rank may also be calculated. An entity's popularity rank and trend in popularity rank may be presented in a graph or in a list.Type: ApplicationFiled: March 14, 2013Publication date: October 10, 2013Inventors: Tabreez Govani, Hugh Williams, Jamie Buckley, Nitin Agrawal, Andy Lam, Kenneth A. Moss
-
Publication number: 20130262481Abstract: A system and a method are disclosed for identifying video files on a webpage and streaming video files to a client device. A server receives browsing data including uniform resource locator for a webpage and identifies missing videos on the webpage. The server identifies a source file for the missing videos including identifying a location for each missing video. The server retrieves a thumbnail for each missing video and provides it to a client device. Additionally, the server transcodes the video file responsive to a user input provided by a user. The transcoded video is streamed to the client device.Type: ApplicationFiled: May 10, 2013Publication date: October 3, 2013Applicant: Skyfire Labs, Inc.Inventors: Nitin Bhandari, Erik R. Swenson, Geoffrey Dale Benson, Ishika Paul, James Marzano, Jaime Heilpern, Robert Oberhofer, Michael Guzewicz, Vijay Kumar
-
Publication number: 20130246386Abstract: Systems are used for identifying key phrases within documents. These systems utilize tags and a tag index to determine what a document primarily relates to. For example, an integrated data flow and extract-transform-load pipeline crawls, parses, and word-breaks large corpuses of documents in database tables. Documents can be broken into tuples. The tuples can be sent to a heuristically based algorithm that uses statistical language models and weight plus cross-entropy threshold functions to summarize the document into its “top N” most statistically significant phrases. These systems can scale efficiently (e.g., linearly) and (potentially large numbers of) documents can be characterized by salient and relevant key phrases (tags).Type: ApplicationFiled: March 11, 2013Publication date: September 19, 2013Applicant: MICROSOFT CORPORATIONInventors: Sorin Gherman, Kunal Mukerjee
-
Patent number: 8533195Abstract: Electronic documents are retrieved from a database and/or from a network of servers. The documents are topic modeled in accordance with a Regularized Latent Semantic Indexing approach. The Regularized Latent Semantic Indexing approach may allow an equation involving an approximation of a term-document matrix to be solved in parallel by multiple calculating units. The equation may include terms that are regularized via the l1 norm and/or the l2 norm. The Regularized Latent Semantic Indexing approach may be applied to a set, or a fixed number, of documents such that the set of documents is topic modeled. Alternatively, the Regularized Latent Semantic Indexing approach may be applied to a variable number of documents such that, over time, the variable number of documents is topic modeled.Type: GrantFiled: June 27, 2011Date of Patent: September 10, 2013Assignee: Microsoft CorporationInventors: Jun Xu, Hang Li, Nicholas Craswell
-
Publication number: 20130232154Abstract: Systems and methods of identifying and categorizing social network messages that are relevant to selected categories and text terms are provided. The frequency of text terms appearing in social network messages is calculated for multiple categories. Based on the calculated text term frequency, social network messages that match a provided set of text terms can be identified and/or categorized. The selection and/or association of text terms and categories is determined by repeatedly analyzing social network messages.Type: ApplicationFiled: April 11, 2013Publication date: September 5, 2013Applicant: CitizenNet Inc.Inventors: Michael Aaron Hall, Daniel Benyamin, Aaron Chu
-
Patent number: 8515971Abstract: The present invention relates to a method for assisting a user in making a decision to compare biometric data of an individual with data from a database relating to a large number of individuals, and biometric data is acquired for an individual concerned, that this data is encoded, that the data items are compared in pairs with corresponding data from the database, that, for each comparison score, the duplicate occurrence frequency/non-duplicate occurrence frequency ratio is established, that the product of all the available ratios is calculated, that this product is standardized, that the standardized ratio is compared to a pre-set threshold, that the values greater than the pre-set threshold are kept and that this result is submitted to the user for him to validate it as appropriate.Type: GrantFiled: November 2, 2006Date of Patent: August 20, 2013Assignee: ThalesInventor: Jean Beaudet
-
Patent number: 8515975Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using search entity transition probabilities. In some implementations, data identifying entities and transition probabilities between entities is stored in a computer readable medium. Each transition probability represents a strength of a relationship between a pair of entities as they are related in search history data. In some implementations, an increase in popularity for a query is identified and a different query is identified as temporally related to the query. Scoring data for documents responsive to the different query is modified to favor newer documents. In other implementations, data identifying a first session as spam is received, and a spam score is calculated for either a second session of queries or a single query using transition probabilities. The second session (or single query) is identified as spam from the spam score.Type: GrantFiled: December 7, 2009Date of Patent: August 20, 2013Assignee: Google Inc.Inventor: Diego Federici
-
Patent number: 8515974Abstract: A method is presented for generating a list of frequently used words for an email application on a server computer. When a request is received for a word frequency list for emails stored in a user's mailbox, a word frequency list is returned if one exists. If the word frequency list does not exist, an asynchronous process is started on the server computer to generate a word frequency list. If the word frequency list exists but it is older than an aging limit, an asynchronous process is started on the server computer to regenerate the word frequency list. The word frequency list is stored in the user's mailbox along with a timestamp indicating the date and time that the list was created or updated.Type: GrantFiled: September 2, 2011Date of Patent: August 20, 2013Assignee: Microsoft CorporationInventors: Ashish Consul, Suryanarayana M. Gorti, Michael Geoffrey Andrew Wilson, James C. Kleewein
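The request flow in the abstract above (return the list if it exists, regenerate it if it is missing or older than the aging limit) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the mailbox is a plain dictionary, `AGING_LIMIT_SECONDS`, `request_word_frequency`, and `regenerate` are hypothetical names, and the asynchronous server process is replaced by a flag that tells the caller a regeneration should be started.

```python
import re
from collections import Counter

AGING_LIMIT_SECONDS = 24 * 3600  # hypothetical aging limit (one day)


def build_word_frequency(emails):
    """Count lowercase words across all email bodies."""
    words = []
    for body in emails:
        words.extend(re.findall(r"[a-z']+", body.lower()))
    return Counter(words)


def request_word_frequency(mailbox, now, aging_limit=AGING_LIMIT_SECONDS):
    """Return (cached counts or None, whether regeneration should start)."""
    cached = mailbox.get("word_freq")
    if cached is None:
        return None, True  # nothing stored yet: regeneration needed
    stale = (now - cached["timestamp"]) > aging_limit
    return cached["counts"], stale


def regenerate(mailbox, now):
    """Stand-in for the asynchronous job: store counts with a timestamp."""
    mailbox["word_freq"] = {
        "counts": build_word_frequency(mailbox["emails"]),
        "timestamp": now,
    }
```

In this sketch a stale list is still returned to the caller while the flag signals that a refresh is due, which is one plausible reading of the abstract rather than a confirmed detail of the patented method.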
-
Patent number: 8515973Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for identifying geographic features. In one aspect, a method includes receiving a query. Geographic features are identified, each geographic feature being associated with one or more names, each geographic feature being associated with at least one name that includes the query. A feature-query score is computed for each geographic feature, including: for each name of the geographic feature that includes the query, identifying a computed feature-name score, wherein the feature-name score is computed based on a count of a number of occurrences of the name in a query log, wherein each occurrence is attributed to the feature; and computing the feature-query score based on the identified feature-name scores. The geographic features are ranked according to the feature-query scores.Type: GrantFiled: March 25, 2011Date of Patent: August 20, 2013Assignee: Google Inc.Inventors: Radu Jurca, Anja Hauth, Ivan Zauharodneu, Matsvei Zhdanovich, Luuk Van Dijk, Steffen Meschkat, David E. Lecomte
-
Patent number: 8515972Abstract: A programmed computer receives one or more documents that contain text that is relevant to a user (“interest documents”). The programmed computer automatically identifies groups of words that match the interest documents. The matching word groups are ranked by a weight that is assigned based on how infrequently a word group matches a reference corpus and how frequently the word group matches one or more interest document(s), in comparison to other word groups. A set of word groups are automatically identified based on ranking, and displayed to a user to select documents from a corpus. Selected documents are displayed to the user, e.g. with one or more group of words used in selecting the documents.Type: GrantFiled: February 10, 2010Date of Patent: August 20, 2013Assignee: Python 4 Fun, Inc.Inventors: Devabhaktuni Srikrishna, Marc Coram
-
Patent number: 8510314Abstract: Methods, systems, and apparatus, including computer program products are provided for ranking distinct book content items based on implicit links to other distinct book content items. The implicit links are defined based on the identification of matching features in the distinct book content items. In some implementations, the matching features are uncommon phrases in textual content of the distinct book content items. Edges representing implicit links are generated between distinct nodes representing distinct book content items in a weighted graph. Search results for distinct book content items can be ordered based on the edges connected to the distinct nodes in the weighted graph that represent the distinct book content items.Type: GrantFiled: October 6, 2011Date of Patent: August 13, 2013Assignee: Google Inc.Inventors: Shumeet Baluja, Yushi Jing
-
Patent number: 8504563Abstract: Sorting inquiry results includes, based on extracted inquiry results matching search conditions of a user, collecting features of the inquiry results. The collected features may be used as features of a respective inquiry result and feature fitting may be conducted based on a support vector machine (SVM) regression model to obtain a feature fitting value of the respective inquiry result. The inquiry results may be sorted based on relevancy values of the inquiry results, and, for inquiry results having a same relevancy level, the inquiry results may be sorted in a top-down manner based on feature fitting values of the inquiry results.Type: GrantFiled: July 22, 2011Date of Patent: August 6, 2013Assignee: Alibaba Group Holding LimitedInventors: Chao Chen, Xiaomei Han
-
Patent number: 8504357Abstract: A related word presentation device includes a program information storage unit that stores program information of each program; and an information dividing unit that generates, for each of the attributes of the words included in the program information, at least one group which includes a reference word belonging to the attribute and a set of words which co-occur with the reference word in a program. A degree-of-relevance calculating unit stores attribute-based association dictionaries each of which indicates, for the corresponding attribute of words, (i) the words and (ii) the degrees of relevance between the words calculated based on the frequency of co-occurrence in each of groups. A search condition obtaining unit obtains the search word and the attribute; a substitute word obtaining unit selects substitute words from the attribute-based association dictionary for the obtained attribute; and an output unit presents the selected substitute word.Type: GrantFiled: July 30, 2008Date of Patent: August 6, 2013Assignee: Panasonic CorporationInventors: Takashi Tsuzuki, Satoshi Matsuura, Kazutoyo Takata
-
Patent number: 8504564Abstract: A method, apparatus and computer program product provides for a semantic analyzer to produce and rank semantic terms to reflect their relationship to the theme and topics of a document. The text and the document can have no relationship to any pre-selected keywords before the semantic analyzer performs text extraction. The semantic analyzer extracts text from a document and performs semantic analysis on the extracted text. The semantic analyzer provides a plurality of ranked semantic terms as a result of the semantic analysis and associates semantic terms with the document as semantic keywords. The semantic terms define content to be presented with the document where the content is an advertisement, a link to a remote information resource or a second document.Type: GrantFiled: December 15, 2010Date of Patent: August 6, 2013Assignee: Adobe Systems IncorporatedInventors: Walter Chang, Nadia Ghamrawi
-
Patent number: 8504578Abstract: A system, method and computer program product for identifying near and exact-duplicate documents in a document collection, including for each document in the collection, reading textual content from the document; filtering the textual content based on user settings; determining N most frequent words from the filtered textual content of the document; performing a quorum search of the N most frequent words in the document with a threshold M; and sorting results from the quorum search based on relevancy. Based on the values of N and M near and exact-duplicate documents are identified in the document collection.Type: GrantFiled: August 16, 2012Date of Patent: August 6, 2013Assignee: MSC Intellectual Properties B.V.Inventors: Johannes C. Scholtes, Siebe Bloembergen
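As a rough illustration of the idea in the abstract above, the sketch below takes the N most frequent words of one document and treats any other document containing at least M of them as a candidate duplicate. It is a simplified stand-in, not the patented method: the function names are hypothetical, the user-configurable filtering step is omitted, and "relevancy" is approximated by the number of matched words.

```python
import re
from collections import Counter


def top_words(text, n):
    """Return the n most frequent words (ties broken alphabetically)."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]


def quorum_matches(docs, query_id, n, m):
    """Find documents containing at least m of the query's top-n words."""
    query_terms = top_words(docs[query_id], n)
    results = []
    for doc_id, text in docs.items():
        if doc_id == query_id:
            continue
        vocabulary = set(re.findall(r"[a-z]+", text.lower()))
        hits = sum(1 for term in query_terms if term in vocabulary)
        if hits >= m:
            results.append((doc_id, hits))
    # Most matched words first; document id breaks ties deterministically.
    results.sort(key=lambda r: (-r[1], r[0]))
    return results
```

Setting M equal to N demands that every frequent word be present (closer to exact duplicates), while a lower M tolerates more variation between near-duplicates.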
-
Patent number: 8504561Abstract: Techniques are described herein for using intent to access a domain (i.e., domain intent) to provide more search results that correspond to the domain. For example, a rule may specify a maximum number of search results that are allowed to be provided from a domain (or a host that corresponds to the domain) in response to a search query. Each search query may include any number of ngrams. An ngram is a subsequence of elements in a sequence (e.g., a search query). An intent to access a domain may be determined based on one or more of the ngrams in a search query. A number of search results that correspond to a domain may be increased to be greater than the maximum number based on one or more of the ngrams that are included in the search query being associated with the intent to access the domain.Type: GrantFiled: September 2, 2011Date of Patent: August 6, 2013Assignee: Microsoft CorporationInventors: Timothy C. Hoad, Deepak Vijaywargi, Yatharth Saraf
-
Patent number: 8489617Abstract: Disclosed are systems for, and methods of, automatically detecting and treating field values of a particular field as null field values in records of a database. The system and method provide automatic treatment of these field values as null field values by calculating a critical frequency for the field. Based on the critical frequency of the field, the system and method treats field values that occur more than the critical frequency of the field as null field values and treats field values that occur less than the critical frequency as non-null field values.Type: GrantFiled: June 5, 2012Date of Patent: July 16, 2013Assignee: LexisNexis Risk Solutions FL Inc.Inventor: David Alan Bayliss
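The thresholding step described above can be sketched as follows. The abstract does not say how the critical frequency is calculated, so this example simply assumes it is a fixed fraction of the record count; `critical_fraction` and `detect_null_values` are hypothetical names chosen for illustration.

```python
from collections import Counter


def detect_null_values(values, critical_fraction=0.2):
    """Return field values that occur more often than the critical frequency.

    Assumption: critical frequency = critical_fraction * number of records.
    The patent's actual calculation is not given in the abstract.
    """
    counts = Counter(values)
    critical_frequency = critical_fraction * len(values)
    return {value for value, count in counts.items() if count > critical_frequency}
```

A placeholder such as "N/A" that fills half of a surname column would be flagged under this assumption, while ordinary surnames that each appear once would not.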
-
Publication number: 20130173568Abstract: Methods and systems are provided that may be utilized to generate website link suggestions.Type: ApplicationFiled: December 28, 2011Publication date: July 4, 2013Applicant: YAHOO! INC.Inventors: Vanja Josifovski, Evgeniy Gabrilovich, Bo Pang, Fernando Diaz, Jangwon Seo
-
Patent number: 8473498Abstract: A method of text analytics includes filtering a plurality of unfiltered records having unstructured data into at least a first group and a second group. The first group and said second group each include at least two records and the first group is different than the second group. The method includes determining a first proportion of occurrence for a term by comparing a first number of records having at least one occurrence of the term in the first group to a first total number of records in the first group, determining a second proportion of occurrence for the term by comparing a second number of records having at least one occurrence of the term in said second group to a second total number of records in the second group, and comparing the first proportion of occurrence to the second proportion of occurrence to yield a resultant comparison occurrence.Type: GrantFiled: August 2, 2011Date of Patent: June 25, 2013Inventor: Tom H. C. Anderson
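The two proportions in the abstract above are straightforward to compute once the records have been split into groups. In the minimal sketch below, the function names are hypothetical, records are plain strings tokenized on whitespace, and the final comparison is assumed to be a simple difference, since the abstract does not specify how the two proportions are compared.

```python
def proportion_with_term(records, term):
    """Share of records that contain the term at least once."""
    term = term.lower()
    hits = sum(1 for record in records if term in record.lower().split())
    return hits / len(records)


def compare_term(group_a, group_b, term):
    """Return both proportions and their difference (an assumed comparison)."""
    p_first = proportion_with_term(group_a, term)
    p_second = proportion_with_term(group_b, term)
    return p_first, p_second, p_first - p_second
```

Each record counts once no matter how often the term appears in it, which matches the abstract's "at least one occurrence" wording.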
-
Publication number: 20130151538Abstract: An entity summarization system is described herein that mines the Internet and other data sources to provide answers to questions such as the relative sentiment of users towards various brands. The system uses a controlled vocabulary list describing a specific aspect of entities of interest. Given an entity name, the system scans the whole content corpus to collect statistics on the words that occur most frequently in the context of the entity name, taking into account proximity information, to produce a weighted list of vocabulary terms describing the entity. Two entities can be compared by normalizing and comparing their weighted term lists. In some embodiments, the system performs these procedures efficiently by leveraging an N-gram web model. Thus, the system provides an automated way to compare two entities to derive information about how users feel about the entities at any given time.Type: ApplicationFiled: December 12, 2011Publication date: June 13, 2013Applicant: MICROSOFT CORPORATIONInventors: Pavel Dmitriev, Wei Zhuang
-
Patent number: 8463827Abstract: Embodiments are directed towards identifying auto-folder tags for messages by using a combinational optimization approach of bi-clustering folder names and features of messages based on relationship strengths. The combinational optimization approach of bi-clustering, generally, groups a plurality of folder names and a plurality of features into one or more metafolders to optimize a cost. The cost is based on an aggregate of cut relationship strengths, where a cut results when a relationship folder name and feature are grouped in separate metafolders. Furthermore, the plurality of folder names and the plurality of features are obtained by monitoring actions of a plurality of users, where the folder names are user generated folder names and features are from a plurality of messages. The metafolders may be used to tag new user messages with an auto-folder tag.Type: GrantFiled: January 4, 2011Date of Patent: June 11, 2013Assignee: Yahoo! Inc.Inventors: Vishwanath Tumkur Ramarao, Andrei Broder, Idan Szpektor, Edo Liberty, Yehuda Koren, Mark E. Risher, Yoelle Maarek Smadja
-
Patent number: 8458198Abstract: A term analyzer receives an ordered collection of text-based terms. The term analyzer analyzes groupings of consecutive text-based terms in the ordered collection to identify occurrences of different combinations of text-based terms. In addition, the term analyzer maintains frequency information representing the occurrences of the different combinations of text-based terms in the collection. The frequency information can then be used to determine relatively significant keywords and/or keyword phrases in the document. In an example configuration, the term analyzer creates a tree in which a first term in a given grouping of the groupings is defined as a parent node in the tree and a second term in the given grouping is defined as a child node of the parent node in the tree. The method of the analyzer generalizes to create a tree of multi-word terms in which the terms can be efficiently ranked by occurrence.Type: GrantFiled: December 5, 2011Date of Patent: June 4, 2013Assignee: Adobe Systems IncorporatedInventors: Michael J. Welch, Walter W. Chang
-
Patent number: 8452774Abstract: A method and system for splitting a text document into individual sentences using sentence boundary detection, and establishing co-relationships between terms which are present in the same sentence. A document corpus, or collection of text records, is provided, containing text with terms to be extracted. The text records in the document corpus are divided into individual sentences, using a set of rules for sentence boundary detection. The individual sentences are then analyzed to extract and correlate terms, such as parts and symptoms, symptoms and actions, or parts and failure modes. The correlated terms are then validated based on frequency of occurrence, with term pairs being considered valid if their frequency of occurrence exceeds a minimum frequency threshold. The validated term correlations can be used for fault model development, document classification, and document clustering.Type: GrantFiled: March 10, 2011Date of Patent: May 28, 2013Assignee: GM Global Technology Operations LLCInventor: Dnyanesh Rajpathak
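The pipeline above (split records into sentences, pair terms that share a sentence, keep pairs whose frequency exceeds a threshold) can be sketched in a few lines. This is only an illustration under stated assumptions: the sentence splitter is a single naive rule rather than the patent's rule set, the part and symptom vocabularies are supplied by the caller, and `correlate_terms` and `min_frequency` are hypothetical names.

```python
import re
from collections import Counter
from itertools import product


def split_sentences(text):
    """Naive boundary rule: split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def correlate_terms(records, parts, symptoms, min_frequency=2):
    """Count part/symptom pairs sharing a sentence; keep pairs above the threshold."""
    pair_counts = Counter()
    for record in records:
        for sentence in split_sentences(record):
            words = set(re.findall(r"[a-z]+", sentence.lower()))
            found_parts = [p for p in parts if p in words]
            found_symptoms = [s for s in symptoms if s in words]
            pair_counts.update(product(found_parts, found_symptoms))
    # "Exceeds" is read strictly here: the count must be greater than the minimum.
    return {pair: n for pair, n in pair_counts.items() if n > min_frequency}
```

Restricting co-occurrence to a single sentence keeps a part mentioned in one sentence from being paired with a symptom mentioned in an unrelated one elsewhere in the same record.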
-
Publication number: 20130132407Abstract: Various embodiments of methods and apparatus for fitting a surface to a data set are disclosed. A frequency distribution of an input data set is determined. Determining the frequency distribution includes assigning each data point of the input data set to a category representing a value of a variable for the respective data point. Responsive to identifying one or more discontinuities of the frequency distribution, a continuous section of the frequency distribution is identified as a first data set. A first equation is fit to the first data set.Type: ApplicationFiled: February 25, 2011Publication date: May 23, 2013Inventors: Balaji Krishnmurthy, Anubha Rastogi
-
Publication number: 20130124541Abstract: A method and system for collaborating tags in a bookmarking system wherein the bookmarking system includes a plurality of tags applied to content items by a plurality of users, the method and system including, examining all the tags that are applied to all the content items, determining whether two tags have been assigned to the same content item, if two tags have been assigned to the same content item, computing the relative strength of each of the two tags with respect to each other.Type: ApplicationFiled: January 2, 2013Publication date: May 16, 2013Applicant: International Business Machines CorporationInventor: International Business Machines Corporation
-
Patent number: 8442988Abstract: A cell-specific dictionary is applied adaptively to adequate cells, where the cell-specific dictionary subsequently optimizes the handling of frequency-partitioned multi-dimensional data. This includes improved data partitioning with super cells or adjusting resulting cells by sub-dividing very large cells and merging multiple small cells, both of which avoid the highly skewed data distribution in cells and improve the query processing. In addition, more efficient encoding is taught within a cell in case the distinct values that actually appear in that cell are much smaller than the size of the column dictionary.Type: GrantFiled: November 4, 2010Date of Patent: May 14, 2013Assignee: International Business Machines CorporationInventors: Oliver Draese, Namik Hrle, Oliver Koeth, Tianchao Li, Vijayshankar Raman, Knut Stolze
-
Patent number: 8429176Abstract: The present invention is directed towards systems and methods for extending media annotations using collective knowledge. The method according to one embodiment of the present invention comprises receiving a plurality of content items and associated annotations. The method further normalizes the plurality of associated annotations and calculates pair frequencies for the plurality of associated annotations. The method then retrieves a plurality of alternative annotations and provides the plurality of alternative annotations.Type: GrantFiled: March 28, 2008Date of Patent: April 23, 2013Assignee: Yahoo! Inc.Inventors: Borkur Sigurbjornsson, Roelof van Zwol
-
Publication number: 20130091151Abstract: In accordance with disclosed embodiments, there are provided methods, systems, and apparatuses for performing time-partitioned collaborative filtering in an on-demand service environment including, for example, receiving as input, a plurality of access requests for data stored within the host organization and a corresponding plurality of actions for the data to which access is requested; accessing an input table having a time field, action field, item field, and agent field therein; recording time data and agent data for each of the received plurality of access requests and the corresponding plurality of actions; recording an item within the item field and an action within the action field for each of the received plurality of access requests and the corresponding plurality of actions based on the action performed on an item of the data to which access is requested; and analyzing the input table to generate one or more pairs of first actions and items to second actions and items and a time based score for eacType: ApplicationFiled: October 2, 2012Publication date: April 11, 2013Applicant: SALESFORCE.COM, INC.Inventor: Salesforce.com, Inc.
-
Publication number: 20130086086Abstract: A computer-readable recording medium stores a program causing a computer to execute an information generating process that includes tabulating an appearance frequency for each designated word in an object file group in which character strings are described; identifying for each designated word and based on the appearance frequency tabulated for the designated word, a rank in descending order up to a target appearance rate for the designated words; detecting in an object file selected from the object file group, specific designated words among the identified ranks; and generating for each of the detected specific designated words, index information that indicates the presence/absence of the specific designated word in each object file among the object file group.Type: ApplicationFiled: November 27, 2012Publication date: April 4, 2013Applicant: FUJITSU LIMITEDInventor: FUJITSU LIMITED
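One way to read the abstract above is: count each designated word across the file group, keep the highest-ranked words until their cumulative share of occurrences reaches a target rate, and record for each kept word which files contain it. The sketch below follows that reading. The interpretation of "target appearance rate" as a cumulative share, along with the names `build_index` and `target_rate`, is an assumption, and files are modeled as whitespace-tokenized strings.

```python
from collections import Counter


def build_index(files, designated_words, target_rate=0.8):
    """Map each selected word to a per-file presence/absence list."""
    tokens_per_file = [text.lower().split() for text in files]
    counts = Counter(
        word
        for tokens in tokens_per_file
        for word in tokens
        if word in designated_words
    )
    total = sum(counts.values())
    selected, cumulative = [], 0
    # Walk the words in descending frequency until the target share is covered.
    for word, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if total == 0 or cumulative / total >= target_rate:
            break
        selected.append(word)
        cumulative += count
    return {word: [word in tokens for tokens in tokens_per_file] for word in selected}
```

Under this reading, rare designated words fall outside the selected ranks and get no index entry, which keeps the index small at the cost of not covering every word.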
-
Publication number: 20130086085Abstract: A computer-readable recording medium has stored therein a program for causing a computer to execute an analysis support process that includes storing to a storage device, a name of a second process that is a process included among a plurality of processes called in response to execution of a program, the computer storing the name of the second process when a first process having a name that matches a keyword stored in a storage device is included among the processes.Type: ApplicationFiled: August 13, 2012Publication date: April 4, 2013Applicant: FUJITSU LIMITEDInventor: Shingo KATO
-
Patent number: 8407233Abstract: A method and system for calculating a relevance between words using a document set is provided. The method of calculating the relevance between words based on a document set, includes: obtaining statistical information about the words based on at least one of the words, documents, a word classification of the words, and a document classification of the documents, wherein the words and the documents are included in the document set; standardizing the statistical information; and calculating the relevance between the words based on the standardized statistical information.Type: GrantFiled: December 10, 2007Date of Patent: March 26, 2013Assignee: NHN Business Platform CorporationInventors: Ki Ho Song, Byoung Hak Kim, Min uk Kim, Tae Yeong Kwak
-
Patent number: 8407216Abstract: Embodiments of the present invention provide systems and methods for automatically generating tag terms (or tags) for objects in databases of a web site. The metadata of the objects (or data) of the web site are processed and parsed to automatically generate tag terms for the corresponding objects. Information (or data, or content) downloaded from the Internet often comes with metadata, which can exist in titles, description, sources, and authors of the information, etc. The metadata of downloaded information can be processed and parsed to generate tag terms for the corresponding objects. The system can automatically generate tag terms for the data, which are stored as objects in the databases, and make the data (or objects) searchable. In addition, the automatically generated tag terms allow associated data to maintain their relationship. For example, data from the same sources, same authors, or same subjects can be identified based on the common tag terms.Type: GrantFiled: September 25, 2008Date of Patent: March 26, 2013Assignee: Yahoo! Inc.Inventors: Hubert M. Walker, Noel C. Morrison, Timothy Caplis, Scott Bedard, Ankarino S. Lara, Stephen James Blake
-
Patent number: 8402035Abstract: Exemplary embodiments are directed to determining a media value associated with mentions of an entity in one or more documents based on a sentiment attributed to the mentions of the entity and/or a frequency with which the entity is mentioned. Exemplary embodiments can include a media value engine that can identify mentions of an entity in documents, attribute sentiment to the mentions of the entity, determine a polarity of the sentiment, and calculate a media value attributed to the entity based on the sentiment.Type: GrantFiled: March 14, 2011Date of Patent: March 19, 2013Assignee: General Sentiment, Inc.Inventors: Greg Artzt, Mark Fasciano, Steve Skiena, Levon Lloyd
-
Patent number: 8402032Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for correcting entity names. One method includes receiving texts and deriving a plurality of name-context pairs from the texts. The method further includes calculating a context consistency measure for each name-context pair and storing context-entity name data representing the name-context pairs. Another method includes identifying an entity name and one or more context terms from a query and generating candidate names for the entity name. The method further includes determining a score for each of the candidate names, selecting a number of top scoring candidate names, and using the selected candidate names to respond to the query.Type: GrantFiled: March 24, 2011Date of Patent: March 19, 2013Assignee: Google Inc.Inventors: Lawrence J. Brunsman, Matthieu Devin, Uri N. Lerner, Simon Tong
-
Using a dynamically-generated content-level newsworthiness rating to provide content recommendations
Patent number: 8402034Abstract: A method for providing content-level data artifact recommendations can begin with the creation of a semantic library from the textual content of data artifacts by a newsworthy content recommendation engine. A base newsworthiness rating can be calculated using global newsworthiness parameters and behavioral functions that model newsworthy influences for each relationship contained in the semantic library. A user-specific search network can be generated that represents user-entered criteria and/or user task-related criteria. Within the semantic library, potential newsworthy semantic networks can be identified. Newsworthy content from each identified potential newsworthy semantic network can be dynamically determined based upon the base newsworthiness rating and a predefined newsworthiness threshold. The newsworthy content from the identified potential newsworthy semantic network can be related to the user-specific search network at the common node, creating a newsworthy content recommendation graph.Type: GrantFiled: March 2, 2012Date of Patent: March 19, 2013Assignee: International Business Machines CorporationInventors: Daniel John McCloskey, Marcello Trovati, Carol Sue Zimmet
-
Patent number: 8402036Abstract: Disclosed herein is a method, a system and a computer product for generating a snippet for an entity, wherein each snippet comprises a plurality of sentiments about the entity. One or more textual reviews associated with the entity is selected. A plurality of sentiment phrases are identified based on the one or more textual reviews, wherein each sentiment phrase comprises a sentiment about the entity. One or more sentiment phrases from the plurality of sentiment phrases are selected to generate a snippet.Type: GrantFiled: June 24, 2011Date of Patent: March 19, 2013Assignee: Google Inc.Inventors: Sasha Blair-Goldensohn, Kerry Hannan, Ryan T. McDonald, Tyler Neylon, Jeffrey C. Reynar
-
Patent number: 8402022Abstract: Tools and techniques for converging terms within a collaborative tagging environment are described herein. Methods for converging divergent contributions to the collaborative tagging environment may include receiving respective contributions from users within the environment. The methods may identify at least some of the contributions as divergent, and enable the users to converge the divergent contributions.Type: GrantFiled: September 29, 2006Date of Patent: March 19, 2013Inventors: Martin R. Frank, Walter Manching Tseng
-
Patent number: 8396879Abstract: One or more server devices may simultaneously calculate first ranking scores for a group of users and second ranking scores for a group of comments authored by the group of users. The calculating may occur during a same process. The one or more server devices may further provide one of a first ranked list that includes information identifying the group of users, the information identifying the group of users being ordered based on the first ranking scores, or a second group of comments of the group of comments, the comments in the second group of comments being ordered based on the second ranking scores.Type: GrantFiled: February 28, 2012Date of Patent: March 12, 2013Assignee: Google Inc.Inventors: Michal Cierniak, Na Tang
-
Patent number: 8392398Abstract: A method for executing a query on a graph data stream. The graph stream comprises data representing edges that connect vertices of a graph. The method comprises constructing a plurality of synopsis data structures based on at least a subset of the graph data stream. Each vertex connected to an edge represented within the subset of the graph data stream is assigned to a synopsis data structure such that each synopsis data structure represents a corresponding section of the graph. The method further comprises mapping each received edge represented within the graph data stream onto the synopsis data structure which corresponds to the section of the graph which includes that edge, and using the plurality of synopsis data structures to execute the query on the graph data stream.Type: GrantFiled: July 29, 2009Date of Patent: March 5, 2013Assignee: International Business Machines CorporationInventors: Charu C. Aggarwal, Min Wang, Peixiang Zhao
-
Patent number: 8380718Abstract: A system and method for grouping similar documents is provided. Frequencies of occurrences are determined for terms and noun phrases within a set of documents. A subset of the documents is selected by removing those documents having terms and noun phrases that fall outside a bounded range of upper and lower conditions for frequency of occurrence. Each of the documents in the subset is mapped to a cluster of documents based on a similarity of the documents to the cluster documents.Type: GrantFiled: September 2, 2011Date of Patent: February 19, 2013Assignee: FTI Technology LLCInventors: Dan Gallivan, Kenji Kawai
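A rough sketch of the frequency-bounded selection and cluster mapping described in this abstract follows. It is not the patented method: the abstract does not define how frequency is measured, how the bounds are applied, or which similarity is used. This sketch assumes document frequency, keeps only terms whose frequency lies within `[lower, upper]`, drops documents left with no such terms, and assigns each remaining document to the most similar existing cluster by Jaccard similarity. All function names, parameters, and defaults are hypothetical.

```python
from collections import Counter


def tokenize(text):
    return text.lower().split()


def in_range_terms(docs, lower, upper):
    """Terms whose document frequency lies within [lower, upper]."""
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term for term, n in df.items() if lower <= n <= upper}


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(docs, lower=2, upper=3, threshold=0.3):
    """Return clusters as lists of document indices."""
    keep = in_range_terms(docs, lower, upper)
    clusters = []  # each entry: (set of cluster terms, list of doc indices)
    for i, doc in enumerate(docs):
        terms = set(tokenize(doc)) & keep
        if not terms:
            continue  # document has no in-range terms: removed from the subset
        best = max(clusters, key=lambda c: jaccard(terms, c[0]), default=None)
        if best is not None and jaccard(terms, best[0]) >= threshold:
            best[0].update(terms)
            best[1].append(i)
        else:
            clusters.append((set(terms), [i]))
    return [members for _, members in clusters]
```

Dropping both very rare and very common terms before comparing documents is the point of the bounded range: rare terms add noise, and ubiquitous terms make every document look alike.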
-
Publication number: 20130041906Abstract: A privacy-preserving system and method is disclosed for profiling clients within a system for knowledge management. The method of the present invention discloses steps for generating a client profile in support of receiving and processing messages using scoring techniques and/or filtering techniques. The method of the present invention further includes steps for generating a client profile in support of a method for generating and obtaining responses to messages using scoring techniques and/or filtering techniques. The system of the present invention includes all means for implementing the method.Type: ApplicationFiled: August 7, 2012Publication date: February 14, 2013Inventors: Eytan Adar, Rajan Mathew Lukose, Joshua Rogers Tyler, Caesar Sengupta
-
Patent number: 8375036Abstract: Methods, systems, and apparatus, including computer program products are provided for ranking distinct book content items based on implicit links to other distinct book content items. The implicit links are defined based on the identification of matching features in the distinct book content items. In some implementations, the matching features are uncommon phrases in textual content of the distinct book content items. Edges representing implicit links are generated between distinct nodes representing distinct book content items in a weighted graph. Search results for distinct book content items can be ordered based on the edges connected to the distinct nodes in the weighted graph that represent the distinct book content items.Type: GrantFiled: November 17, 2011Date of Patent: February 12, 2013Assignee: Google Inc.Inventors: Shumeet Baluja, Yushi Jing
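The implicit-link idea in this abstract can be sketched as follows, under stated assumptions. The abstract does not define "uncommon phrase" or how edges influence ordering, so this sketch treats word bigrams as phrases, calls a phrase uncommon when it appears in at least two but at most `max_books` books, weights an edge by the number of shared uncommon phrases, and orders books by the total weight of their edges. All names and thresholds are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def bigrams(text):
    """Phrases (assumed): the set of adjacent word pairs in the text."""
    words = text.lower().split()
    return {" ".join(pair) for pair in zip(words, words[1:])}


def implicit_link_graph(books, max_books=2):
    """Map (title_a, title_b) -> number of shared uncommon phrases."""
    owners = defaultdict(set)
    for title, text in books.items():
        for phrase in bigrams(text):
            owners[phrase].add(title)
    edges = defaultdict(int)
    for titles in owners.values():
        # Shared by at least two books, but rare enough to be distinctive.
        if 2 <= len(titles) <= max_books:
            for a, b in combinations(sorted(titles), 2):
                edges[(a, b)] += 1
    return dict(edges)


def rank_books(books, edges):
    """Order titles by the total weight of edges touching each node."""
    score = {title: 0 for title in books}
    for (a, b), weight in edges.items():
        score[a] += weight
        score[b] += weight
    return sorted(books, key=lambda title: score[title], reverse=True)
```

A production system would likely propagate scores through the graph rather than sum direct edge weights; the sum is used here only to keep the example short.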
-
Patent number: 8368918Abstract: Methods and apparatus to identify images in print advertisements are disclosed. An example method comprises computing a first image feature vector for a first presented image, comparing the first image feature vector to a second image feature vector, and when the first image feature vector matches the second image feature vector, storing printed-media information associated with the first presented image in a database record associated with the second image feature vector.Type: GrantFiled: September 14, 2007Date of Patent: February 5, 2013Assignee: The Nielsen Company (US), LLCInventors: Kevin Deng, Alan Nguyen Bosworth
-
Patent number: 8370366Abstract: Embodiments of systems and methods for comparing attributes of a data record are presented herein. Broadly speaking, embodiments of the present invention generate a weight based on a comparison of the name (or other) attributes of data records. More particularly, embodiments of the present invention generate a weight based on a comparison of name attributes. More specifically, embodiments of the present invention may calculate an information score for each of two name attributes to be compared to get an average information score for the two name attributes. The two name attributes may then be compared against one another to generate a weight between the two attributes. This weight can then be normalized to generate a final weight between the two business name attributes.Type: GrantFiled: January 14, 2010Date of Patent: February 5, 2013Assignee: International Business Machines CorporationInventors: Norm Adams, Scott Ellard, Scott Schumacher
-
Patent number: 8370347Abstract: A system is described for assessing information in natural language contents. A user interface receives an object name as a query term and a value for a customized ranking parameter from a user. A computer storage device stores an object-specific data set related to the object name, wherein the object-specific data set includes a plurality of property names and association-strength values. A computer processing system can count a first frequency of a first property name and count a second frequency of a second property name in a document containing text in a natural language, calculate a relevance score as a function of the first frequency and the second frequency, and rank the plurality of documents using their respective relevance scores, and return one or more documents to the user based on the ranking of the plurality of documents. The function is in part defined by the customized ranking parameter.Type: GrantFiled: February 17, 2012Date of Patent: February 5, 2013Inventor: Guangsheng Zhang
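To make the scoring step in this abstract concrete, here is a minimal sketch. The abstract states only that the relevance score is a function of two property-name frequencies and is partly defined by a user-supplied ranking parameter; the specific formula below (a weighted blend of frequency times association strength) is an assumption chosen for illustration, and the function names are hypothetical.

```python
def count_phrase(text, phrase):
    """Count non-overlapping, case-insensitive substring occurrences.

    Note: substring matching also counts "cpu" inside "cpus".
    """
    return text.lower().count(phrase.lower())


def relevance(text, props, ranking_param):
    """Blend two property frequencies (assumed formula).

    props: two (property_name, association_strength) pairs.
    ranking_param: value in [0, 1]; higher values favor the first property.
    """
    (name1, strength1), (name2, strength2) = props
    f1 = count_phrase(text, name1)
    f2 = count_phrase(text, name2)
    return ranking_param * f1 * strength1 + (1 - ranking_param) * f2 * strength2


def rank(docs, props, ranking_param):
    """Return documents ordered from most to least relevant."""
    return sorted(docs, key=lambda d: relevance(d, props, ranking_param), reverse=True)
```

Changing `ranking_param` can reorder the same documents, which is the role the abstract assigns to the customized ranking parameter.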
-
Patent number: 8364706Abstract: A system and a method of retrieving information are described. In a system according to the invention, software modules may be used to provide the user with information that is most likely to be the information desired.Type: GrantFiled: June 18, 2004Date of Patent: January 29, 2013Assignee: ZI Corporation of Canada, Inc.Inventor: Todd Garrett Simpson
-
Publication number: 20130007020Abstract: An exemplary embodiment of the present techniques extracts concepts and relationships from a text. Concepts may be generated from the text using singular value decomposition, and ranked based on a term weight and a distance metric. The concepts that are ranked above a particular threshold may be iteratively extracted, and the concepts may be merged to form larger concepts until the generation of concepts has stabilized. Relationships may be generated based on the concepts using singular value decomposition, then ranked based on various metrics. The relationships that are ranked above a particular threshold may be extracted.Type: ApplicationFiled: June 30, 2011Publication date: January 3, 2013Inventors: Sujoy Basu, Sharad Singhal