Abstract: Methods of quantifying a reputation for a data source are presented. Historical documents that contain opinions and are attributed to a data source are identified. The opinions preferably are quantifiable and can be converted into predictions. As the predictions are verified, the data source is assigned one or more prediction scores indicating the accuracy of the predictions. A reputation score for a new document having a new prediction can be assigned to the data source as a function of the prediction scores from the historical documents, data source affiliations, document topics, or other parameters. The reputation score relating to the new document can be presented to a user via a computer interface as a single value, or as multiple values corresponding to different topics.
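The scoring described above can be illustrated with a minimal sketch. It assumes each verified prediction is recorded as a (topic, was_correct) pair and that a prediction score is simply the fraction of correct predictions; the function names and the fallback to an overall accuracy are hypothetical choices for illustration, not the method defined by the abstract, and affiliations and other parameters are left out.

```python
from collections import defaultdict

def prediction_scores(history):
    """Return per-topic accuracy for a list of (topic, was_correct) pairs.

    Assumption: a prediction score is the fraction of verified
    predictions that turned out to be correct.
    """
    counts = defaultdict(lambda: [0, 0])  # topic -> [correct, total]
    for topic, was_correct in history:
        counts[topic][0] += int(was_correct)
        counts[topic][1] += 1
    return {topic: correct / total for topic, (correct, total) in counts.items()}

def reputation_score(history, topic=None):
    """Return a single-value reputation for a new document.

    Assumption: if the new document's topic has a track record, use that
    topic's accuracy; otherwise fall back to accuracy over all topics.
    Returns None when there is no verified history at all.
    """
    if not history:
        return None
    scores = prediction_scores(history)
    if topic is not None and topic in scores:
        return scores[topic]
    correct = sum(1 for _, was_correct in history if was_correct)
    return correct / len(history)

# Hypothetical track record for one data source.
history = [("tech", True), ("tech", False), ("energy", True), ("energy", True)]
per_topic = prediction_scores(history)           # multiple values, one per topic
overall = reputation_score(history)              # single value
energy_only = reputation_score(history, "energy")
```

Returning both the per-topic dictionary and a single value mirrors the two presentation options the abstract mentions.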
Abstract: Methods of identifying web documents that relate to a market domain, but that would ordinarily be considered unrelated to it, are presented. Market domain criteria can be defined that provide for classifying web documents as being related to the domain. The documents classified as related to the market domain form a training sample used to establish correlations among combinations of brand terms found within the documents. If correlations are established among the terms in a combination, the combination can be assigned a similarity score indicating how similar the terms are considered to be. The term combinations can be used to search for additional web documents that could pertain to the market domain but would otherwise fail to satisfy the market domain criteria. The search results can be presented via a computer interface according to similarity scores.
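One way to picture the similarity scoring is the sketch below. It assumes the training sample is a list of document strings and uses the Jaccard index over document co-occurrence as a stand-in for the correlation measure, which the abstract does not specify; the function name, the sample documents, and the brand terms are all hypothetical.

```python
from itertools import combinations

def term_pair_similarity(documents, brand_terms):
    """Score each pair of brand terms by how often they co-occur.

    Assumption: similarity is the Jaccard index of the sets of training
    documents mentioning each term (case-insensitive substring match).
    """
    doc_sets = {
        term: {i for i, doc in enumerate(documents) if term in doc.lower()}
        for term in brand_terms
    }
    scores = {}
    for a, b in combinations(sorted(brand_terms), 2):
        union = doc_sets[a] | doc_sets[b]
        if union:  # skip pairs where neither term appears anywhere
            scores[(a, b)] = len(doc_sets[a] & doc_sets[b]) / len(union)
    return scores

# Hypothetical training sample already classified as in-domain.
training_docs = [
    "Acme launches new widget",
    "Acme and Globex compete",
    "Globex widget review",
    "Acme widget sale",
]
pair_scores = term_pair_similarity(training_docs, ["acme", "globex", "widget"])

# Highest-scoring combinations would be tried first as search queries.
ranked_pairs = sorted(pair_scores, key=pair_scores.get, reverse=True)
```

Ranking the pairs by score gives an ordering that could also be reused when presenting search results.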
Abstract: Methods for providing marketing analytics are presented. Information about a brand is extracted from web documents using a search program. The search program learns how a brand is referenced from the context of one or more web documents having quality, quantity, or entity brand characteristics. After learning about the brand, the program extracts information from additional web documents, especially those having the quality, quantity, and entity characteristics. As the program analyzes the documents, it stores the extracted information in a database to build a statistically significant data set.
Abstract: Methods for deriving a brand sentiment are presented. Phrases and ratings associated with the brand are stored in a database. The phrases are analyzed and compared to each other and to the ratings to derive the statistical significance of a phrase's usage relative to other phrases. A sentiment score is derived from the statistical significance.
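As a rough illustration of deriving a score from statistical significance, the sketch below assumes the stored data is a list of (phrase, rating) records and uses a z-score of each phrase's mean rating against the overall mean. The abstract does not name a specific statistic, so the z-score, the function name, and the sample records are assumptions.

```python
import math
import statistics
from collections import defaultdict

def phrase_sentiment_scores(records):
    """Return a z-score per phrase from (phrase, rating) records.

    Assumption: a phrase's sentiment score is how far the mean rating of
    records using that phrase lies from the overall mean rating, measured
    in standard errors. Positive means better than average.
    """
    ratings = [rating for _, rating in records]
    overall_mean = statistics.fmean(ratings)
    overall_std = statistics.pstdev(ratings)

    by_phrase = defaultdict(list)
    for phrase, rating in records:
        by_phrase[phrase].append(rating)

    scores = {}
    for phrase, phrase_ratings in by_phrase.items():
        if overall_std == 0:
            scores[phrase] = 0.0  # all ratings identical, nothing to distinguish
            continue
        standard_error = overall_std / math.sqrt(len(phrase_ratings))
        scores[phrase] = (statistics.fmean(phrase_ratings) - overall_mean) / standard_error
    return scores

# Hypothetical phrase and rating records for one brand.
records = [
    ("great value", 5),
    ("great value", 4),
    ("poor support", 1),
    ("poor support", 2),
]
sentiment = phrase_sentiment_scores(records)
```

With this sample, "great value" receives a positive score and "poor support" a negative score of equal magnitude, since the two phrases sit symmetrically around the overall mean rating.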