TARGETED E-COMMERCE SYSTEM
A system for displaying products for purchase on any document displayed on any computer device, comprising conducting a statistical frequency analysis of the word occurrence in the document to determine the primary subject matter of the document and/or keywords in the document, selecting products which are relevant to the subject matter and keywords of the document, either by manual selection or by automatic selection. Automatic selection is accomplished by conducting a statistical frequency analysis of the word occurrence in the product descriptions to determine the keywords in the product descriptions and matching the keywords from the document with the keywords from the product descriptions.
This application is a continuation application of Ser. No. 09/738,471, filed Dec. 15, 2000, which is a continuation-in-part of Ser. No. 09/703,006, filed Oct. 31, 2000, now U.S. Pat. No. 6,546,386, which is a continuation-in-part of Ser. No. 09/630,227, filed Aug. 8, 2000, now abandoned.
FIELD OF INVENTION
The invention relates to a system for providing a list of related e-commerce products for any document appearing on a computer device.
BACKGROUND OF THE INVENTION
The Internet delivers trillions of words to billions of screens. The Net contains an enormous amount of material. Systems exist which survey web sites to determine the traffic (hits) that takes place for a given web page. Some systems exist which will suggest products to a user based upon the user's selection of a particular web page or a particular topic. No system exists, however, to analyze survey statistics in order to select and suggest e-commerce products that would be of interest to those persons selecting particular content and portions of that content, and to suggest highly related products that can be purchased by the user. Such a system would be highly beneficial in creating a market for those products among persons who are already recognized to be interested in a particular subject.
OBJECTS OF THE INVENTION
It is an object of this invention to determine the interests of users of the Internet and suggest e-commerce products for purchase, relevant to the user's interests.
It is a further object of the invention to provide a system to determine highly relevant products to suggest to Internet users based upon their interest in or queries based upon any content, such as magazine articles, news stories or any other text.
SUMMARY OF INVENTION
The targeted e-commerce system of this invention uses a product database and an automated keyword tagging system to determine products that are best suited for display on a given content page. As the content is being prepared for publishing on a web site, it is tagged with one or more preselected keywords either manually or using an auto metatagging process as described in the parent applications on “Hotwording” and “Brilliant Queries”. These keywords are submitted as input to a product database, which selects one or more products for display on the page. The products are displayed as text entries, as titled thumbnails, or by other conventional techniques for product merchandising on electronic devices. The text entry, thumbnail, or other method is essentially a link that leads to a product display page on the site or on a separate retail site.
The product database contains fields that describe the products along with their associated keywords. These fields might include the internal product id, the product url, the product price, the product title, the sale price (if any), and the full product description.
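As an illustration only, a record with the fields listed above could be sketched as follows. The class name, field names, sample values, and the `effective_price` helper are assumptions made for this example; the source does not specify a schema.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Product:
    """One row of a hypothetical product database."""
    product_id: str                      # internal product id
    url: str                             # product url
    title: str                           # product title
    price: float                         # regular product price
    description: str                     # full product description
    sale_price: Optional[float] = None   # sale price, if any
    keywords: Set[str] = field(default_factory=set)  # associated keywords

    def effective_price(self) -> float:
        # Use the sale price when one is set, otherwise the regular price.
        return self.sale_price if self.sale_price is not None else self.price


# Sample record with made-up values.
jersey = Product(
    product_id="P-001",
    url="https://shop.example.com/p/001",
    title="Basketball jersey",
    price=59.99,
    description="Replica basketball jersey.",
    sale_price=44.99,
    keywords={"basketball", "clothing"},
)
```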
Products are associated with one or more keywords, which are manually or automatically selected. An example of keywords associated with, for instance, “Basketball”, would be “clothing” or “celebrity”. All product keywords are listed in a product keyword table from the content selection, to limit the product display to only products associated with specific keywords or multiple keyword clusters.
Product selection can be done statically for simple publishing content or the product selection can be performed on a server dynamically, to make products instantly available for display on the web site.
With the static publishing model, the product selection is done at the time the content is submitted for publishing and the chosen products are embedded in the page prior to upload to the server. These pages can be reprocessed and uploaded to the server repeatedly as the product database changes.
The dynamic publishing model relies on a product database that is running live on the server and on gateway interface technology to query the product database and generate a product selection on a query-by-query basis. This model allows for a product selection that is instantly updated on every content page as the product database is updated. If the product database has been enhanced via the above-described system of metatagging, then the match can be undertaken via sophisticated statistical matching techniques. If the product database is maintained without such enhancement, but is capable of responding to Boolean ANDed queries, then the matching can be accomplished through the automated submission of Boolean ANDed queries derived from the metatagging process, as described in the “Brilliant Query” U.S. Pat. No. 6,546,386.
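A minimal sketch of a Boolean ANDed query against an unenhanced product database might look like the following. It assumes each product is represented only by a plain-text description and that a term matches when it appears as a whole word; the function name `and_query` and the sample data are hypothetical.

```python
def and_query(terms, products):
    """Return ids of products whose description contains every query term."""
    wanted = [t.lower() for t in terms]
    hits = []
    for product_id, text in products.items():
        words = set(text.lower().split())
        # Boolean AND: every term must be present in the description.
        if all(term in words for term in wanted):
            hits.append(product_id)
    return hits


# Hypothetical product descriptions keyed by product id.
products = {
    "P-001": "basketball jersey clothing",
    "P-002": "basketball celebrity biography book",
    "P-003": "tennis racket",
}

# Query derived from a hook (BASKETBALL) and a keyword (CLOTHING).
result = and_query(["BASKETBALL", "CLOTHING"], products)
```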
Traffic survey data of web content may be done on several levels. These include:
Content selection, the general nature of the content.
Use of “Hotwording” as described in the parent application, which would provide data on both the content and keywording.
Use of “Brilliant Query” as described in the parent application, which would provide data based on a “hook” and a keyword.
Data from a targeted e-commerce site to which metatagging or keywords have been added with the methodologies described in this application, and its precursors.
The data can be analyzed by researchers looking at the word frequency analysis of the content and keywords chosen and then choosing products that are relevant to the subject matter. Products can also be chosen by adding metatagging or keywording to an inventory of products, and matching the content keywording to the product keywording.
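The matching of content keywording to product keywording can be illustrated with a simple overlap count. This is only one possible scoring rule, chosen here for brevity; the function name `match_products` and the sample keyword sets are assumptions.

```python
def match_products(content_keywords, product_keywords):
    """Score each product by the number of keywords it shares with the content."""
    content = set(content_keywords)
    scored = []
    for product_id, keywords in product_keywords.items():
        shared = content & set(keywords)
        if shared:
            scored.append((product_id, len(shared)))
    # Most shared keywords first; ties broken by product id for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


# Hypothetical keyword sets attached to products.
product_keywords = {
    "P-001": {"basketball", "clothing"},
    "P-002": {"basketball", "celebrity"},
    "P-003": {"tennis"},
}

ranked = match_products({"basketball", "clothing", "fashion"}, product_keywords)
```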
As discussed in the “Brilliant Query” application, more definitive results will be obtained if keyword clusters, such as a doublet (a two-word combination) or a triplet (a three-word combination), are analyzed, as they are far more revealing than a single word or a general content category.
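One way to tally doublets and triplets from survey data is sketched below. It assumes a cluster is an unordered combination of keywords attached to the same content selection, which is an interpretation rather than something the source states; the function name and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cluster_counts(selections, size):
    """Count how often each keyword combination of the given size occurs."""
    counts = Counter()
    for keywords in selections:
        # Sort so that the same combination always yields the same tuple.
        for combo in combinations(sorted(set(keywords)), size):
            counts[combo] += 1
    return counts


# Hypothetical survey data: keywords attached to each selected content item.
selections = [
    ["basketball", "clothing"],
    ["basketball", "clothing", "fashion"],
    ["basketball", "celebrity"],
]

doublets = cluster_counts(selections, 2)
triplets = cluster_counts(selections, 3)
```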
There will be cases where the number of products displayed on a given page will need to be limited. When the number of products matching the keywords for a given page exceeds this limit, there are one or more criteria for determining the products that are displayed. First, all products that are associated with all keywords are selected as a set. The product list is then ordered by one or more of the following optional criteria:
Date of product listing (how new is the product).
Product sale status (is there a special sale on the product?).
Product Price (more expensive or less expensive).
Keyword weighting (a product that matches the keyword that most appears in the content will be displayed before a product that matches a lower frequency keyword).
The list is then limited to a prespecified number of products based on the site's design preferences.
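The select, order, and limit steps above can be sketched as follows. The ordering used here (products on sale first, then newer listings, then lower price) is just one possible combination of the optional criteria; keyword weighting is omitted for brevity, and the dictionary layout and sample data are assumptions.

```python
from datetime import date


def select_products(products, page_keywords, limit):
    """Pick products matching all page keywords, order them, and cap the list."""
    wanted = set(page_keywords)
    # Step 1: keep only products associated with every page keyword.
    matches = [p for p in products if wanted <= p["keywords"]]
    # Step 2: on-sale products first, then newest listing, then lowest price.
    matches.sort(key=lambda p: (not p["on_sale"], -p["listed"].toordinal(), p["price"]))
    # Step 3: limit to the number of slots the page design allows.
    return matches[:limit]


# Hypothetical catalog entries.
catalog = [
    {"id": "A", "keywords": {"basketball", "clothing"}, "on_sale": False,
     "listed": date(2000, 10, 1), "price": 60.0},
    {"id": "B", "keywords": {"basketball", "clothing", "fashion"}, "on_sale": True,
     "listed": date(2000, 8, 1), "price": 45.0},
    {"id": "C", "keywords": {"basketball"}, "on_sale": True,
     "listed": date(2000, 12, 1), "price": 20.0},
    {"id": "D", "keywords": {"basketball", "clothing"}, "on_sale": False,
     "listed": date(2000, 11, 15), "price": 80.0},
]

shown = [p["id"] for p in select_products(catalog, ["basketball", "clothing"], 2)]
```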
As stated in the parent application, Brilliant queries are a collection of one or more pairings of a hook and a keyword. For example, an article on Basketball might have the following Brilliant queries:
Search for more information on BASKETBALL and CELEBRITY
Search for more information on BASKETBALL and CLOTHING
Search for more information on BASKETBALL and FASHION
Search for more information on BASKETBALL and COMMERCE
The hook is BASKETBALL and the keywords are CELEBRITY, CLOTHING, FASHION and COMMERCE.
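The pairing of one hook with several keywords can be reproduced with a short helper. The function name `brilliant_queries` is hypothetical; the query wording is taken from the example above.

```python
def brilliant_queries(hook, keywords):
    """Build one query string per (hook, keyword) pairing."""
    return [
        f"Search for more information on {hook.upper()} and {keyword.upper()}"
        for keyword in keywords
    ]


queries = brilliant_queries(
    "basketball", ["celebrity", "clothing", "fashion", "commerce"]
)
```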
Metatagged keywords are simply a collection of words, generated automatically or manually, that are deemed to be indicative of the topic matter or one of the topics for a given content selection. Metatagged keywords are determined by comparison of a pre-determined list of keywords to the text of the content selection. If the content selection contains one or more of the keywords, or an appropriate synonym, then that keyword is associated with that text body and potentially used for the Brilliant Query. Keywords may also be determined by statistical word frequency analysis of the text, with or without manual selection and addition of synonyms.
Keywords are generated by automatic or manual statistical and empirical analysis of the body of content to be enhanced or of a comparable body of content. The keyword list for a given content source is generated through the use of word frequency analysis, stopword removal and, finally, manual selection using empirical testing of the results generated by a given potential keyword. Based on experience, a solid keyword list usually runs between 250 and 1000 words and phrases, which are chosen by the system designer.
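The automatic part of this process, word frequency analysis followed by stopword removal, might be sketched as below. The stopword list, the tokenization rule (lowercase alphabetic runs), and the sample text are assumptions; the manual selection step is not modeled.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "is", "for", "with"}


def candidate_keywords(text, top_n):
    """Return the top_n most frequent non-stopword words with their counts."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)


text = (
    "The basketball star wore the new basketball shoes and the fans "
    "loved the shoes and the basketball game."
)
candidates = candidate_keywords(text, 2)
```

The resulting candidates would then be reviewed manually, as the paragraph above describes.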
Also, keywords can be manually tuned through the use of a thesaurus feature whereby a given keyword can be associated with one or more synonyms that would indicate the use of the keyword whenever one or more of the synonyms appear in the body of text to be enhanced. This process is applied to both the content and the product database.
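A thesaurus of this kind can be sketched as a mapping from each keyword to its synonyms, with a keyword being applied whenever it or any of its synonyms appears in the text. The thesaurus entries and the function name `tag_keywords` are hypothetical.

```python
import re

# Hypothetical thesaurus: keyword -> synonyms that also indicate the keyword.
THESAURUS = {
    "clothing": {"apparel", "jersey", "sneakers"},
    "celebrity": {"star", "superstar"},
    "commerce": {"retail", "sales"},
}


def tag_keywords(text, thesaurus):
    """Return the keywords indicated by the text, directly or via a synonym."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    tags = set()
    for keyword, synonyms in thesaurus.items():
        if keyword in words or words & synonyms:
            tags.add(keyword)
    return tags


tags = tag_keywords("The star signed a new apparel deal.", THESAURUS)
```

The same function could be run over product descriptions, since the process is applied to both the content and the product database.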
As further illustrated,
Thus, when the consumer/reader selects a particular article and a particular query from the text, a list of products available for purchase appears on the screen, the products being highly related to the query. Each of the products listed may contain a link to a retail purchase site.
Claims
1. An electronic method of including additional content to a web document, the method comprising:
- electronically determining content of the web document; and
- selecting advertising content associated with the detected content.
2. The method of claim 1 further comprising:
- selecting at least one of the advertising content associated with the detected content; and
- generating the web document to include the at least one advertising content therewith.
3. The method of claim 1, wherein electronically determining content of the web document includes analyzing the web document to detect a context for the web document.
4. The method of claim 3, wherein detecting the context includes using a statistical analysis of the web document.
5. The method of claim 3, wherein the web document is a web page.
6. The method of claim 1, wherein selecting the advertising content includes selecting at least one file representing at least one product related to the determined content.
7. The method of claim 1, wherein the content of the web document is determined based on a plurality of keywords.
8. The method of claim 7, wherein the keywords are preselected.
9. The method of claim 7, wherein determining the content of the web document includes tagging the web document with the keywords.
10. The method of claim 9, wherein the tagging of keywords is done manually.
11. The method of claim 9, wherein the tagging of keywords is performed using an automated tagging process.
12. The method of claim 7 wherein selecting the advertising content includes selecting at least one file representing at least one product related to the determined content, the method further comprising:
- accessing a product database using at least one of the keywords.
13. The method of claim 7 wherein the content of the web document is determined based on a keyword cluster, where a keyword cluster includes two or more keywords grouped together.
14. The method of claim 1 further comprising:
- determining a limit on the number of advertising content usable with the detected content of the web document.
15. The method of claim 14 further comprising:
- determining the limit based on at least one of: date of a product listing; a product sale status; and a product price.
16. The method of claim 14 further comprising:
- determining the limit based on a keyword weighting.
17. The method of claim 1, wherein the content of the web document is determined based on a plurality of keywords, the plurality of keywords being from a keyword list, the keyword list generated based on word frequency analysis, stopword removal and manual selection using empirical testing.
18. The method of claim 17 further comprising:
- tuning the keyword list based on the use of a plurality of synonyms.
19. The method of claim 1, wherein the advertising content includes an active link in the form of a uniform resource locator to a product.
20. The method of claim 1, wherein the content of the web document includes the context of a web document.
Type: Application
Filed: Mar 12, 2010
Publication Date: Aug 19, 2010
Applicant: Yahoo! Inc. (Sunnyvale, CA)
Inventors: Peter M. Black (Pacific Palisades, CA), Anthony Bryan Waters (Rowlett, TX)
Application Number: 12/723,071
International Classification: G06Q 30/00 (20060101); G06F 17/00 (20060101); G06F 17/30 (20060101);