Incentive Selection of Region-of-Interest and Advertisements for Image Advertising
Techniques for image selection and region of interest analysis are described herein. A pair of two or more users is configured, and an image is displayed to the pair. The image can be a still image (i.e., a picture) or a moving image (i.e., video). In some instances, a plurality of advertisements is suggested for possible association with the image. Input is received from both users in the pair, indicating a positive or a negative association between each advertisement and the image. When the pair positively rates an advertisement, the advertisement is associated with the image. A plurality of regions of interest within the image may be suggested. In response, positive or negative input is received from the pair indicating whether each of the plurality of regions of interest is appropriately suggested for placement of an advertisement.
Traditional advertising may provide images or video of a product with a background of accompanying images or video. Newer advertising may provide “product placement” directly into video and still images provided to a consumer as entertainment. In both types of advertising, a question arises as to the most advantageous images and/or video into which to place images of the product, advertisement and/or logo of a corporate sponsor. Selection of an advantageous image or video could enhance the success of such advertisements, while a poorly selected image or video could render the advertisements useless. Unfortunately, little is known about how to select an image or video that is appropriate for use with a product, advertisement or corporate logo. Moreover, even less is known about how to categorize large numbers of such images or videos to indicate whether they are appropriate for use with known products, advertisements or corporate logos.
Moreover, the precise location at which to advantageously place a product, advertisement or logo within an image or video is not well understood. If the placement is selected correctly, a viewer's attention will be drawn to the product, advertisement or corporate logo. However, if the placement is selected incorrectly, the viewer may not notice the advertisement because other aspects of the image or video absorb the viewer's attention. Because each image and each video is potentially unique, and because large numbers of products, advertisements and corporate logos exist, new technology to locate advertisements within images and video would be welcome.
SUMMARY
Techniques for selecting an advertisement (e.g., a corporate logo) due to its positive association with an image and for identifying regions of interest within the image are described herein. A pair of users is configured, and an image is displayed to the pair. The pair of users may be randomly combined visitors to a same website in some instances. The image can be a still image (i.e., a picture) or a moving image (i.e., video). In some instances, a plurality of advertisements is suggested for possible association with the image. Input is received from both users in the pair. The input can indicate a positive or a negative association between each advertisement and the image. When input from both users is positive with respect to an advertisement, the advertisement may be associated with the image. A plurality of potential regions of interest within the image may be suggested to the pair. In response, positive or negative input is received from each user in the pair indicating whether each of the plurality of regions of interest is appropriately suggested. When the pair agrees that a region of interest (ROI) was appropriately suggested, the image can be annotated to indicate the region of interest.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter. The term “techniques,” for instance, may refer to device(s), system(s), method(s) and/or computer-readable instructions as permitted by the context above and throughout the document.
The detailed description is described with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The same numbers are used throughout the drawings to reference like features and components. Moreover, the figures are intended to illustrate general concepts, and not to indicate required and/or necessary elements.
The disclosure describes techniques for selecting one or more advertisements (e.g., corporate logos) due to their positive association with an image. Thus, an image may be paired with certain advertisements with which it is compatible. By extension, an image may be found and paired with an advertisement of interest. Additionally, techniques are described for identifying regions of interest within an image.
A pair of users may be configured and an image is displayed to the pair. The pair of users may be randomly combined visitors to a same website. The image can be a still image (i.e., a picture) or a moving image (i.e., video). In some instances, a plurality of advertisements is suggested for possible association with the image. Input is received from the pair, wherein each user indicates a positive or a negative association between each advertisement and the image. When the pair agrees that an advertisement is appropriately associated with the image, the pair may be rewarded and the advertisement and image may be associated in metadata or an appropriate data structure.
A plurality of potential regions of interest within the image may be suggested to the pair. In response, positive or negative input is received from the pair indicating whether each of the plurality of regions of interest is appropriately suggested for location of an advertisement. When the pair agrees that a region of interest was appropriately suggested, the image can be annotated (such as in metadata) to indicate the region of interest.
The discussion herein includes several sections. Each section is intended to be non-limiting. Additionally, this entire description is intended to illustrate components which may be utilized in image and advertisement association, and utilized in region of interest analysis, but not components which are necessarily required. The discussion begins with a section entitled “Example Image Selection and Region of Interest Analysis,” which describes techniques by which advertising and images and/or videos may be assessed for relevance and possible association, and how regions of interest may be located in images. Next, a section entitled “Example Architecture” illustrates and describes a high-level architecture of a system configured for image selection and region of interest analysis. A section, entitled “Example Image Relevance in Advertising Context” illustrates and describes techniques, systems and methods by which one or more advertisements may be evaluated for association with one or more images (or the reverse), and by which an image may be found for association with an advertisement of interest. A section, entitled “Example Region of Interest Analysis” illustrates and describes techniques, systems and methods by which proposed regions of interest within images may be evaluated. Finally, the discussion ends with a brief conclusion.
Throughout this discussion, images and image files are largely discussed in a unified manner; i.e., a discussion of “an image” or “a video” implies a discussion of the data and/or data files upon which such images are based. Additionally, use of the terminology “an image” can be indicative of a still image (e.g., a picture) or a motion image (e.g., a video). This brief introduction, including section titles and corresponding summaries, is provided for the reader's convenience and is not intended to limit the scope of the claims, nor the following sections.
Example Image Selection and Region of Interest Analysis
In one example, a first user or player 102 and a second user or player 104 are configured as a pair. The configuration includes associating the users so that their output can be compared and jointly processed. By pairing two users, two sources of input are obtained for each issue and/or question. When input from two sources agrees, such as when both users in a pair answer a question positively, the input may be associated with an enhanced or desired level of veracity or validity. Further, while the following examples describe the pairing of two users, other implementations may group together any other number of users (e.g., one, three, ten, one hundred, etc.). Thus, for purposes of this document, a “pair” should be interpreted as a grouping of any number of users. A single user or player may be “paired” with a composite of past users, in which case input from the single user is compared against a composite of the past users' responses.
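The agreement rule described above, including the case where a single user is paired with a composite of past users, can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `Response` type, the function names, and the choice to build the composite from the most common past answer (with ties treated as negative) are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical type: maps an item id (an advertisement or a region of
# interest) to True for positive input or False for negative input.
Response = Dict[str, bool]


def composite_response(past: List[Response]) -> Response:
    """Build a composite "user" from the most common past answer per item.

    Assumption: a tie between positive and negative counts is treated
    as negative.
    """
    votes: Dict[str, Counter] = {}
    for response in past:
        for item, answer in response.items():
            votes.setdefault(item, Counter())[answer] += 1
    return {item: count[True] > count[False] for item, count in votes.items()}


def agreed_positive(responses: List[Response]) -> List[str]:
    """Return the items that every user in the grouping rated positively."""
    if not responses:
        return []
    # Only items that every user actually answered can be agreed upon.
    items = set(responses[0])
    for response in responses[1:]:
        items &= set(response)
    return sorted(i for i in items if all(r[i] for r in responses))
```

With two live users, `agreed_positive([input_a, input_b])` gives the commonly selected items. With one live user, the second element can be the output of `composite_response` applied to stored past responses.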
The users 102, 104 are presented with an image 106. The image may be associated with one or more tags 108 or other metadata. The users 102, 104 are also presented with a list or plurality 110 of advertisements 112. In the example of
In operation, the two users 102, 104 are invited to express an opinion as to the relevance of the image 106 to the advertisements 112 in the plurality 110 of advertisements. Thus, each user 102, 104 will review the image 106 and the plurality of advertisements 110, and respond positively or negatively to each advertisement. In the example of
Accordingly, there is input 120 from the first user 102 and input 122 from the second user 104. In the example of
The image 202 may be obtained at 204 from
The image 202 is processed to include one or more suggested regions of interest. In the example of
An example of a user-annotated image 216 is seen in
Thus,
The video 302 may be obtained at 304 from
The video 302 is processed to include one or more suggested regions of interest. In the example of
An example of a user-annotated image 314 is seen in
A processor 402 (e.g., a microprocessor, controller, CPU or similar device) is in communication with a memory device 404 over a bus 406 or other communications pathway. An additional memory device 408 may be configured for containment of a large number of files, such as images, videos and/or advertisements. A plurality of players 410 may include the players 102, 104 of
The memory device 404 may include a number of software data structures, programs and objects, etc. While memory device 404 is described as a memory device, all or part of device 404 could be implemented as a processing device, such as an application specific integrated circuit (ASIC). Accordingly, while it is convenient to refer to device 404 as being a memory device, other technologies could give functionally similar results, and are within the scope of this disclosure. An operating system 414 may handle traditional OS functionality, plus any enhancements required for a particular system. One or more programs 416 may reside in the memory 404. A player interface 418 communicates with players 410 and may provide a graphical user interface and associated input and output communications. An advertisement, image and video management program 420 may be used to manage an image library 430, a video library 432 and an advertisement library 434, which may be contained on the memory or disk 408.
The system 400 may include a user input evaluation and storage procedure 422 to evaluate and store the input of the users and/or players 410. An advertisement suggestion procedure 424 may be provided, and may be operable to suggest the advertisements for consideration by paired users (e.g., advertisements 110 considered by users 102, 104, as seen in
An image 502 may be identified from within a library. Referring to
An image 504 may include suggested advertisements. The suggested advertisements may be provided by the advertisement suggestion procedure 424 of
An image 506 may include evaluated advertisements. The evaluation may include data associated with the image by means of metadata, database or other data structure. The evaluated advertisements may have been “evaluated” by receipt of opinions from users, such as users 102 and 104 of
An image 508 may include data (e.g., metadata) indicating the matches obtained by user input. Such matches—indicating that two or more users agreed that the advertisements and images are correctly associated—include greater weight and/or veracity than simple evaluation by a single user. Accordingly, an image 508 with associated match data is more useful in advertising projects, in that a positive association has been made between the image and the advertisement. In one example, the data obtained from user input can be managed by the user input evaluation and storage procedure 422 of
An image 510 may be classified as having match data associated with one or more advertisements. Such an image 510 may be of utility in advertisement for one or more corporations and/or products. In some implementations of the system 400 (of
An image 512 may be classified according to a desired advertisement. However, because a correct position within the image for an advertisement has not been determined, the image 512 may include or be associated with data (e.g., metadata) indicating suggested regions of interest (i.e., locations at which an advertisement may be placed). Images 206 (
An image 514 may include evaluated regions of interest. The evaluation may include data associated with the image by means of metadata, database or other data structure. The evaluated regions of interest may have been “evaluated” by receipt of opinions from users. Accordingly, images 216 (still pictures,
An image 516 may include data (e.g., metadata) indicating the matches obtained by user input. Such matches, which indicate that two or more users agreed that the region(s) of interest within the image were correctly placed, carry greater weight and/or veracity than simple evaluation by a single user. Accordingly, an image 516 with associated match data is more useful in advertising projects, in that regions of interest within the image have been identified.
An image 518 may include data (e.g., metadata) indicating (by calculated user matches) a positive association with one or more advertisers and/or advertisements and (also by calculated user matches) confirmed regions of interest. Accordingly, image 518 may have an advertisement—to which it was positively associated by matching user input—located in a region of interest, which is positively confirmed by matching user input.
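One way to picture the progression from image 502 through image 518 is as a single record that accumulates votes and derives its matches from them. The sketch below is hypothetical: the `ImageRecord` class, its field names, and the rule that a match requires at least two votes that are all positive are assumptions for illustration, not structures named in this disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ImageRecord:
    """Hypothetical metadata record for one image in the library."""

    image_id: str
    # Item id -> one boolean vote per user who evaluated the item.
    ad_votes: Dict[str, List[bool]] = field(default_factory=dict)
    roi_votes: Dict[str, List[bool]] = field(default_factory=dict)

    @staticmethod
    def _matches(votes: Dict[str, List[bool]], min_users: int = 2) -> List[str]:
        # Assumption: a match needs at least min_users votes, all positive.
        return sorted(
            item for item, v in votes.items() if len(v) >= min_users and all(v)
        )

    def matched_ads(self) -> List[str]:
        """Advertisements positively associated by matching user input."""
        return self._matches(self.ad_votes)

    def matched_rois(self) -> List[str]:
        """Regions of interest confirmed by matching user input."""
        return self._matches(self.roi_votes)

    def ready_for_placement(self) -> bool:
        """True when both an advertisement and a region have been matched."""
        return bool(self.matched_ads()) and bool(self.matched_rois())
```

A record with no votes corresponds roughly to image 502, and a record for which `ready_for_placement()` is true corresponds roughly to image 518.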
Example Image Relevance in Advertising Context
At block 602, a pair of two or more online users is configured. The pair can be configured from among users concurrently visiting a website. Therefore, the pair of users may be configured from two users located in distinct locations on a network, such as the Internet. Where only a single user is available, and a pair for that user is unavailable, the single user may be paired to a composite of past users. The composite of past users may be an average, typical or most common response of the past users. At block 604, an image is displayed to the pair of users. In the example of
At block 606, a plurality of advertisements is suggested to the pair of users. In the example of
At block 608, the pair of users is asked to “rate” the association and/or correlation between the image and each of the plurality of advertisements. This can be a positive or negative feedback. At block 610, input is received from the pair regarding correlation of each of the plurality of advertisements to the image or video. Positive feedback from both users in the pair can result in a positive association between an image and an advertisement. Accordingly, the advertisement is commonly selected for association with the image. In the example of
At block 612, the pair of users is rewarded for common input, and particularly for common positive input wherein the pair both selected an advertisement or logo as associating and/or correlating with the displayed image. In the example of
At block 614, the process of finding an image with a positive correlation and/or association to plural users (e.g., a pair of users) may be repeated as desired. Repetition may be desired to associate or correlate each of a group of images with one or more advertisements or advertisers, or to associate or correlate one or more advertisements with one or more images. In some instances, there may be an advertisement of interest for which an associated and/or correlated image is desired. In this example, the process 600 may be repeated until an appropriate image is discovered for the advertisement of interest.
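The repetition described for process 600, in which images are shown until the pair agrees on one for an advertisement of interest, might look like the sketch below. The `rate` callback stands in for collecting a user's positive or negative input, and the point-based `scores` dictionary is an assumed form of reward; neither is specified in this form by the disclosure.

```python
from typing import Callable, Dict, Iterable, Optional, Sequence

# Hypothetical callback: rate(user, image, ad) returns True when the user
# indicates a positive association between the image and the advertisement.
Rater = Callable[[str, str, str], bool]


def find_image_for_ad(
    images: Iterable[str],
    ad_of_interest: str,
    pair: Sequence[str],
    rate: Rater,
    scores: Dict[str, int],
    reward: int = 1,
) -> Optional[str]:
    """Show images in turn until every user in the pair rates one positively.

    On agreement, each user is rewarded (assumed here to be points added
    to `scores`) and the agreed image is returned. Returns None when no
    image in `images` draws common positive input.
    """
    for image in images:
        if all(rate(user, image, ad_of_interest) for user in pair):
            for user in pair:
                scores[user] = scores.get(user, 0) + reward
            return image
    return None
```

Passing the raters in as a callback keeps the loop independent of how input is gathered, so the same function could be driven by live users or by stored responses.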
At block 702, a user interface, such as a web page, is obtained. The web page or other user interface may be a source of images. At block 704, the web page (described here as the example user interface) is segmented into several blocks. The blocks are formed so that text in each block includes consistencies. The consistencies may be determined by any appropriate means, such as relevancy matching. For example, the textual relevance procedure 426 of
At block 906, regions of interest within the image or video are suggested to the pair of online users 102, 104. Referring to
At block 908, input from the pair of users regarding the suggested regions of interest is received. The input will be positive or negative, such as the positive input 218, 316 (
At block 912, the image or video is associated with the commonly (both of the pair of users) selected advertisements by locating the advertisements in the commonly selected regions of interest. For example, where both users associated a particular advertisement with a particular image, that advertisement could be associated with the image and/or the image would be associated with the advertisement. Additionally, where both users selected a particular region of interest in the image, the region of interest could be associated with the image and the advertisement could be inserted into that region of interest.
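The association step at block 912 can be read as two set intersections followed by a placement. The sketch below assumes each user's positive selections arrive as sets of identifiers and that each region has a known bounding box. The `Placement` type, the `(left, top, width, height)` box layout, and the rule of assigning advertisements to regions in sorted order are illustrative assumptions, since the text does not say how several common advertisements are distributed over several common regions.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

# Assumed box layout: (left, top, width, height) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Placement:
    ad_id: str
    region_id: str
    box: Box


def plan_placements(
    ads_a: Set[str],
    ads_b: Set[str],
    rois_a: Set[str],
    rois_b: Set[str],
    boxes: Dict[str, Box],
) -> List[Placement]:
    """Place commonly selected advertisements in commonly selected regions."""
    common_ads = sorted(ads_a & ads_b)      # selected by both users
    common_rois = sorted(rois_a & rois_b)   # selected by both users
    # Assumption: advertisements and regions are matched up in sorted
    # order; any surplus on either side is left unplaced.
    return [
        Placement(ad, roi, boxes[roi]) for ad, roi in zip(common_ads, common_rois)
    ]
```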
At block 1004, at least one attended object is extracted from the contrast information, i.e., from the values $C_{i,j}$, which can represent contrast information accumulated from each pixel in the image. In one example, the extraction may be performed by converting gray-levels of the contrast information to define attended areas and unattended areas within the image.
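A small pure-Python sketch of the contrast mapping and the gray-level conversion follows. It uses the per-pixel sum over a neighborhood recited in claim 15, but several details are assumptions made only for illustration: the Gaussian distance is taken to be 1 - exp(-(p - q)^2 / (2 * sigma^2)) on gray levels, the neighborhood is a square of a given `radius`, normalization divides by the peak value, and the conversion to attended and unattended areas is a fixed threshold standing in for whatever conversion an implementation actually uses.

```python
import math
from typing import List

Gray = List[List[float]]  # rows of gray levels in [0, 255]


def contrast_map(image: Gray, radius: int = 1, sigma: float = 50.0) -> Gray:
    """Sum an assumed Gaussian distance over each pixel's neighborhood.

    A larger radius widens the neighborhood, which changes how sensitive
    the map is to local differences. Output is normalized to [0, 255].
    """
    rows, cols = len(image), len(image[0])
    raw = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for y in range(max(0, i - radius), min(rows, i + radius + 1)):
                for x in range(max(0, j - radius), min(cols, j + radius + 1)):
                    diff = image[i][j] - image[y][x]
                    total += 1.0 - math.exp(-(diff * diff) / (2.0 * sigma * sigma))
            raw[i][j] = total
    peak = max(max(row) for row in raw)
    scale = 255.0 / peak if peak > 0 else 0.0  # uniform image maps to all zeros
    return [[value * scale for value in row] for row in raw]


def attended_mask(contrast: Gray, threshold: float = 128.0) -> List[List[bool]]:
    """Mark pixels at or above the threshold as attended (True)."""
    return [[value >= threshold for value in row] for row in contrast]
```

For a dark image with a single bright pixel, the bright pixel receives the highest contrast and is the only one marked as attended at the default threshold.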
At block 1006, at least one of the attended objects is framed as an attended view. The attended view may be at a center of attention of the image, an attended area or an attended point. Examples of a framed view include the suggested regions of interest seen at 208-214 of FIGS. 2 and 310-312 of
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as exemplary forms of implementing the claims.
Claims
1. One or more computer-readable media storing computer-executable instructions that, when executed, cause one or more processors to perform acts comprising:
- configuring a pair of two or more users;
- causing display of an image to at least one user of the pair of users;
- suggesting a plurality of advertisements for possible association with the image;
- receiving positive or negative input from each user of the pair of users indicating whether each of the plurality of advertisements positively associates with the image;
- suggesting a plurality of regions of interest within the image; and
- receiving positive or negative input from each user of the pair of users indicating whether each of the plurality of regions of interest is appropriately suggested.
2. One or more computer-readable media as recited in claim 1, wherein configuring the pair of users comprises pairing a new user to a composite of past users.
3. One or more computer-readable media as recited in claim 1, wherein receiving positive or negative input from each user of the pair of users indicating whether each of the plurality of regions of interest is appropriately suggested comprises receiving input indicating whether a suggested region of interest is appropriate for placement of an advertisement.
4. One or more computer-readable media as recited in claim 1, wherein causing display of the image to the at least one user of the pair of users comprises displaying the image as a moving image.
5. One or more computer-readable media as recited in claim 1, wherein causing display of the image to the at least one user of the pair of users comprises:
- obtaining a web page;
- segmenting the web page into multiple blocks, wherein the multiple blocks are formed in part by grouping text having consistencies; and
- selecting the image from at least one of the multiple blocks.
6. One or more computer-readable media as recited in claim 1, wherein suggesting a plurality of advertisements comprises:
- ranking advertisements according to global and local textual relevance between the advertisements and content of the web page;
- identifying candidate advertisement insertion positions based at least in part on an average of normalized energies of pixels within blocks of a grid superimposed over the image, and based at least in part on a weight applied to each block of the grid to emphasize ad insertion positions on sides and in corners of the grid; and
- re-ranking advertisements based at least in part on visual similarity of each advertisement and each of the candidate advertisement insertion positions.
7. One or more computer-readable media as recited in claim 1, wherein suggesting the plurality of regions of interest within the image comprises:
- generating a saliency map for the image; and
- extracting attended areas or objects from the saliency map.
8. One or more computer-readable media as recited in claim 1, wherein receiving positive or negative input from each user of the pair of users indicating whether each of the plurality of advertisements positively associates with the image comprises a repetitive process wherein a plurality of images are displayed until the pair of users agrees that a displayed image is associated with an advertisement of interest.
9. One or more computer-readable media as recited in claim 1, wherein suggesting the plurality of regions of interest within the image comprises:
- generating a saliency map for the image;
- extracting attended areas or objects from the saliency map; and
- supplementing extracted attended areas with input derived from a facial recognition algorithm applied to the image.
10. One or more computer-readable media as recited in claim 1, additionally comprising annotating the image with a commonly selected advertisement located in a commonly selected region of interest.
11. A method of image selection and region of interest analysis, comprising:
- storing, in a memory communicatively coupled to a processor, computer-executable instructions for performing the method;
- executing the instructions on the processor;
- according to the instructions being executed:
- configuring a pair of two or more users, each of the pair located in a distinct location on a network;
- displaying an image to at least one user of the pair;
- suggesting a plurality of advertisements to accompany the image;
- receiving positive or negative input from each user of the pair indicating whether each of the plurality of advertisements positively associates with the image;
- suggesting a plurality of regions of interest within the image, the suggesting comprising: mapping each pixel in the image to express contrast information, the contrast information based at least in part on a neighborhood about the pixel; extracting at least one attended object from the contrast information, wherein the extraction converts gray-levels of the contrast information to define attended areas and unattended areas; framing the at least one attended object as an attended view, an attended area or an attended point; and
- receiving positive or negative input from each user of the pair indicating whether each of the plurality of regions of interest is appropriately suggested for placement of an advertisement.
12. The method of claim 11, wherein displaying an image to the pair comprises:
- obtaining a web page;
- segmenting the web page into multiple blocks, wherein the multiple blocks are formed in part by grouping text having consistencies; and
- selecting the image from one of the multiple blocks.
13. The method of claim 11, wherein suggesting a plurality of advertisements comprises:
- ranking advertisements according to global and local textual relevance between the advertisements and content of the web page;
- identifying candidate advertisement insertion positions based at least in part on an average of normalized energies of pixels within blocks of a grid superimposed over the image, and based at least in part on a weight applied to each block of the grid to emphasize ad insertion positions on sides and in corners of the grid; and
- re-ranking advertisements based at least in part on visual similarity of each advertisement and each of the candidate advertisement insertion positions.
14. The method of claim 11, wherein receiving positive or negative input from each user of the pair indicating whether each of the plurality of advertisements positively associates with the image comprises a repetitive process wherein a plurality of images are displayed until the pair agrees that a displayed image is associated with an advertisement of interest.
15. The method of claim 11, wherein mapping each pixel in the image to express contrast information comprises:
- mapping each pixel in the image to express contrast of that pixel in part according to $C_{i,j} = \sum_{q \in \theta} d(p_{i,j}, q)$, wherein the $C_{i,j}$ are contrast of pixels $p_{i,j}$ normalized as [0, 255] in an $M \times N$ array of pixels representing the image, points $q$ are in a neighborhood $\theta$ about $p_{i,j}$, wherein size of $\theta$ can be used to regulate sensitivity of the contrast information, and $d$ is a Gaussian distance function.
16. The method of claim 11, additionally comprising annotating the image to indicate suggested regions of interest for which each user of the pair of users gave positive input.
17. One or more computer-readable media storing computer-executable instructions that, when executed, cause one or more processors to perform acts comprising:
- configuring a pair of two or more users to allow comparison of input from the users;
- displaying an image to the pair;
- suggesting a plurality of advertisements to accompany the image, the suggesting comprising: ranking advertisements according to global and local textual relevance between the advertisements and content of the web page; identifying candidate advertisement insertion positions based at least in part on an average of normalized energies of pixels within blocks of a grid superimposed over the image, and based at least in part on a weight applied to each block of the grid to emphasize ad insertion positions on sides and in corners of the grid; and re-ranking advertisements based at least in part on visual similarity of each advertisement and each of the candidate advertisement insertion positions;
- receiving positive or negative input from each user of the pair indicating whether each of the plurality of advertisements positively associates with the image;
- suggesting a plurality of regions of interest within the image; and
- receiving positive or negative input from each user of the pair indicating whether each of the plurality of regions of interest is appropriately suggested for placement of an advertisement.
18. One or more computer-readable media as recited in claim 17, wherein configuring the pair of users comprises pairing a new user for input comparison to a composite of past users.
19. One or more computer-readable media as recited in claim 17, wherein displaying an image to the pair comprises:
- obtaining a web page;
- segmenting the web page into multiple blocks, wherein the multiple blocks are formed in part by grouping text having consistencies; and
- selecting the image from one of the multiple blocks.
20. One or more computer-readable media as recited in claim 17, wherein suggesting the plurality of regions of interest within the image comprises:
- generating a saliency map for the image;
- extracting attended areas or objects from the saliency map; and
- supplementing extracted attended areas with input derived from facial recognition.
Type: Application
Filed: Oct 18, 2010
Publication Date: Apr 19, 2012
Applicant: Microsoft Corporation (Redmond, WA)
Inventors: Tao Mei (Beijing), Xian-Sheng Hua (Beijing), Shipeng Li (Palo Alto, CA)
Application Number: 12/906,899
International Classification: G06Q 30/00 (20060101);