SYSTEM AND METHOD FOR PROVIDING REAL TIME TARGETED RATING TO ENABLE CONTENT PLACEMENT FOR VIDEO AUDIENCES
A system and method is provided for providing real time targeted rating to enable content placement for video audiences. The method includes determining if at least one set top box, located within a network having at least one set top box, is on or off, wherein being on is defined as a set top box having a zapping event occur within a predefined time period; determining what one or more viewer profiles are currently consuming content provided by a set top box within the network, wherein currently consuming refers to consuming within the predefined period; and determining targeted rating per a viewer profile that had been identified as currently consuming content via at least one of the set top boxes within the network.
This application claims priority to copending U.S. Provisional Application entitled, “SYSTEM AND METHOD FOR PROVIDING PERSONAL ADVERTISEMENTS FOR AN ACCESS NETWORK,” having Ser. No. 60/956,728, filed Aug. 20, 2007, which is entirely incorporated herein by reference.
FIELD OF THE INVENTION
The present invention relates to advertising, and more particularly is related to providing personal advertisement to video services.
BACKGROUND OF THE INVENTION
Owners of products and services, also referred to herein as advertisers, spend significant funds advertising on television. In addition, advertisers seek to maximize return from their investment in advertising on television by using different techniques. As an example, owners may pay to have an advertisement run at a specific time on a specific channel. Such an advertisement may not only be for products and services, but for any content, such as, but not limited to, video on demand, gaming, and any other content or service. In addition, owners may pay a premium price to have their advertisement run during the showing of popular television programming.
Unfortunately, advertisers do not have control over who may be watching television at a time that an advertisement is run. As a result, funds associated with television advertising are not maximized. Instead, after receiving ratings associated with an aired television show, advertisers pay based upon a previously desired audience and an agreed upon percentage. Funds would be better allocated if a larger number of a specific desired audience could be selected for viewing of targeted advertisements.
Different techniques have been used in an attempt to maximize television advertising investments. Examples of known techniques include attempting to obtain demographic and psychographic profiles, and using information about rating. Unfortunately, information about rating, demographic and psychographic profiles, and targeted rating is obtained using surveys and/or people meters, which are based on small sample audiences and are inaccurate in the collection process. Advertisers, network management, and cable/satellite decision makers would like to use more accurate information for placement and pricing of television advertisements.
Currently, the process of creating television viewer profiles has not made use of the actual actions of the television viewers while watching television. Utilizing information associated with viewer actions while watching television would be very useful in the creation of television viewer profiles. In addition, it would be beneficial to be able to determine which viewer profiles consumed content.
Thus, a heretofore unaddressed need exists in the industry to address the aforementioned deficiencies and inadequacies.
SUMMARY OF THE INVENTION
Embodiments of the present invention provide a system and method for providing real time targeted rating to enable content placement for video audiences. Briefly described, in architecture, one embodiment of the system, among others, can be implemented as follows. The system contains a head end having a computer and means for communicating therein, wherein the computer has a management application stored therein, and wherein the management application further comprises logic configured to: determine if at least one set top box, located within a network having at least one set top box, is on or off, wherein being on is defined as a set top box having a zapping event occur within a predefined time period; determine what one or more viewer profiles are currently consuming content provided by a set top box within the network, wherein currently consuming refers to consuming within the predefined period; and determine targeted rating per a viewer profile that had been identified as currently consuming content via at least one of the set top boxes within the network.
The present invention can also be viewed as providing methods for providing real time targeted rating to enable content placement for video audiences associating content to at least one viewer profile in video audiences. In this regard, one embodiment of such a method, among others, can be broadly summarized by the following steps: determining if at least one set top box, located within a network having at least one set top box, is on or off, wherein being on is defined as a set top box having a zapping event occur within a predefined time period; determining what one or more viewer profiles are currently consuming content provided by a set top box within the network, wherein currently consuming refers to consuming within the predefined period; and determining targeted rating per a viewer profile that had been identified as currently consuming content via at least one of the set top boxes within the network.
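For illustration only, the three steps summarized above can be sketched in Python. The length of the predefined period, the record layout of each set top box, the profile names, and the use of the total number of viewers per profile as the rating denominator are all assumptions made for this sketch and are not specified by the description.

```python
from datetime import datetime, timedelta

# Hypothetical predefined period; the description does not fix a value.
PREDEFINED_PERIOD = timedelta(minutes=30)

def is_on(last_zap, now, period=PREDEFINED_PERIOD):
    """A set top box counts as on if a zapping event occurred within the period."""
    return last_zap is not None and now - last_zap <= period

def targeted_rating(boxes, profile_sizes, now, channel):
    """Percent of each profile's viewers currently consuming `channel`."""
    watching = {profile: 0 for profile in profile_sizes}
    for box in boxes:
        # Only boxes that are on and tuned to the channel contribute viewers.
        if is_on(box["last_zap"], now) and box["channel"] == channel:
            for profile in box["current_profiles"]:
                watching[profile] += 1
    return {p: 100.0 * watching[p] / profile_sizes[p] for p in profile_sizes}

now = datetime(2008, 1, 6, 12, 30)
boxes = [
    {"last_zap": now - timedelta(minutes=5), "channel": "CNN",
     "current_profiles": ["adult_male"]},
    {"last_zap": now - timedelta(hours=3), "channel": "CNN",
     "current_profiles": ["adult_female"]},  # treated as off: no recent zapping
    {"last_zap": now - timedelta(minutes=10), "channel": "CNN",
     "current_profiles": ["adult_male", "child"]},
]
# Assumed number of viewers of each profile in the platform.
profile_sizes = {"adult_male": 4, "adult_female": 4, "child": 2}
ratings = targeted_rating(boxes, profile_sizes, now, "CNN")
```

In this sketch the second set top box is excluded because its last zapping event falls outside the assumed period, so its viewer does not count toward the targeted rating.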
Other systems, methods, features, and advantages of the present invention will be or become apparent to one with skill in the art upon examination of the following drawings and detailed description. It is intended that all such additional systems, methods, features, and advantages be included within this description, be within the scope of the present invention, and be protected by the accompanying claims.
Many aspects of the invention can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present invention. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.
The present system is capable of learning the viewing habits of video viewers by collecting zapping events and other events performed by the viewer. Such videos may be viewed via a television, hand held device, computer, or any device capable of displaying video. The events may be collected at a set top box, computer, or other device. Alternatively, the events may be collected at a different location, such as, but not limited to, at an access multiplexer located in a head end, or in a device located separate from the head end. The system learns the viewing habits and zapping habits of different population profiles by identifying the viewing profile of a household.
The system uses supervised or unsupervised learning functionality for identifying different population profiles, and provides a representation of the probability (or another form of representation) of each population profile to watch any given program and to present a zapping pattern. The probabilities can be utilized as a tool for advertisers searching for the demographic profile of the audience of a television program, or, using inference functionality described herein, to identify the home audience at each household, and the specific viewers of a television program. Thereafter, the system is capable of supplying personalized content, such as, but not limited to, advertisements, video selections, and other content, to the viewers. It should be noted that the following description provides an example in which the content is an advertisement, however, the invention is not intended to be limited to advertisements, but instead, any content that may be personalized.
The present system collects the operations performed by viewers at service decoders, such as, but not limited to, set top boxes (the term set top box is used hereafter). The system then employs unsupervised or supervised learning functionality, as described herein, to interpret the operations at each set top box as the sum of operations of all viewers associated with this set top box. The system learns to identify different viewer profiles in the population and associates with each set top box and profile a probabilistic model of the viewing and zapping habits of viewers.
It should be noted that the present system and method may be provided within different infrastructures. As an example, the following description provides examples of using the present system and method in an Internet protocol television (IPTV) infrastructure, in a cable infrastructure, and in a satellite infrastructure. While these infrastructures are described herein, the present system and method is not intended to be limited to these infrastructures.
While the following describes the present system and method in detail it is beneficial to provide certain definitions.
Set top box (STB) or service decoder: A set top box or service decoder is a device responsible for converting digital (or analog) content received into viewable content that may be fed into a television set or other monitor. The set top box or service decoder may be located at a household or another location.
Platform: A network of service decoders (e.g., set top boxes) of a specific television service provider.
Passive audience identification: Identification of the viewer's profiles without any specific actions performed by the viewer.
Zapping event: A zapping event is an event where there is switching from a current service to another service, where the switching is performed by, for example, but not limited to, use of a remote control, pushing buttons on the set top box, or any action that causes switching, including, but not limited to, voice commands, or even consumer motions without pressing buttons. In addition, a zapping event may be other means for communicating with a set top box, such as, but not limited to, pressing an electronic program guide, pressing a volume button, and other actions involving the set top box.
Zapping pattern: A zapping pattern is the behavior of a viewing individual in terms of zapping, such as, but not limited to, programs watched, frequency of zapping events, and variance of zapping frequency.
Set top box (STB) zapping signature: Records of zapping events of a particular set top box.
Set top box (STB) signature: Data model providing characteristics of a set top box including: an association between a set top box and content available to the set top box, where the content is either provided or not provided via the set top box during a time period; and/or, at least one zapping pattern associated with the set top box. It should be noted that herein when referring to set top box signatures, one or more set top box signature is included. In addition, content availability refers to content that the set top box has access to and can provide.
Zapping log: Records of the set top box zapping signatures for an entire set top box network (Platform) or for part of the network.
Channel: A stream of programs broadcast consecutively from a content source.
Program: Content that was broadcast on a specific channel at a specific date and time, whether on demand or generally broadcast.
Program rating: Percent of viewers that watched the program.
Targeted program rating: Percent of viewers of specific profile that watched the program.
Channel rating: Percent of viewers that watched the channel during the specified time period.
Targeted channel rating: Percent of viewers of specific Profile that watched the channel during the specified time period.
Profile: The classification of an individual into one of several population groups that is targeted. Such profiles may be, for example, but not limited to, psychographic (for example, behavioral) or demographic profiles. Examples of such groups include, but are not limited to, gender, age, income, marital status, and possibly also by interests in different fields.
Learning functionality: Functionality used to reduce a large set of observed data, and its classification into groups, to a set of parameters. The parameters allow reconstruction of the classification of the majority of the original data and classification of similar, unlearned data, or production of a new type of classification. Different relevant learning methods may be utilized to provide the learning functionality such as, but not limited to, artificial neural networks, decision trees, k-Nearest Neighbor, Quadratic classifier, support vector machine, direct probability estimate using Bayesian inference, Bayesian networks, Gaussian estimators, least squares optimization methods, and other optimization methods.
Supervised learning: Supervised learning is learning in which the classification of the observed data is inferred from a sample of the data supplied by an outside source. The learning functionality searches for a parameter set allowing reconstruction of the classification from the input that later can be used for classification of new unlearned data.
Unsupervised learning: Unsupervised learning is learning in which no classification of observed data is given (i.e., no sample is provided), and the functionality attempts to classify the data into different classes under some constraints. The functionality may use a method, such as, but not limited to, vector quantization, and various learning methods and various optimization methods, to find a reduction of the data into representative classes.
The head end 20 contains at least a video service splicer 30, an advertisements video server 40, a management application 50, and an access network multiplexer 60. One having ordinary skill in the art would appreciate that the head end 20 may have portions in addition to those mentioned herein. In addition, while the present description refers to a management application, it should be noted that the management application is stored on a computer.
The video service splicer 30 receives video and audio services from a satellite dish 70. It should, however, be noted that video and audio services may be received by devices other than a satellite dish 70, such as, but not limited to, a cable network or any device capable of providing video to the head end 20.
The video service splicer 30 is capable of splicing personal advertisements into a video service stream, as instructed by the management application 50 and as is further described in detail hereinbelow. The video service splicer 30 also receives advertisements from the advertisements video server 40. In addition, actions of the video service splicer 30 are controlled by the management application 50. It should be noted that, for the example of an IPTV network, the video packets received by the video service splicer 30 may carry an Internet protocol (IP) address and a User Datagram Protocol (UDP) port number. It should also be noted that the video service splicer 30 may instead receive video and audio services from a cable fiber.
The access network multiplexer 60 is responsible for routing video services to transmission units 120A-120D that are video services decoders, as explained hereinbelow. The transmission units 120 are each located within a customer premises 100A-100D. The access multiplexer 60 is connected to both the management application 50 and the video service splicer 30. Specifically, the access network multiplexer 60 may perform, for example, IP and UDP port manipulation. It should be noted that the access network multiplexer 60 may be, for example, but not limited to, an optic multiplexer or a digital subscriber line access multiplexer (DSLAM). From a multicast point of view, as described hereinbelow, connection between the access network multiplexer 60 and a set top box 110 may be a shared media connection, or any other type of connection, and there may or may not be a multicast hierarchy between the access network multiplexer 60 and the set top box 110.
The management application 50 communicates with the video service splicer 30, the advertisements video server 40, and the access network multiplexer 60. In addition, the management application 50 provides the functionality required to learn unsupervised profiles in television audiences, as is described in detail hereinbelow. It should be noted that in accordance with an alternative embodiment of the invention, the management application 50 may instead be located within a set top box 110 located within the customer premises 100A-100D.
Each customer premises 100A-100D at least contains a set top box 110A-110D and a transmission unit 120A-120D. While for exemplary purposes four customer premises 100A-100D are illustrated, one having ordinary skill in the art would appreciate that additional or fewer customer premises 100A-100D may be provided. The transmission unit 120 is capable of receiving advertisement streams and video streams and forwarding the streams to an appropriate set top box 110. For exemplary purposes, the customer premises 100A-100D is illustrated as also containing a computer 130A-130D, although a computer 130 is not integral to the invention. It should be noted that while a single set top box is shown as being located within a customer premises 100, more than one set top box 110 may be located within the customer premises 100. In addition, in accordance with an alternative embodiment of the invention, the set top box may be a computer or any device that can decode a service. For the present example of an IPTV network, the set top box 110 receives a video service with certain TCP/IP parameters, such as, but not limited to, IP address and UDP port. It should be noted, however, that in a cable network or a satellite network, the set top box 110 may or may not receive TCP/IP parameters.
The present system enables editing of online personal video so as to provide personalized television advertisements directed toward a viewer presently watching the television. As is described in detail below, the present invention is capable of categorizing a viewer into an advertising profile, an example of which is, but is not limited to, a demographic profile. Within a single customer premises, different television viewers may have different profiles. The different television viewers may view the same television during the day. Each different viewer may be associated with a different advertising profile, such as, but not limited to, a demographic profile, thus preferably receiving different advertising messages. As an example, a family structure may be described as having an adult male of age 45, an adult female of age 42, a male teenager of age 17, a female teenager of age 14, and a male child of age 7. It should be noted that while the present description refers to a demographic profile, other types of profiles may be provided for.
During the time that a television viewer consumes service transmissions the management application 50 identifies the profile of the viewer. After identifying the profile, the application 50 performs personalized advertisements editing for that particular profile. When there is a different viewer with a different advertising profile that is using the same video decoder, the management application 50 identifies the profile that the viewer belongs to and performs online personalization editing for the advertisements, as described below.
In accordance with the present invention, for both supervised and unsupervised learning, the television consumers, also referred to herein as viewers, are not individually identifying themselves to the system. As a result, the system is required to identify consumer profiles and to associate the profiles with a specific set top box. This process is described in detail hereinbelow. Prior to describing this process, a general process of IPTV advertisement insertion in a broadcast environment is described in detail.
A typical advertisement projection works as follows. During content consumption the access network multiplexer 60 receives a video signal and sends the video signal to the customer premises 100A-100D using an IP protocol. During an advertisement break the video transmissions continue to be transmitted in multicast, thus there is no personalization of advertisements. To instead personalize advertisements, the following is performed.
As shown by block 202, content is transmitted from the head end 20, via the access network multiplexer 60, to the set top box 110. An example of a protocol that may be used for the transmission is the Internet group management protocol (IGMP), which is used by IP hosts to manage their dynamic multicast group membership. Of course, other protocols may be used.
In accordance with the present example, a subset, or complete set, of the customers that are connected to the access network multiplexer 60 are viewing the same video and/or audio service (i.e., content). The management application 50 also continuously identifies the consumers (block 204). It should be noted that the management application 50 can utilize either online processing or offline processing to determine a relationship between viewed content (e.g., videos) and viewer profiles. Regarding offline processing to identify consumers, associate the consumers with content, and produce reports, in accordance with a predefined schedule, or when prompted to do so, the management application 50 reviews zapping patterns, processes the patterns, and associates each program viewed from a set top box 110 with a viewer profile. Alternatively, for online processing, during an advertising break, the management application 50 reviews only recent zapping events to determine which viewer is presently viewing content. Further description of consumer identification is provided with regard to
Returning to the flowchart 200 of
As shown by block 208, the video splicer 30 then splices the advertisements according to the decision of block 206. Since one having ordinary skill in the art would know how a video splicer splices advertisements, further description of the splicing process is not provided herein. As shown by block 210, when the advertisement break is over, the access multiplexer 60 continues to transmit the multicast transmission as it did prior to the advertisement break.
It should be noted that if during an advertisement break the consumer changes the consumed video service, the management application 50 supplies the new service in the same manner. Specifically, if the service transmits content, the management application 50 continues to transmit the content with the multicast protocol. In addition, if there is an advertisement break, the management application 50 may splice different advertisements.
As previously mentioned, the present system provides a consumer specific advertising environment. This environment is provided in part by the providing of online multilayer multicast groups between the access network multiplexer 60 and the set top boxes 110A-110D. The access network multiplexer 60 transmits broadcast transmissions with multicast protocol to a subset A of the set that is connected to the access network multiplexer 60. In the subset A there are different subsets B of consumers watching the same channel at a given moment that are connected to the access network multiplexer 60. Within a single subset B, consumers are associated by their profile for advertising. When there is an advertisement break, the access network multiplexer 60 is transmitting an additional layer of multicast, where each different subset Bi is receiving different advertisements according to the advertisement profile associated with subset Bi. Finally, when the advertisement break is over, subset A consumers continue to watch the same service.
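The layering described above can be pictured with a short Python sketch that groups set top boxes first by the channel being watched and then, within each channel, by advertising profile. The tuple layout, the channel names, and the profile labels are hypothetical and serve only to illustrate the grouping; the sketch does not perform any actual multicast signaling.

```python
from collections import defaultdict

def build_multicast_layers(boxes):
    """Group set top boxes by channel, then by advertising profile.

    `boxes` is an iterable of (box_id, channel, ad_profile) tuples.
    The outer grouping corresponds to consumers watching the same channel;
    the inner grouping corresponds to the subsets Bi that receive different
    advertisements during an advertisement break.
    """
    layers = defaultdict(lambda: defaultdict(list))
    for box_id, channel, ad_profile in boxes:
        layers[channel][ad_profile].append(box_id)
    # Convert to plain dictionaries for easier inspection.
    return {channel: dict(groups) for channel, groups in layers.items()}

boxes = [
    ("110A", "news", "adult"),
    ("110B", "news", "teen"),
    ("110C", "news", "adult"),
    ("110D", "sports", "adult"),
]
layers = build_multicast_layers(boxes)
```

During content transmission every box under the same channel key would share one multicast stream; during an advertisement break each inner list would, under these assumptions, join its own additional multicast group.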
While the abovementioned provides an example of an IPTV network 10, a different infrastructure in which the present system and method may be provided includes a cable network 400.
Referring the
Another example of a network in which the present system and method may be provided is a satellite network.
The satellite 550 is capable of reflecting received data to satellite dishes 560A-560N capable of receiving data signals from the satellite 550. Each satellite dish 560A-560N is associated with a customer premises 570A-570N, such as, for example, a home. In addition, each customer premises 570A-570N has at least one set top box 580A-580N located therein.
Still a further example of a network in which the present system and method may be provided is a terrestrial network.
The radio tower 650 is capable of reflecting received data to antennas 660A-660N capable of receiving data signals from the radio tower 650. Each antenna 660A-660N is associated with a customer premises 670A-670N, such as, for example, a home. In addition, each customer premises 670A-670N has at least one set top box 680A-680N located therein.
In accordance with the present invention, the management application 50 identifies the consumer profiles that are using video/audio decoders (i.e., set top boxes) in the network 10. For exemplary purposes the example of a single household having two television sets is provided. Each television is connected to a different set top box. A first television A is located in the living room and a second television B resides in a room for children.
In accordance with the present example, there are three consumer demographic profiles in the household, namely:
1. Profile 1: Male adult of age 37
2. Profile 2: Female adult of age 34
3. Profile 3: Male child of age 8 and male child of age 10
The consumer profiles are associated with the television sets as follows:
Television A—profiles 1, 2, and 3 (all the household residents are consuming content via television A).
Television B—profile 3 (only the children are using television B)
The process of identifying and associating consumer profiles to set top boxes may be separated in accordance with whether a supervised learning process is used or an unsupervised learning process. These two scenarios are described separately hereinbelow, although it will be noted that certain steps in the processes are similar.
In accordance with the present example, for both the supervised and unsupervised scenarios, service providers have no knowledge of the profiles existing in the household, the location of the television sets in the household, and/or associations between the television sets and the profiles. Instead, the management application 50 identifies and associates the consumer profiles with the set top boxes.
Supervised Learning
Reference is now made to the flowchart 300 of
As shown by block 306, set top boxes 110 in the network 10 record all of the zapping events that the consumers are creating. In accordance with the present description, and as is known by those having ordinary skill in the art, zapping refers to the switching from the current service to another service via use of, for example, but not limited to, a remote control or pushing buttons on the video decoder. It should be noted that this use of remote controls is provided for exemplary purposes. Instead, zapping may be associated with switching initiated by voice commands, or even consumer motions without pressing buttons.
As shown by block 308, the set top boxes 110 send the zapping events to the management application 50. The management application 50 then associates behavior of consumers and their zapping pattern with the households that either did not return the questionnaire or that never received a questionnaire (block 310).
The association process is a learning process, also referred to as a business process, which is the process of passive platform audience learning and identification, and targeted platform rating calculation and analysis. The learning process is divided into multiple steps, including data collection, modeling, learning, identification, analysis, and post processing.
Data Collection
Referring to
Modeling
Modeling is the process of converting the zapping log into different data models that could be used by different learning and identification algorithms, thereby providing a set top box signature (block 704). In accordance with the present system and method, at least the following data models are recognized.
A first data model that is recognized is a set top box viewing signature. Regarding the set top box viewing signature, for each set top box, the list of “watched” programs could be created based on the zapping log and reconciled broadcast schedule. For each watched program, an aggregated watching percentage is given. As an example, STB1 watched program number 56, 30%, means that STB1 watched 30% of the program overall (including leaving the program and returning to it) during the whole time of broadcast of program number 56.
A second data model that is recognized is a set top box time signature. The set top box time signature is, for each set top box, the list of percentages of viewing every channel during a specific time slot, aggregated by day of the week. As an example, set top box 1 (STB1) watched CNN on Sundays between 12:00 and 13:00, 25%, means that during the learning period, the average time that this particular set top box watched CNN between 12:00 and 13:00 on Sundays was fifteen minutes.
A third data model that is recognized is a set top box zapping frequency signature. Specifically, every profile does zapping with different frequencies. Calculating zapping frequencies of every set top box during the predefined time periods provides a Zapping Frequency Signature.
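As a minimal sketch of how a viewing signature and a zapping frequency signature might be derived, the following Python example assumes that the zapping log has already been converted into viewing intervals per set top box and that times are expressed in minutes. The tuple layouts and function names are assumptions for illustration and are not taken from the description.

```python
def viewing_signature(intervals, schedule):
    """Return {(stb, program): percent of the program's airtime watched}.

    `intervals` holds (stb, channel, start, end) viewing intervals and
    `schedule` holds (program, channel, start, end) broadcast slots.
    """
    watched = {}
    for stb, channel, start, end in intervals:
        for program, prog_channel, p_start, p_end in schedule:
            if channel != prog_channel:
                continue
            # Minutes during which the box was tuned to the program.
            overlap = min(end, p_end) - max(start, p_start)
            if overlap > 0:
                key = (stb, program)
                watched[key] = watched.get(key, 0) + overlap
    durations = {program: p_end - p_start
                 for program, _, p_start, p_end in schedule}
    return {key: 100.0 * minutes / durations[key[1]]
            for key, minutes in watched.items()}

def zapping_frequency(zap_times, period_start, period_end):
    """Zapping events per hour within [period_start, period_end)."""
    count = sum(1 for t in zap_times if period_start <= t < period_end)
    return 60.0 * count / (period_end - period_start)

schedule = [(56, "CNN", 0, 60)]
intervals = [
    ("STB1", "CNN", 0, 10),   # watches the start of program 56
    ("STB1", "BBC", 10, 40),  # leaves for another channel
    ("STB1", "CNN", 40, 48),  # returns to program 56
]
signature = viewing_signature(intervals, schedule)
frequency = zapping_frequency([10, 40, 48, 70], 0, 60)
```

With these inputs STB1 is credited with 18 of the 60 minutes of program number 56, which reproduces the 30% figure used in the example above, including the time accumulated after leaving the program and returning to it.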
Unfortunately, the zapping log is not noise free. Most of the viewers use the remote control in the same fashion, but there is a small minority of users that would use the remote control differently. This affects the general zapping frequency, surfing periods (when the viewer changes the channels with high frequency in order to find something interesting), etc. In order to handle these irregular behaviors, a set of data filters should be applied to the zapping log prior to modeling.
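One possible data filter, offered only as an illustration, discards zapping events that belong to a surfing period by removing any event whose channel was left again within a short dwell time. The threshold value, the event layout of (timestamp in seconds, channel), and the assumption that events for a single set top box are sorted by time are all hypothetical.

```python
# Hypothetical threshold; the description does not specify filter parameters.
MIN_DWELL_SECONDS = 10

def filter_surfing(events, min_dwell=MIN_DWELL_SECONDS):
    """Drop zapping events whose channel was left before `min_dwell` seconds.

    `events` is a time-sorted list of (timestamp_seconds, channel) tuples
    for one set top box. The last event is always kept because its dwell
    time is not yet known.
    """
    kept = []
    for i, (timestamp, channel) in enumerate(events):
        is_last = i == len(events) - 1
        if is_last or events[i + 1][0] - timestamp >= min_dwell:
            kept.append((timestamp, channel))
    return kept

events = [(0, "CNN"), (600, "BBC"), (603, "MTV"), (605, "ESPN"), (1500, "CNN")]
filtered = filter_surfing(events)
```

Here the brief stops on BBC and MTV are treated as surfing and removed, while the channels on which the viewer settled remain in the log that is passed to modeling.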
Learning
For supervised learning, learning is a process in which the set top box signatures (viewing, time, and/or zapping frequency), created at the data modeling stage, are used with a list of set top boxes and profiles to provide an Association Rule (block 706). The Association Rule provides knowledge of how to associate a list of profiles within a network to a set top box within the network. The Association Rule is needed because filled out questionnaires are not received from all parties, so the relationships between profiles and the remaining set top boxes are otherwise unknown.
It should be noted that during supervised learning, it is not determined which profiles are associated with which set top boxes. Instead, as mentioned above, an Association Rule is determined to provide knowledge of how to associate a list of profiles to each set top box.
As mentioned above, during supervised learning there is an association of set top box signatures (e.g., viewing) for each set top box in the data model to a predefined list of profiles, based on a sample, for further use in the identification functionality. A sample is a partial list of set top boxes for which both the zapping log and the list of profiles associated with each set top box are provided. The sample may be provided by an operator of the set top box collection. Predefined profiles can be, for example, but not limited to, demographic profiles that define gender, age, marital status, income level, or psychographic (behavioral) profiles.
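By way of a hedged illustration, one very simple form that such a rule could take is a nearest neighbor lookup, which is one of the learning methods listed in the definitions. The sketch below assumes that each set top box signature has been reduced to a fixed-length numeric vector and that the sample is a list of (signature, profiles) pairs; neither the vector layout nor the choice of Euclidean distance is prescribed by the description.

```python
import math

def nearest_sample_profiles(signature, sample):
    """Return the profile list of the sample box with the closest signature.

    `sample` is a list of (signature_vector, profile_list) pairs for set top
    boxes whose profiles are known, for example from returned questionnaires.
    """
    closest = min(sample, key=lambda entry: math.dist(signature, entry[0]))
    return closest[1]

# Hypothetical signatures: share of viewing time on (news, cartoons, sports).
sample = [
    ((0.7, 0.1, 0.2), ["adult_male"]),
    ((0.1, 0.8, 0.1), ["child"]),
]
profiles = nearest_sample_profiles((0.2, 0.7, 0.1), sample)
```

In this sketch the stored sample acts as the rule, and applying the function to a set top box outside the sample corresponds to the later identification step.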
The Association Rule can be applied to any set top box in the same network, as is performed during identification. An example of a process that may be used to derive the Association Rule follows. The management application 50 contains knowledge of the current consumed service for a specific decoder, the profiles (demographic, or behavioral) associated with a specific decoder and household, and previously consumed content for a specific decoder. In accordance with the present invention, the management application 50 uses inference functionality to determine the current viewer/listener profile. The inference functionality defines the current profile(s) that is/are consuming the service.
An example of inference functionality follows, where the learning functionality uses Bayes rule. At this point, the management application 50 contains knowledge of the current consumed service for a specific decoder (set top box). In addition, the management application 50 knows the demographic profiles associated with a specific decoder and household. Further, the management application 50 knows previously consumed content for a specific decoder, specifically, the short-term history. The management application 50 may then use the inference functionality to determine the current viewer/listener profile.
An example for the inference functionality using Bayes rule is provided hereinafter. In the learning algorithm, data collection determines the distribution of the consumed content as a function of the classification of the viewers/listeners at the household. In addition, using the data in conjunction with the Bayes rule, the probability that the household contains a viewer/listener belonging to each demographic profile is estimated. Data utilized to perform this process includes probabilities of each consumed service for households containing each of the demographic profiles, as well as probabilities of each consumed service for households not containing each of the demographic profiles.
Bayes rule reads as shown by equation one below.
P(C|F1 . . . Fn)=P(F1 . . . Fn|C)*P(C)/(P(F1 . . . Fn|C)*P(C)+P(F1 . . . Fn|˜C)*P(˜C)) (Eq. 1)
In equation one, P(F1 . . . Fn|C) is the probability that a household containing a certain profile (C) consumes the list of services F1 . . . Fn and does not consume any other service. In addition, P(F1 . . . Fn|˜C) is the probability that a household not containing a certain profile (C) consumes the list of services F1 . . . Fn and does not consume any other service. Further, P(C) is the probability that a household contains profile C, regardless of the services consumed and P(˜C) is the probability that a household does not contain profile C, regardless of the services consumed.
P(F1 . . . Fn|C) and P(F1 . . . Fn|˜C) may be approximated as the products P(F1|C)* . . . *P(Fn|C) and P(F1|˜C)* . . . *P(Fn|˜C) respectively (that is, by treating the services as conditionally independent), which may be calculated directly from the statistics gathered for the sample population. Better approximations may be obtained by considering correlations between services and between profiles in a household. From the above calculation, the result is the probability, P(C|F1 . . . Fn), that a household contains profile C, given the list of services consumed by the household. The collection of all values P(C|F1 . . . Fn), calculated over all of the sample set top boxes, represents the Association Rule used for the identification step, which is applied to each set top box in the network that was not part of the sample. In addition, from this calculation, the result is the probability that a certain individual viewer from a specific profile used the set top box.
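For illustration only, equation one under the product approximation may be sketched as follows. The function name, the service names, and all probability values are hypothetical and are not taken from any sample; the sketch simply evaluates Eq. 1 with P(˜C) taken as 1 − P(C).

```python
from math import prod

def posterior_profile(services, p_given_c, p_given_not_c, prior_c):
    """Evaluate Eq. 1, P(C | F1..Fn), using the product approximation.

    services       -- list of consumed services F1..Fn
    p_given_c      -- dict: service -> P(F | C), from sample statistics
    p_given_not_c  -- dict: service -> P(F | ~C), from sample statistics
    prior_c        -- P(C); P(~C) is taken as 1 - P(C)
    """
    like_c = prod(p_given_c[f] for f in services)
    like_not_c = prod(p_given_not_c[f] for f in services)
    numerator = like_c * prior_c
    denominator = numerator + like_not_c * (1.0 - prior_c)
    # Guard against a zero denominator (no evidence either way).
    return numerator / denominator if denominator > 0 else 0.0

# Hypothetical sample statistics for one profile C.
p_given_c = {"sports": 0.6, "news": 0.5, "cartoons": 0.1}
p_given_not_c = {"sports": 0.2, "news": 0.4, "cartoons": 0.3}
prior_c = 0.4

p = posterior_profile(["sports", "news"], p_given_c, p_given_not_c, prior_c)
```

Repeating this calculation for every predefined profile gives, for one set top box, the list of probabilities that the household contains each profile.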
In accordance with an alternative embodiment of the invention, a sample may be provided, and post processing may be provided to associate content with profiles. Specifically, a sample may include at least one profile, a set top box associated with the profile, and zapping information associated with the set top box. Post processing may then be performed on the sample to determine which content (e.g., advertisement) is most appropriate for providing to the consumer associated with the profile. As a result, in accordance with this alternative embodiment of the invention, the learning process is not required.
Identification
Identification is a process of recognizing a list of profiles as being associated with a certain set top box (STB), based on the learning results. Every set top box in the network should be assigned at least one profile (demographic or behavioral). It is reasonable to assume that, in most cases, more than one active profile is in front of a set top box, and there are cases where the same profile should be associated more than once with the same set top box. Thus, each set top box should be assigned one or more profiles. For example, a young couple (male and female) between the ages of 20 and 30 living together would produce two profiles, specifically, one for the female and the other for the male. As another example, if a specific household has two boys of the ages seven and fourteen, the boys may both be assigned to an appropriate set top box as the same profile, “Male 6-18.”
To determine the list of profiles associated with a set top box, the Association Rule is mathematically applied to the list of set top box signatures (block 708).
Analysis
Analysis is the process of breaking down and studying the results of learning and identification in order to estimate possible identification errors, provide a set of different factors and amendments for post processing, associate the definition of profiles by signatures with a third party definition, and perform any other functionality resulting from studying the learning and identification results.
The identification error analysis may be performed via mathematical modeling means and/or via simulation (empirical) means. For example, estimation of expected identification errors may be achieved via applying the learned results to a part of the sample and simulating the identification results.
Post Processing
Post Processing is the process of calculating the data required for presentation to potential customers, such as, targeted rating. Post processing also includes reporting and analyzing based on results of identification. The aforementioned list of results is obtained via post processing functionality described hereafter. Such functionality may be provided by, for example, algorithms. Post processing may be utilized to calculate the following data, although post processing calculation is not intended to be limited to calculating only this data; rather, by post processing any calculation done with the use of the results obtained from the learner and/or identifier is referred to as a post processed calculation/algorithm.
Targeted Rating
Targeted rating may include a percentage of viewers of a specific profile that consumed content, a percentage of viewers of a specific profile that consumed content from a channel during a specified time period, or a percentage of viewers of a specific profile that consumed content provided within the network during a specified time period. It should be noted that the term “consumed” is used herein instead of the term “watched” since content consumed by a viewer profile not only includes content that is watched by a viewer profile, but also content that is not watched, but that is provided to a set top box associated with a viewer profile, such as, but not limited to, audio content.
Herein, content may be, for example, but not limited to, a program. It should also be noted, that for exemplary purposes, the following provides the example of consuming content comprising watching content, however, one having ordinary skill in the art will appreciate that consuming of content need not be limited to watching content, but instead may include other functions such as, but not limited to, listening to content received from a channel.
More specifically, targeted rating functionality calculates the targeted rating of a content per profile (e.g., using optimization algorithms, see examples herein below) of the learned and identified data, or of any independent data (e.g., obtained from the sample) as long as the data contains information about the set top box signatures (e.g., viewing signatures) and the profile(s) associated to each set top box in the input. As an example, the targeted rating functionality may be used on data resulting from the supervised learning functionality, unsupervised learning functionality, or independent data. It should be noted that herein the term set top box signatures includes one or more set top box signatures.
Targeted rating may include targeted program rating, targeted channel rating, and targeted time interval rating. Targeted program rating is a percentage of viewers of a specific profile that watched a program. In addition, targeted channel rating is a percentage of viewers of a specific profile that watched a channel during a specified time period. Further, targeted time interval rating is a percentage of viewers of a specific profile that watched content broadcasted within the network during a specified time period.
Targeted rating determination may be provided in general or regionally. Specifically, a regional targeted rating is a targeted rating for one region, where a region may be limited to, for example, a specific geographical location. Alternatively, general targeted rating is a targeted rating for an entire network, or a part of a network, which is region independent (for example, it may include one or several combined regions).
As shown by block 954, set top box signatures are also received, or obtained, for use in determining targeted rating. Such set top box signatures may be, for example, but not limited to, viewing signatures, time signatures, high-resolution time signatures, or zapping frequency signatures. It should be noted that other set top box signatures may also be provided for by the present system and method.
The type of set top box signature used in targeted rating determination dictates which kind of targeted rating will result. As an example, when viewing set top box signatures are used, targeted program rating results. In addition, when time set top box signatures are used, targeted time interval rating, or targeted channel per a time interval rating, results.
As shown by block 956, a first input set is derived showing the probability that each profile is associated with each set top box. It should be noted that the first input set is derived by performing the learning and identification processes, or is received from an external source. A second input set is derived containing data of set top box signatures (block 958). It should be noted that the second input set is derived by performing the modeling functionality on the collected/received zapping log. As an example, for a viewing signature, the zapping log may contain information showing whether a certain set top box consumed certain content (for example, a program), or not. For purposes of deriving the desired output set, namely, the set of targeted ratings, it is assumed that the data of the set top box signatures can be approximated by certain operations involving data associating profiles to set top boxes and targeted rating.
As is shown by block 960, certain operations are applied on the set of data associating profiles to set top boxes and the set of data containing set top box signatures (the input sets), resulting in a targeted rating (the output set). Different forms of data sets and different operations may be used to provide the targeted rating. As an example, matrices may be used to derive the targeted rating, where it is assumed that multiplying a matrix A (matrix A shows the probability that each profile is associated with each set top box) by a matrix B (matrix B is the targeted rating) would result in a matrix C (matrix C is the set top box signature data). Of course, other examples of operations may be used. Two examples of operations that may be used to determine targeted rating are provided below.
If the network covers more than one region and information on the regions in which the different set-top boxes in the network reside is available, a regional targeted rating (RTR) may be calculated using similar methods to those described below. In addition, regional targeted rating of high-resolution time steps, where a time step may be for example, but not limited to, per each thirty seconds, may be calculated for each specific channel and profile.
Input to the regional targeted rating functionality includes the region in which each of the set top boxes is stationed, the set top box signatures for set top boxes within that region, such as, but not limited to, viewing signatures, time signatures, zapping frequency signatures, and high-resolution time signatures, and lists of profiles associated with each of the set top boxes within the region, from any source. It should be noted that a region may have one or more set top boxes therein. In addition, a set top box may be located within more than one region.
The output of the regional targeted rating functionality is the percentage of viewers of each predefined profile, within a specific region, that watched each of the contents, for example, programs, in the case of when viewing signatures are the input, or of each channel at a certain time interval, in the case of when time signatures are the input.
Two examples of methods that may be used to calculate targeted rating are provided herein below. It should be noted that the present invention is not intended to be limited to the following examples, but instead that the following examples are merely provided for exemplary purposes and are not intended to limit the present invention.
EXAMPLE 1
An example of a method to calculate targeted rating, given a list of set top boxes with viewing signatures and profile(s) associated to each set top box, can be given via the use of a linear regression optimization algorithm. In calculating the targeted rating, it is assumed that multiplying the set of parameters representing the association of profile(s) to set top boxes (let us call it A) by the aggregation of targeted rating values of each of the profiles per each program watched by at least a portion of the set top boxes of the network for which the zapping log contains records of set top box zapping signatures (the yet unknown and desired output, let us call it B) corresponds to the parameters representing the aggregation of the set top box viewing signatures (part of the input, let us call it C).
For purposes of this example, it is assumed that the sets of parameters A, B, and C are utilized to provide matrices A, B, and C. A minimization algorithm on the squared norm of the matrix (AB−C) may then be performed (a random initial guess is provided to the algorithm for the values of B). In other words, given A and C, the output of applying this algorithm is the set of probabilities, B, representing the probability of each profile to watch each of the programs broadcasted to the collection of set top boxes. An example table for such an output is presented below after example 2 is described.
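As a minimal sketch of the minimization described in this example, the following applies plain gradient descent to the squared norm of (AB−C), starting from a random initial guess for B. Gradient descent, the step size, the iteration count, and the small matrices are assumptions made for illustration; the description does not specify a particular minimization algorithm.

```python
import random

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def fit_targeted_rating(A, C, steps=5000, lr=0.05, seed=0):
    """Minimize 0.5 * ||A B - C||^2 over B by gradient descent.

    A -- set top boxes x profiles (association of profiles to set top boxes)
    C -- set top boxes x programs (viewing signatures)
    Returns B -- profiles x programs (targeted ratings).
    """
    rng = random.Random(seed)
    n_profiles, n_programs = len(A[0]), len(C[0])
    # Random initial guess for B, as mentioned in the description.
    B = [[rng.random() for _ in range(n_programs)] for _ in range(n_profiles)]
    At = transpose(A)
    for _ in range(steps):
        AB = matmul(A, B)
        residual = [[ab - c for ab, c in zip(r1, r2)] for r1, r2 in zip(AB, C)]
        grad = matmul(At, residual)  # gradient of 0.5 * ||AB - C||^2 w.r.t. B
        B = [[b - lr * g for b, g in zip(rb, rg)] for rb, rg in zip(B, grad)]
    return B

# Hypothetical data: 4 set top boxes, 2 profiles, 2 programs.
A = [[1, 0], [0, 1], [1, 1], [1, 0]]
B_true = [[0.8, 0.1], [0.2, 0.6]]
C = matmul(A, B_true)  # signatures consistent with B_true by construction

B_est = fit_targeted_rating(A, C)
```

Because C is constructed from a known B in this toy case, the estimate converges to that B; with real signatures the result is only the least-squares fit.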
EXAMPLE 2
As a second example of a method to calculate targeted rating, the matrices A, B and C are as in example one, where A is a matrix containing list(s) of demographic, or psychographic, profiles that is (are) associated to each set top box (of the whole network, a part of the network, a specific region within the network, or statistically representing any of those), which is obtained from any source, either via local identification, via receiving an external sample, or via another means.
The matrix C is a matrix that contains, per each of the set top boxes, a list of set top box signatures per a channel, or a program. Examples of forms of set top box signatures include, but are not limited to, viewing signatures, time signatures, high-resolution time signatures or any other form of set top box signatures that associates knowledge of some viewing habits in a certain period per each set top box. The unknown set of probabilities per each of the pre-defined profiles, represented by the matrix B, may then be obtained by the use of solving equation two (Eq. 2):
B≈A⁺C (Eq. 2)
In equation two, A⁺ is the pseudo-inverse of the matrix A, which is unique in mathematical terms, thereby ensuring that the targeted rating matrix B computed in equation two is well-defined. An example of a pseudo-inverse is the Moore-Penrose pseudo-inverse. Calculating A⁺ and multiplying it by the matrix C gives a good approximation to the matrix B of the targeted ratings.
The algorithm of equation two is extremely accurate and allows targeted rating calculations to be performed on very large amounts of data (on the order of millions of entries or more) in an extremely short computing time. By comparison, when performing linear regression, for example, in accordance with one exemplary embodiment of the invention, a separate optimization process must be performed for each targeted rating element, thereby requiring a long computation period. A targeted rating element may be, for example, but not limited to, a program, a time interval, or a channel.
Alternatively, in accordance with another exemplary embodiment of the invention, if a pseudo-inverse is utilized, a single matrix multiplication replaces the multiple optimization processes; it is very fast and is performed for all the targeted rating elements at once, even if there are tens of thousands of targeted rating elements.
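The pseudo-inverse approach of equation two may be sketched as follows, under the stated assumption that the matrix A has full column rank, in which case the Moore-Penrose pseudo-inverse equals (AᵀA)⁻¹Aᵀ. The helper functions and the small matrices are hypothetical; a general pseudo-inverse (for example, one computed via singular value decomposition) would be needed when A is rank deficient.

```python
def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: A does not have full column rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def pseudo_inverse(A):
    """Moore-Penrose pseudo-inverse, ASSUMING A has full column rank."""
    At = transpose(A)
    return matmul(inverse(matmul(At, A)), At)

# Hypothetical data: 4 set top boxes, 2 profiles, 2 programs.
A = [[1, 0], [0, 1], [1, 1], [1, 0]]
B_true = [[0.8, 0.1], [0.2, 0.6]]
C = matmul(A, B_true)

A_plus = pseudo_inverse(A)
B = matmul(A_plus, C)  # Eq. 2: all targeted rating elements in one product
```

Note that A⁺ is computed once and reused for every column of C, which is why a single multiplication covers all targeted rating elements.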
An Example of Data and Targeted Rating Output Follows
If the pre-defined profiles are:
1. Female of age 30-55 with high income.
2. Male of age 18-40 with average income.
3. Male child of age 6-16 with low income.
4. Female child of age 6-16 with average income.
And the list of programs (as specified in the viewing signatures) is:
1. Saturday Night Live.
2. Lost.
3. 24.
Then the targeted rating (TR) output would be the following table:
In addition to a targeted rating of a content (for example, program) per profile, a content to viewer profile assignment (C2P) may be determined so as to provide an identification of what content is being consumed by what viewer profile. For exemplary purposes, it should be noted that content may be, for example, but not limited to, a program. Specifically, a content to profile assignment is beneficial to calculate for those set top boxes within the network to which more than one viewer profile has been associated so as to enable determination of which viewer profile of the list of viewer profiles associated with the set top box actually consumed a specific content.
The present description provides examples of how to determine content to profile assignment for illustration purposes only and is not intended to limit the invention to these examples. Specifically, as previously shown above, the learning and identification processes result in an association of at least one viewer profile to a set top box for which a set top box signature is provided. In addition, determining a targeted rating results in a percentage of viewer profiles that consumed content, wherein the content may be, for example, a program. Having the learning and identification process result and the targeted rating result, it is beneficial to determine what content is being consumed by what viewer profile. Similarly an assignment of any content in a specific time slot to a specific viewer profile in the household that consumed this content may be made.
As previously mentioned, obtaining a content to profile assignment involves determining for each content that was consumed by a certain set top box, which is the specific viewer profile, or viewer profiles, of the profiles associated to this set top box, that consumed the content. Alternatively, if more than one viewer profile has a probability of consuming the content, a list of viewer profiles associated to this set top box that consumed the content with certain probabilities may be calculated. This calculation can be done, for example, via use of algorithms applying algebraic manipulations to the sets of parameters representing the aggregation of viewing (or other) set top box signatures (denoted by C, as above), the parameters representing the association of viewer profile(s) to set top boxes (denoted by A, as above), and parameters representing targeted rating values (denoted by B, as above).
Once the association of profile(s) lists to set top boxes is obtained (the input set A), either by performing a supervised/unsupervised learning and identification process, or obtained from an external source, it is possible to utilize statistical, algebraic, or other methods on input set A, together with the set top box signatures of the set top boxes (input set C), and the set of targeted ratings B, to infer the specific viewer profile that watched each specific content via any given set top box. The targeted ratings may be obtained either by one of the methods described above, or by other methods, or received from an external source.
For exemplary purposes,
Data representing relationships between viewer profiles and set top boxes is received (block 1004), wherein the data may either be obtained after performing learning and identification processes, as described herein, or received from an external source. Such data includes an association between at least one viewer profile and at least one set top box. Preferably, the data is provided as a list of viewer profiles that are associated with a specific set top box.
Including the functionality of block 1002 and block 1004, the result is an association of content consumed via an associated specific set top box and a list of at least one viewer profile associated with the specific set top box. These results may be obtained for one or more set top boxes within the network, wherein the content to profile assignment may be determined for each such set top box.
As shown by block 1006, operations are performed on the set top box signatures and the association of viewer profiles to set top boxes to obtain content to profile assignment. It should be noted that many different examples of operations may be provided. The following provides two examples of operations that may be used to obtain content to profile assignment.
EXAMPLE 1
Using the targeted rating of viewer profiles, or other data describing the viewing habits of each viewer profile associated with the network of set top boxes, or associated with a part of the network of set top boxes; and further having the association of viewer profile lists to the set top boxes, as obtained from supervised or unsupervised learning and identification methods, or by an algorithm, or obtained from an external source; then the probabilities for any viewer profile to watch given content are deduced using statistical analysis or any algebraic, or other method. Assuming, for illustration purposes, that in example 1 content is a program, let us denote by Pj(f) the probability that a specific program, denoted by j, was consumed via a certain set top box, denoted as STBi, within the network, by a certain viewer profile, f, identified to be in the list of profiles using this specific STBi. Then, for example, Pj(f) may be calculated as:
where TRj(f) denotes the targeted rating of the specific program j (where program j is a program for which we are determining viewer profile(s) that watched program j) for profile f, and f′ range over the profile list, of profile(s) that had been associated with this STBi, via which program j had been consumed.
The association of the list of profiles to this specific STBi may be obtained by learning and identification processes, or any other method, or received from an external source (or, alternatively, assuming all profiles are associated with this STBi with some probability if no other information is given). In the case that association of the profile f to the STBi, via which program j had been consumed, is given with a certain probability, it is possible to get the probability that the profile f watched the program j in the STBi in a more accurate way, for example by multiplying each targeted rating by the appropriate probability.
Let us note that the accuracy of Pj(f) gets higher if the watching correlations between the different profiles, f′, associated with the STBi that watched program j, is as low as possible. It should be noted that zero correlations means that only one profile, f, out of the list of profiles, f′, which are associated with the STBi, would usually watch the program j.
Applying a maximization on all probabilities Pj(f), obtained for each of the viewer profiles, f, that are associated with the STBi, would then result in obtaining the content to viewer profile assignment, where the profile having the highest probability as determined is the viewer profile that watched the program. It should be noted that if more than one profile shares the same highest probability, then each of those viewer profiles is considered to have watched the program.
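A minimal sketch of this example follows. It assumes that Pj(f) is the targeted rating TRj(f) normalized over the profiles f′ associated with STBi, optionally weighted by the probability that each profile is associated with STBi; this normalized form is an assumption consistent with the definitions above, and the function name, profile labels, and rating values are hypothetical.

```python
def content_to_profile(tr_for_program, stb_profiles, assoc_prob=None):
    """Assign a program consumed via one set top box to its likeliest profile(s).

    tr_for_program -- dict: profile -> targeted rating TRj(f) of program j
    stb_profiles   -- profiles associated with this set top box (STBi)
    assoc_prob     -- optional dict: profile -> probability that the profile
                      is associated with STBi (defaults to 1.0 per profile)
    Returns (probabilities, winners); winners holds every profile that
    shares the highest probability.
    """
    assoc_prob = assoc_prob or {}
    weights = {f: tr_for_program[f] * assoc_prob.get(f, 1.0) for f in stb_profiles}
    total = sum(weights.values())
    if total == 0:
        return {}, []
    # ASSUMED form: normalize over the profiles f' associated with STBi.
    probs = {f: w / total for f, w in weights.items()}
    best = max(probs.values())
    winners = [f for f, p in probs.items() if abs(p - best) < 1e-12]
    return probs, winners

# Hypothetical targeted ratings of one program, per profile.
tr = {"Female 30-55": 0.30, "Male 18-40": 0.10, "Male 6-16": 0.05}

probs, winners = content_to_profile(tr, ["Female 30-55", "Male 18-40"])
tied_probs, tied_winners = content_to_profile(
    {"a": 0.2, "b": 0.2}, ["a", "b"])
```

In the first call the profile with the larger targeted rating is selected; the second call shows the tie case, in which both profiles are returned.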
EXAMPLE 2
While example 1 usually provides accurate results, it might take a long computation time, in case it needs to be computed for each content, for example for each program, and each set top box that consumed this content (for example, watched the program), separately. Moreover, example 1 depends upon the input set B of targeted ratings.
An alternative example, as shown by example 2, would just apply algebraic manipulations on the sets A and C, described above, where set A is either obtained from the processes of learning and identification, or is received from an external source. It should be noted that in accordance with the second example, there is no requirement for calculating or receiving the targeted rating (set B above).
Assuming for illustration purposes of this example that the sets A and C are matrices and that a content is a program, the following method of C2P may be considered. For each program j, that had been watched via STBi, Pj(f), the probability that a profile f (of a list of profiles associated with STBi) is the one who watched program j, is obtained via algebraic manipulations on the matrices A and C and statistical inference:
Pj(f) is calculated as the number of set top boxes that were associated with profile f via which program j had been watched, divided by the number of set top boxes via which program j had been watched. Then, the quantity Pij(f) is obtained as the probability that STBi contains profile f, and via it program j had been consumed. Then, as in example 1, a maximization on all probabilities Pij(f) may be applied for each of the viewer profiles, f, that are associated with the STBi, thereby resulting in obtaining the content to viewer profile assignment, where the profile having the highest probability as determined is the viewer profile that watched the program. Again, it should be noted that if more than one profile shares the same highest probability, then each of those viewer profiles is considered to have watched the program.
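The counting step of this example may be sketched as follows. Pj(f) is computed exactly as described, from the matrices A and C only. The description does not give a formula for Pij(f); the sketch assumes, for illustration, that Pij(f) is the product of the association entry A[i][f] and Pj(f). The function names and the small matrices are hypothetical.

```python
def profile_share(A, C, j):
    """Pj(f) for every profile f: among set top boxes that watched program j,
    the fraction that are associated with profile f."""
    watchers = [i for i, row in enumerate(C) if row[j]]
    n_profiles = len(A[0])
    if not watchers:
        return [0.0] * n_profiles
    return [sum(1 for i in watchers if A[i][f]) / len(watchers)
            for f in range(n_profiles)]

def assign_profile(A, C, i, j):
    """Profiles of set top box i most likely to have watched program j.

    ASSUMPTION: Pij(f) is modeled as A[i][f] * Pj(f).
    Returns every profile index sharing the highest nonzero score.
    """
    share = profile_share(A, C, j)
    scores = [A[i][f] * share[f] for f in range(len(A[0]))]
    best = max(scores)
    return [f for f, s in enumerate(scores) if s == best and s > 0]

# Hypothetical data: rows are set top boxes.
A = [[1, 1, 0],   # profiles associated with each set top box
     [1, 0, 0],
     [0, 1, 1],
     [1, 0, 1]]
C = [[1, 0],      # which programs each set top box watched
     [1, 1],
     [0, 1],
     [1, 0]]

first = assign_profile(A, C, 0, 0)   # set top box 0, program 0
tied = assign_profile(A, C, 2, 1)    # set top box 2, program 1
```

No targeted rating matrix B is used anywhere in this sketch, in keeping with the point that example 2 does not require it.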
It should be noted that other methods may be used to associate content to viewer profiles and such methods are intended to be included within the present description.
Total Viewership
Further, a total viewership may be calculated (using, e.g., a program to time slot map and applying to it a calculation algorithm which utilizes data obtained in the previous steps described here), which is the calculation of total aggregated viewing activities for each of the pre-defined profiles (these may be demographic or behavioral), during a twenty-four hour period for each weekday.
For example, having the association of profile(s) with each set top box, represented as a set of probabilities (either obtained as an output from the learning and identification steps or given from an outside source), and given the set top box signatures (e.g., as an output from the data modeling stage), given in addition the broadcasting time table (showing for a pre-defined period of time at which time and date and for which duration each program was broadcasted), the following calculation is performed.
The data is aggregated and arranged such that, for each day of the week (24 hours), it is calculated how many viewers of each of the pre-defined profiles watched any content during each of the pre-defined time intervals. For example, if the period decided upon is three months and there were 12 Sundays during this period, the 24 hour period is divided into intervals of 15 minutes and for each such interval it is calculated (using the set top box signatures and the data mentioned above) how many times each of the pre-defined profiles watched any content during each of the 15 minute intervals, aggregated over all 12 Sundays across the 24 hour span. This information is then presented in a graph showing the viewing peaks during a 24 hour Sunday, divided into 15-minute slots, per each profile. This is done for each day of the week (aggregated over the number of times this weekday occurred during the three month period).
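The aggregation described above may be sketched as follows. The input format is an assumption made for illustration: each event is a (set top box identifier, timestamp) pair marking a 15 minute slot in which that set top box consumed content, and each set top box has a list of associated profiles. All identifiers and dates are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

SLOT_MINUTES = 15  # width of each time interval within the 24 hour day

def total_viewership(events, stb_profiles):
    """Count viewing activity per (weekday, slot, profile).

    events       -- iterable of (stb_id, datetime) viewing observations
    stb_profiles -- dict: stb_id -> list of profiles associated with it
    Weekday follows datetime.weekday(): Monday is 0, Sunday is 6.
    """
    counts = defaultdict(int)
    for stb_id, ts in events:
        slot = (ts.hour * 60 + ts.minute) // SLOT_MINUTES
        for profile in stb_profiles.get(stb_id, []):
            counts[(ts.weekday(), slot, profile)] += 1
    return dict(counts)

# Hypothetical observations on two Sundays (7 and 14 January 2024).
stb_profiles = {"stb1": ["Male 6-18"], "stb2": ["Male 6-18", "Female 30-55"]}
events = [
    ("stb1", datetime(2024, 1, 7, 20, 5)),
    ("stb1", datetime(2024, 1, 14, 20, 10)),
    ("stb2", datetime(2024, 1, 7, 20, 14)),
]

counts = total_viewership(events, stb_profiles)
```

Plotting the counts for one weekday and one profile against the slot index gives the per-profile graph of viewing peaks described in the text.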
In addition to the abovementioned, a targeted rating distribution may be determined, which involves, for every channel, for every profile, calculating the rating of the channel for every brief period of time (e.g., thirty seconds), for every minimally defined region. Further, a viewership flow may be determined, which includes, for every channel, calculating the number (or percentage) of viewers of every profile that join and leave the channel during every short period of time (e.g., thirty seconds), for every minimally defined region. Still further, creative reports may be determined such as, for example, during an advertisement break, for each second, calculating the rating and viewership flow. All the aforementioned are merely examples of the post processing possibilities.
In the supervised case, with the knowledge gained by the functionality of block 310, for any households that did not fill out the questionnaire, the management application 50 uses identification functionality to associate the rest of the set top boxes 110 with the profiles that are using the set top boxes 110 (block 312). An example of the functionality, which is used as a basis for such an identification functionality, is provided herein below. It should be noted that different relevant learning methods may be used to perform the identification functionality. Examples of such learning methods may include the use of any one of the following, or other learning methods: Bayesian learning, various statistical methods, artificial neural networks, decision trees, k-nearest neighbor, quadratic classifier, support vector machine, various optimization methods, and direct calculation of probabilities. Of course, other learning methods may be used and are intended to be included within the present description.
Viewership Flow
Using the identified profiles data and high-resolution time signatures, a viewership flow may be calculated. It should be noted that a high-resolution time signature is a representation of which channel each set top box watched during each time step of a specific time interval, such as, but not limited to, thirty seconds. In addition, a viewership flow is the number of viewers of each profile that left or joined watching a specific channel during each time interval (e.g., 30 seconds), during a day or any pre-defined time interval. Viewership flow may be calculated using, for example, but not limited to, a high-resolution regional targeted rating, in addition to the data of signatures and lists of profiles associated with each set top box.
Calculation of viewership flow is performed in a few steps. It should be noted that the following is an example of steps that may be used to calculate viewership flow, however, the following example is not the only way to calculate viewership flow and this example is not intended to be limiting. As a first step, the high-resolution regional targeted rating is calculated. Calculation of the high-resolution regional targeted rating provides, per each channel and per each viewer profile, the percentage of viewers of this viewer profile that watched this channel per each time interval (for example, 30 seconds) during each day of a specified period. Such targeted rating may be calculated, for example, but not limited to, using a method similar to the method described in the targeted rating section of the present description, where the word program is replaced by channel per time interval.
To calculate viewership flow, the differences between the targeted ratings of same viewer profiles, per different time intervals, may be calculated to record the change in number of viewers of each profile between successive time intervals. Moreover, using for example, but not limited to, the method described above as content to profile assignment, the number of viewers that left or joined the viewers of each channel at each time interval may be calculated. To summarize: the viewership flow application may contain various descriptions of changes in viewers per channel per time interval. Examples of the abovementioned include, but are not limited to, targeted rating and the changes in targeted rating per time interval, and number of viewers of each profile who left or joined the viewers of the channel at each time interval.
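The two quantities described above may be sketched as follows for one channel and one viewer profile. The sketch assumes that targeted ratings are given as fractions per successive time interval, that the number of viewers belonging to the profile is known, and that the viewers assigned to the channel in each interval (for example, by content to profile assignment) are available as sets of identifiers. All names and numbers are hypothetical.

```python
def net_change(ratings, profile_population):
    """Change in viewer count between successive intervals, from the
    differences between targeted ratings of the same profile."""
    return [round((cur - prev) * profile_population)
            for prev, cur in zip(ratings, ratings[1:])]

def joined_and_left(viewers_by_interval):
    """For each pair of successive intervals, return (joined, left) counts.

    viewers_by_interval -- list of sets of viewer identifiers assigned to
                           the channel in each interval
    """
    flows = []
    for prev, cur in zip(viewers_by_interval, viewers_by_interval[1:]):
        flows.append((len(cur - prev), len(prev - cur)))
    return flows

# Hypothetical targeted ratings for three successive 30 second intervals.
ratings = [0.10, 0.12, 0.09]
changes = net_change(ratings, profile_population=1000)

# Hypothetical viewer identifiers assigned to the channel per interval.
viewers = [{"v1", "v2"}, {"v2", "v3", "v4"}, {"v4"}]
flows = joined_and_left(viewers)
```

The net change alone cannot distinguish, for instance, ten viewers joining while ten leave from no movement at all, which is why the per-interval joined and left counts are tracked separately.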
Unsupervised Learning
Reference is now made to the flowchart 800 of
To determine viewer profiles one of many methods may be used, such as, but not limited to, using clustering algorithms to find common denominators within a population in association with viewing habits of the population. An example of a method that may be used for profile learning and determination is provided below.
As shown by block 802, set top boxes 110 in the network 10 record all zapping events created by the consumers. The set top boxes 110 send the zapping events to the management application 50 (block 804). It should be noted that the zapping events include an identification of the set top box from which the zapping events were derived. The management application 50 then associates behavior of consumers and their zapping patterns (block 806).
The set top box signatures are the input used by learning functionality (arrow 3) of the management application 50. The learning functionality clusters profiles into groups of profiles that are yet unresolved. It should be noted that an unresolved profile is a profile for which a type is not yet known. Specifically, the learning functionality, which is further described in detail below under the section entitled “learning”, is capable of using the set top box signatures and determining relationships between profiles to derive clusters of profiles, where a type of a profile is not yet known. As an example, an optimization algorithm may be used to cluster the profiles into groups of unresolved profiles, an example of which is illustrated below. The learning step may be performed a few times, to determine the number of existing profile groups available for identification from viewing signature data. This may be done by, for example, but not limited to, throwing out, after each iteration, the profile groups whose similarity to each other is greater than a pre-defined threshold.
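The pruning of similar profile groups between learning iterations might look like the following minimal Python sketch. It assumes that each learned profile group is summarized by a vector of targeted ratings, that similarity is measured by cosine similarity, and that one representative of each set of similar groups is kept. The similarity measure, the names `cosine` and `drop_similar`, and the threshold value are assumptions made for illustration only.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def drop_similar(groups, threshold=0.95):
    """Keep a profile group only if it is not too similar to a group already kept."""
    kept = []
    for group in groups:
        if all(cosine(group, other) <= threshold for other in kept):
            kept.append(group)
    return kept


# Hypothetical targeted-rating vectors for three learned profile groups.
groups = [
    [0.40, 0.10, 0.00],
    [0.39, 0.11, 0.01],  # nearly identical to the first group
    [0.00, 0.10, 0.30],
]
kept = drop_similar(groups)
print(len(kept))
```

The number of groups remaining after pruning can then serve as the number of profiles requested in the next learning iteration.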
As previously mentioned, the output of the learning functionality of the management application 50 is clusters of yet unresolved profiles (arrow 4). The clusters of the yet unresolved profiles, together with a profile description (arrows 5), are the input to the profiles determination functionality of the management application 50.
The profiles description is a classification, or definition, of profiles of viewers by groups that associates between, for example, viewing habits and purchasing habits of individuals. The profiles description is provided by an external source, such as, but not limited to, a single source researcher. It should be noted that the profile description input is some external definition of profiles that is fed to the system.
The profiles determination functionality performs a match between the profiles found by the learning functionality (unresolved profiles) and the profiles description from the external source, which determines whether to match the profiles to demographic clustering or to a specific psychographic clustering, for example, by consuming habits. The profile determination with respect to a given profile description may be done, for example, by performing a standard best match procedure on each of the profiles in both groups (unresolved and pre-defined) and by finding, for each profile from the unresolved group, the best possible match among the defined profiles. It should be noted that sometimes one unresolved profile might fit two described profiles and vice versa: two or more unresolved profiles can match one profile from the described profiles group.
The output of the profiles determination functionality is the resolved profiles (arrow 6), which are the input, together with the set top box signatures, to an identification functionality (arrows 7).
In accordance with an alternative embodiment of the invention, the learning and the profiles determination functionalities may be performed simultaneously by combining these two functionalities (learning and profile determination) of the management application 50 into one. In accordance with this embodiment, the profiles description and the set top box signatures are both fed as inputs to the learning and profiles determination functionalities (arrows 3 and 5). In this case, the learning and profiles determination functionalities are performed together. The output of the learning and profiles determination functionalities is resolved profiles (arrow 6). In the case of combining these two functionalities, directing the learning process toward the input profiles description may be done by, for example, but not limited to, feeding the described profiles as an initial guess to the optimization process and using the number of the defined profiles as the number of profiles to be found.
The resolved profiles are sometimes used together with the set top box signatures as an input to the identification functionality of the management application 50 (arrows 7), to associate each set top box in the network with at least one profile. During this association, for example, a quantization process may be performed.
A quantization process is a process during which, rather than having a continuous range of probabilities of having each of the profiles associated with some set top box, some profiles would be decided as not associated to that set top box (due to having a too small probability of being associated), while other profiles would be decided as being associated (with some higher probability, or 1). A quantization process may be performed by, for example, calculating a statistical constant related to the association of profiles to set top boxes (see detailed explanation below) and performing rounding steps. A quantization procedure may be performed at various steps of the learning and identification process.
The identification of lists of profiles associated with each set top box in the network may be performed by, for example, but not limited to, combining the association rule between unresolved profiles to set top boxes and the association rule between resolved and unresolved profiles to create an association rule associating lists of resolved profiles to set top boxes. For example, the association rules may be matrices of parameters and the application of the association rules may be performed, by using matrix multiplication.
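The combination of association rules by matrix multiplication can be sketched in Python as follows. The sketch assumes both rules are already quantized to binary matrices; the helper `matmul` and all values are hypothetical and chosen only to show the mechanics.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


# Rows: set top boxes; columns: unresolved profiles (hypothetical values).
stb_to_unresolved = [
    [1, 0],
    [0, 1],
    [1, 1],
]
# Rows: unresolved profiles; columns: resolved profiles (hypothetical values).
unresolved_to_resolved = [
    [1, 0, 0],
    [0, 0, 1],
]

# Rows: set top boxes; columns: resolved profiles.
stb_to_resolved = matmul(stb_to_unresolved, unresolved_to_resolved)
print(stb_to_resolved)
```

A nonzero entry in row r and column k of the product indicates that set top box r is associated with resolved profile k through at least one unresolved profile.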
The output of the identification functionality (arrow 8) is the identification of which profile(s) uses each of the set top boxes in the network. In other words, the output is an identification of at least one profile associated with each set top box in the network.
The profiles description, set top box signatures, and profiles associated with each set top box (arrows 9) are fed to analyzer functionality of the management application 50, the output of which is an estimation of identification quality and error estimation (arrow 11). Specifically, the analyzer is a self-assessment tool of the management application. The analysis in the case of unsupervised learning is performed with respect to the profiles definition input. The output of the analyzer may be, for example, the quality of the ability of the system to classify the profiles into groups according to the given profile definition, ranking the quality of the input data in view of desired output versus the actual output, and error estimation regarding the accuracy of the identification process.
The estimated errors may be, for example, the expected deviation from the actual situation, and false positive and false negative identification rates. Moreover, correlations between the different profile groups may be calculated, thereby providing information regarding identification possibilities of certain profiles with respect to their correlations with other profiles. This may be done, for example, by comparing results with known statistics, or by comparing results obtained for all of the network with results obtained from a representative subgroup of the network.
The identified profiles associated with a set top box are fed as an input, together with the set top box signatures (either the same ones used for the learning and identification functionalities, or others, such as time signatures or high-resolution time signatures) and additional set top box data, if required, to post processor functionality of the management application 50 (arrows 12). The post processing functionality computes various data, such as: regional targeted rating (RTR), content to profile assignment (C2P), total viewership and viewership flow. A description of these functionalities was presented above. Note that the computation of the functionalities of the post processor may remain the same for data (associating lists of profiles to set top boxes) obtained via supervised learning, unsupervised learning, or an external source.
Reporting functionality of the management application 50 uses the computed data to produce business and other reports (arrow 13). As with the supervised scenario, the association process, also referred to as the learning and identification process, is divided into multiple steps. The steps in the association process include data collection, modeling, learning, profiles determination, identification, analysis, and post processing. Of the multiple steps, usually the data collection, modeling, analysis and post processing remain the same for both the supervised and unsupervised processes. The main difference between the supervised and unsupervised processes is in the learning step, which may also include a profile determination step, and which may introduce some differences in the identification steps. Note that the steps of learning, profile determination, and identification are sometimes referred to herein, for short, as “unsupervised learning”. The unsupervised learning process is further defined herein below.
Learning
For unsupervised learning, each set top box signature is learned to be associated with a certain list of unresolved profiles defined solely using the set top box signatures. Examples of such set top box signatures include, but are not limited to, viewing signatures, time signatures, high-resolution time signatures, and zapping frequency signatures. It should be noted that the main difference from the supervised learning process is that no sample is provided in this case. An unsupervised learning algorithm receives the set top box signatures only as an input, resulting in a classification of profiles into, for example, a certain type of psychographic (for example, behavioral) or demographic profile groups. After the first step (unless the steps of learning and profile resolving are combined) the resulting learned profiles are usually yet unresolved, meaning that their nature is yet to be resolved.
Examples of unsupervised learning algorithms include, but are not limited to, least squares algorithms and algorithms that provide minimization via steepest descent. Other outputs from the learning algorithms include an association of profiles to set top boxes and a targeted rating of the defined profiles, obtained at the same time, thereby providing a probability that a profile is associated with a set top box.
The following is provided as an example of an unsupervised learning algorithm. An input to the unsupervised learning process is the collection of set top box signatures, which is the output of the data modeling process. Assume as an example that these are viewing signatures (although these might be time signatures, etc.), whose parametrical representation is denoted by a matrix C. For example, each row of the matrix C may refer to one set top box, and each column of the matrix C may refer to, for example, but not limited to, one program, where the entries of matrix C may be, for example, the portions of the programs that each set top box watched, or, for example, the probabilities with which each of the set top boxes represented in matrix C watched each of the programs represented in matrix C. Let a matrix A denote the collection of probabilities representing the association of viewer profiles to the set top boxes, where the entries of the matrix A are the probabilities of each of the viewer profiles being associated with each of the set top boxes. Note that the viewer profiles might be yet unresolved viewer profiles at this stage. Let the matrix B denote the targeted rating probabilities. Both A and B are unknown in the case of unsupervised learning. To obtain the desired outputs A and B, one may use, for example, but not limited to, the following method. The squared norm of the difference (AB−C) is minimized (see equation three), to obtain the approximation of the matrix C as the product AB. For this, one may use, for example, but not limited to, a convex optimization algorithm (or, for example, some other nonlinear minimization algorithm) under various constraints, such as, but not limited to, that each quantity in A is greater than zero and smaller than one, and each quantity in B is greater than zero and smaller than, for example, 0.5. The following description further describes this process.
Following this example, to determine a possible algorithm for achieving the minimization of the squared norm of the matrix (AB−C) considered above (see equation three), it is assumed that the population consists of viewers that can be divided into several groups of different profiles, where each viewer may belong to one or more groups of viewer profiles. Each such group of profiles is associated, for example, with a behavior pattern in terms of watching habits, where the pattern consists of, for example, but not limited to, the viewing signatures and the targeted rating per content and per each profile, where the targeted rating for the profile is the probability of a viewer of this profile watching each program, or some other definition of content.
Since the number of all possible profile groups is usually low compared to the number of programs and set top boxes in the network, one is actually looking for a low rank approximation of the matrix C. The term low rank (of matrices A and B) refers in this case to the fact that the number of different profile groups is smaller than the dimensions of C, which represent, for example, the number of programs and the number of set top boxes in the network. Due to this low rank, the matrices A and B may be obtained using this approximation. One approach to obtaining a low rank approximation of the matrix C is to search for the matrices A and B that minimize the squared norm of the matrix (AB−C). This can be done using, for example, a convex optimization method on the quantity of equation three, which reads:
$$n = \lVert AB - C \rVert^{2} = \operatorname{trace}\left((AB - C)^{T}(AB - C)\right) \qquad \text{(Eq. 3)}$$
where n denotes the squared norm of (AB−C), and trace is a known operation on a matrix providing the sum of the diagonal. In order to minimize this efficiently, one may use the derivatives of equation three, described in equations four and five, each of which read as follows:
and correspondingly,
The second derivatives may also be calculated in order to perform this minimization and they are given by the combination of equations six, seven, and eight below:
Using any standard convex optimization technique and the derivatives above with the (convex) constraints $0 \le A_{ij} \le 1$ and $0 \le B_{ij} \le 1$, a solution of the optimization problem may be found, where the joint dimension of the matrices A and B is chosen as the desired, or expected, number of profiles.
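As an illustration of such a constrained minimization, the following Python sketch applies projected gradient descent to the squared norm of (AB−C). This is one simple technique chosen for the example and is not necessarily the optimization method contemplated above. It uses the standard first derivatives of the squared norm, namely 2(AB−C)Bᵀ with respect to A and 2Aᵀ(AB−C) with respect to B, and enforces the box constraints by clipping. A step is accepted only if it does not increase the squared norm; otherwise the step size is halved. All names (`factorize`, `clip_step`, `b_max`) and the sample matrix are hypothetical.

```python
import random


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(x):
    return [list(col) for col in zip(*x)]


def sq_norm_diff(a, b, c):
    """Squared norm of (AB - C): the sum of squared entries of the difference."""
    ab = matmul(a, b)
    return sum((ab[i][j] - c[i][j]) ** 2 for i in range(len(c)) for j in range(len(c[0])))


def clip_step(m, grad, step, lo, hi):
    """Take one gradient step, then project every entry back into [lo, hi]."""
    return [[min(hi, max(lo, v - step * g)) for v, g in zip(mr, gr)] for mr, gr in zip(m, grad)]


def factorize(c, n_profiles, iters=500, step=0.1, b_max=1.0, seed=0):
    """Approximate C by AB with 0 <= A <= 1 and 0 <= B <= b_max (illustrative sketch)."""
    rng = random.Random(seed)
    n_stb, n_prog = len(c), len(c[0])
    # Random initial guess for both unknown matrices.
    a = [[rng.random() for _ in range(n_profiles)] for _ in range(n_stb)]
    b = [[rng.random() * b_max for _ in range(n_prog)] for _ in range(n_profiles)]
    loss = sq_norm_diff(a, b, c)
    history = [loss]
    for _ in range(iters):
        ab = matmul(a, b)
        resid = [[ab[i][j] - c[i][j] for j in range(n_prog)] for i in range(n_stb)]
        grad_a = [[2 * v for v in row] for row in matmul(resid, transpose(b))]
        grad_b = [[2 * v for v in row] for row in matmul(transpose(a), resid)]
        new_a = clip_step(a, grad_a, step, 0.0, 1.0)
        new_b = clip_step(b, grad_b, step, 0.0, b_max)
        new_loss = sq_norm_diff(new_a, new_b, c)
        if new_loss <= loss:
            a, b, loss = new_a, new_b, new_loss  # accept the step
        else:
            step *= 0.5  # reject the step and try a smaller one next time
        history.append(loss)
    return a, b, history


# Hypothetical viewing signatures: 4 set top boxes (rows) by 3 programs (columns).
c_matrix = [
    [0.4, 0.1, 0.0],
    [0.0, 0.1, 0.3],
    [0.4, 0.2, 0.3],
    [0.4, 0.1, 0.0],
]
a_est, b_est, history = factorize(c_matrix, n_profiles=2)
print(history[0], history[-1])
```

Because the problem is not jointly convex in A and B, a sketch of this kind may stop at a local minimum, and different initial guesses may give different factorizations.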
The matrix A is to be understood as the set of probabilities of association of each of the profiles per each of the set top boxes and the matrix B is the targeted rating matrix. Since the matrix A is expected to contain binary quantities (either a profile exists in a household or not), and since the optimal solution is defined up to a multiplicative constant for each profile, it is desirable to find a good quantization criterion for A.
Instead of the above-described example for the unsupervised learning algorithm, one may consider the slightly more complex examples described below. These alternatives may be used to address specific different cases, and the present invention is not limited to these examples. One alternative is, instead of minimizing the squared norm of the matrix (AB−C), to minimize the squared norm of $(B - A^{+}C)$, denoted herein by m:
$$m = \lVert B - A^{+} C \rVert^{2} \qquad \text{(Eq. 9)}$$
In addition, it is also possible to minimize the squared norm of $(A - C B^{+})$, denoted by v:
$$v = \lVert A - C B^{+} \rVert^{2} \qquad \text{(Eq. 10)}$$
where $A^{+}$ denotes the pseudo-inverse of the matrix A, and $B^{+}$ denotes the pseudo-inverse of the matrix B. For example, the Moore-Penrose pseudo-inverse may be used. This enables a reduction of the dimensionality of the problem, as the dimensions of the latter matrices are usually much smaller than those of the matrix (AB−C). Further, this approach creates a sharper distinction between the probabilities in A (desired to be binary) and those in B (usually small probabilities representing targeted rating) in the minimization process. The pseudo-inverse of a matrix is unique in mathematical terms, hence minimizing equations nine or ten is well defined. In the case of minimizing, for example, the quantity m, one would need to use the derivatives
which involves calculating derivatives of the form
where:
The result of applying the derivative in equation eleven to obtain the derivatives
as well as the second derivatives, of the quantity m, results in slightly longer expressions than the derivatives presented above in equations 4-8, but similar in nature.
Moreover, instead of using convex minimization routines, one may use various nonlinear minimizations with slightly altered constraints to minimize the squared norms of the differences above.
An initial guess, for example, but not limited to, a random guess, is given to the algorithm for any of the probabilistic quantities in A and B. Additional constraints may be given to the algorithm to increase its accuracy. Of course, other optimization (or learning) algorithms may be used. The output is a set of probabilities, A, associating groups of profiles to the set top boxes, which later may be quantized and/or resolved (using, when needed, a profile resolving procedure and quantization), and a set of probabilities, B, providing the targeted rating for each (for example) program and each profile (also to be used in the profile resolving scheme when needed). It should be noted that the targeted rating may be re-calculated during the post-processing to increase the accuracy.
It should be noted that the abovementioned examples, equations, and functionalities are based upon the general premise that matrix C can be approximated by matrix A multiplied by matrix B. Of course, further examples for achieving such approximation may be provided and such examples are intended to be included within the present invention.
Quantization
The quantization step is typically, but not necessarily, to be used after the learning and profile determination stage, in the identification functionality, or a few times during the steps of learning, profile determination, and identification.
One approach to finding the quantizing constants (a set of constants by which each of the probabilities relating each of the found profiles to set top boxes should be divided to determine whether a certain profile should indeed be associated with a certain set top box or not) is to assume that A is approximately a binary matrix with a constant multiplicative factor per column, $s_i$ ($1 \le i \le$ number of profile groups), or in other words, to assume that each of the i profile groups has its own quantization constant. Since the entries are supposed to be binary quantities, one expects the following from calculating the mean and variance using the binomial distribution, as shown by equations 12 and 13.
$$\sum_{a} A_{ai} = s_i N p \qquad \text{(Eq. 12)}$$
$$\frac{1}{N}\sum_{a} A_{ai}^{2} - \frac{1}{N^{2}}\Big(\sum_{a} A_{ai}\Big)^{2} = s_i^{2} p q \qquad \text{(Eq. 13)}$$
where N is the number of set top boxes in the network, p is the probability that a profile is associated to a set top box, and q=1−p. Solving equation twelve and equation thirteen for $s_i$, dividing $A_{ai}$ by $s_i$, and rounding to a pre-defined threshold leads to an association rule, associating each of the profiles (resolved or yet unresolved) to each of the set top boxes.
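Equations twelve and thirteen can be solved in closed form. Substituting q=1−p and using equation twelve to replace the squared mean shows that the mean of the squared entries of column i equals the quantizing constant times the mean of the entries, so the constant is the sum of the squared entries divided by the sum of the entries. The Python sketch below applies this result; the function name `quantize`, the threshold of 0.5, and the sample matrix are assumptions made for illustration.

```python
def quantize(a, threshold=0.5):
    """Quantize association probabilities column by column.

    a: matrix as a list of rows (set top boxes) by columns (profile groups).
    Returns the per-column quantizing constants and the binary association matrix.
    """
    n_profiles = len(a[0])
    constants = []
    for i in range(n_profiles):
        col = [row[i] for row in a]
        total = sum(col)
        # Closed-form solution of the two moment equations: sum(A^2) / sum(A).
        # A column of all zeros gets a constant of 1.0 to avoid division by zero.
        constants.append(sum(v * v for v in col) / total if total > 0 else 1.0)
    binary = [
        [1 if row[i] / constants[i] >= threshold else 0 for i in range(n_profiles)]
        for row in a
    ]
    return constants, binary


# Hypothetical association probabilities: 4 set top boxes by 2 profile groups.
a_matrix = [
    [0.80, 0.05],
    [0.00, 0.50],
    [0.80, 0.50],
    [0.10, 0.00],
]
constants, binary = quantize(a_matrix)
print(binary)
```

The threshold controls how strictly a scaled probability must approach one before the profile is treated as associated with the set top box.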
Profile Determination
Profile determination, or resolving, is a process that defines the nature of identified profiles. During profile resolving, a profiles definition, for example from single source research results covering, but not limited to, viewing habits and behavior, may be used as an input. In addition, the profile list and targeted rating of defined profiles may be used as inputs. The inputs are provided to a resolving algorithm resulting in profile descriptions that describe each profile in the list.
The single source research addresses a focus group that answers a questionnaire. There are two groups of questions in this questionnaire, namely, a first group and a second group. The first group refers to the identity of a person, examples including behavior (e.g., purchasing behavior, rest and relaxation preferences, etc.) and the demographic profile of the answering person. The second group refers to media consumption, for example, the times at which a person would watch television each day of the week and his or her preferred shows.
The single source research associates the media consumption habits with other habits, such as, but not limited to, purchasing habits and preferred vacation habits. The output of the single source research is a set of profiles and their habits, while each profile is associated with its media consumption habits. The resolving algorithm finds the best correlation between two sets of data, namely, for example, the media consumption habits of the focus group; and, for example, the targeted rating of the defined profiles (the output of the unsupervised learning algorithm). Therefore, the resolving algorithm has the capability of defining the traits of the learned profile in the unsupervised algorithm.
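A resolving step of this kind might be sketched in Python as follows. The sketch assumes that both the learned targeted ratings and the focus group media consumption habits are expressed as vectors over the same list of programs, and that the best correlation is measured with the Pearson correlation coefficient. The names `pearson` and `resolve_profiles`, the profile labels, and all values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def resolve_profiles(learned, described):
    """Map each learned (unresolved) profile to the best-correlated described profile."""
    return {
        name: max(described, key=lambda label: pearson(ratings, described[label]))
        for name, ratings in learned.items()
    }


# Targeted rating per program for each learned profile (hypothetical values).
learned = {
    "profile_0": [0.30, 0.05, 0.02, 0.25],
    "profile_1": [0.03, 0.28, 0.22, 0.04],
}
# Media consumption per program from the single source research (hypothetical values).
described = {
    "sports fans": [0.35, 0.02, 0.05, 0.30],
    "drama viewers": [0.05, 0.30, 0.25, 0.02],
}
resolved = resolve_profiles(learned, described)
print(resolved)
```

Because each learned profile is matched independently, two learned profiles may resolve to the same described profile, which is consistent with the many-to-one matches noted earlier in the description.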
In accordance with the present invention, after the learning and identification are performed, the management application 50 knows online, or offline, the current psychographic or demographic profiles that are consuming content for at least a portion of the set top boxes of the network for which the zapping log contains records of set top box zapping signatures. The information regarding the current demographic/psychographic profiles that are consuming content for set top boxes within the network for which sufficient input was received, may be the basis for personalized advertisements deployment in accordance with the present invention.
Real Time Targeted Rating (RTTR)
The present system and method provides the capability of determining whether a set top box within a network is on or off. In addition, if the set top box is on, the present system and method provides the capability of identifying in real time, or near real time, which viewer profile is currently consuming content provided by the set top box, what is the targeted rating of the viewer profile, or profiles, currently consuming content provided by the set top box, and a targeted rating of all viewer profiles that consumed content of the set top boxes within the network, which are part of the real time targeted rating system, for a predefined time interval. This real time process is referred to herein as the real time targeted rating. As previously mentioned, the content may be, for example, but not limited to, video, audio, data, or any combination of these. As will be described in additional detail herein, the real time targeted rating functionality uses the functionalities and methodologies mentioned above with regard to supervised learning, unsupervised learning, identification, content to profile assignment, and targeted rating. The real time targeted rating process is described in detail hereafter.
Functionality performed in real time targeted rating may be performed by a separate or the same management application of the present system and method, located in a head end or in a different location, or a management application located in a different location, as described hereinabove. In addition, the functionality may be performed by a separate computer and/or server (not shown). The embodiments are intended to be covered by the present description. It should further be noted that certain functions of the real time targeted rating process may instead be performed by the set top box itself.
In accordance with the present invention, queries can be made by users of the present system and method for execution of the real time targeted rating functionality for each set top box within the network that is covered by the present system and method. These queries may be made, for example, but not limited to, through a remote web client. For example, multiple web based clients may subscribe to the system described herein for retrieving pre-configured reports, which are created automatically by the system at every pre-defined time interval (for example, 5 minutes), or per query. Such queries may be made, for example, regarding each of the pre-defined viewer profiles to find out in real time, or near real time, whether a set top box in question is on or off, to identify what viewer profile, or viewer profiles, are currently consuming (or consumed for the last predefined time interval) content via the set top box in question, to determine a targeted rating for a specific viewer profile that consumed content from this set top box, and to determine targeted ratings of all viewer profiles that consumed content of the set top box, or set top boxes, within a predefined time interval. An example of determining targeted ratings of all viewer profiles that consumed content of the set top box, or set top boxes, within a predefined time interval includes calculating targeted ratings for viewer profiles that were determined to be consuming content provided by the set top box, or several set top boxes, within the last five minutes. Such a determination uses, per each set top box, at least one set top box signature summarizing activities of the set top box for the last five minutes. Further description is provided herein.
The real time targeted rating system may include a part, or all, of the following capabilities: data collection, modeling, learning, identification, content to profile assignment, targeted rating (and/or regional targeted rating), and a reporting capability, which can be utilized, for example, via a web interface, or other interface, to produce, for example, business reports, system reports, or any other reports involving the produced data. These reports may be generated automatically, periodically (for example, every 5 minutes), or per a query, or both. A query may be initiated, for example, by a user of the system, by a web client, or by any other interface interacting with the system and having the capability of making a query. Such a query may be, for example, automatic, or manual, or provided by another method.
The real time targeted rating system may be beneficial for content placement, for example, at a certain time and a certain channel, or a certain time and a certain set top box; where content, may refer, for example, to an advertisement. Other examples of content may be a program, or an audio content, or any other example of content that may be consumed via a set top box.
Returning to
After determining which viewer profiles are currently consuming content, a targeted rating or targeted ratings may be determined (block 1130). The targeted rating may either be a targeted rating for a viewer profile currently consuming content provided by the set top box or the targeted ratings may be targeted ratings of all viewer profiles that consumed content of the set top box, or of several set top boxes, within a predefined time interval. It should be noted that the real time targeted rating process may be repeated after the passing of a predefined time period, per user query, or both. By repeating this process, after the passing of the predefined time period, the real time, or near real time, determination of which viewer profiles are consuming content from the set top box, may change, and is maintained current (in the sense that after each predefined time period, a new determination of what viewer profile(s) are consuming content of the set top box is achieved, maintaining always the most current identification). The time interval is small enough to be considered ‘now’ and big enough to allow for the accumulation of enough data.
Generating reports (for example, business reports and/or system reports), as shown by block 1140, either automated, periodic, per a user query, or both, may also be provided.
On/Off Set Top Box Determination With Real Time Targeted Rating
In accordance with the present invention, the present system and method provides the capability of determining whether content provided by a set top box within the network is being consumed by a viewer profile. Determining whether content is currently being consumed allows the present system and method to determine if a set top box is currently on or off.
One method that is used by the present system and method to determine if content is currently being consumed is to continuously update a set top box zapping signature with the occurrence of each new zapping event associated with a set top box. By continuously updating the set top box zapping signature of the set top box, the set top box zapping signature remains current and may be considered for determining if a set top box is currently on or off. It should be noted herein that, in accordance with the present invention, a set top box is considered to be off not only if no power is being received by the set top box, but also if content provided by the set top box is not being consumed by a viewer profile within a predefined period (for example, if no zapping event occurred during a predefined time period, with or without association to a schedule).
As shown by block 1220, a determination is made regarding when content provided by the set top box is complete. As an example, a determination may be made regarding when a video program or audio program is complete. After completion of content provided by the set top box a predefined time period is allowed to pass (block 1230). As shown by block 1240 a determination is then made regarding whether a zapping event has occurred prior to the expiration of the predefined time period after the completion of content provided by the set top box. If the predefined time period expired and no new zapping event occurred, the set top box is considered to be off. Alternatively, if a zapping event occurred within the predefined time period, the set top box is considered to be on. Alternatively, if for example a schedule is unavailable, the determination whether a set top box is on or off may be achieved by checking if a zapping event occurred during an elapsed predefined time period.
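The two on/off checks described above can be sketched in Python as follows. The sketch assumes zapping events are available as timestamps in seconds, that the schedule-based check is evaluated once the predefined time period after the completion of content has expired, and that window boundaries are inclusive at the end. The function names and sample values are hypothetical.

```python
def is_on_after_content(zap_times, content_end, period):
    """Schedule-based check, evaluated after the predefined period has expired.

    The set top box is considered on if a zapping event occurred after the
    content completed and no later than `period` seconds afterwards.
    """
    return any(content_end < t <= content_end + period for t in zap_times)


def is_on_without_schedule(zap_times, now, period):
    """Schedule-free check: on if any zapping event occurred in the last `period` seconds."""
    return any(now - period <= t <= now for t in zap_times)


# Hypothetical example: a program ends at t=1800 s and the period is 300 s.
zaps = [100, 1900]
print(is_on_after_content(zaps, content_end=1800, period=300))   # zap at 1900 s falls in the window
print(is_on_without_schedule(zaps, now=3000, period=300))        # no zap between 2700 s and 3000 s
```

The length of the predefined period trades responsiveness against the risk of treating a viewer who simply stays on one channel as absent.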
Determining whether the set top box is on is important for multiple reasons. One such reason is that content provided by or to a set top box when the set top box is considered to be off should not be considered when determining whether a viewer profile, and what viewer profile, is currently consuming content from the set top box. Such determination provides for a more accurate determination of current content consumed by a viewer profile. This determination is important when determining if and what advertisement, or other content, to send to a set top box for consumption by a viewer profile. Specifically, if no one is consuming content provided by a set top box, resulting in the set top box being considered to be off, there is no benefit in forwarding advertisements to the set top box. In fact, determining whether a set top box is on or off is important for other calculations performed by the present system and method. If a set top box is considered to be off, perhaps due to a lack of zapping events occurring, content being provided by the set top box, or that is available to the set top box, should not be considered for calculation purposes, such as in determining which viewer profile is associated with which set top box. Specifically, for example, when a set top box is off, the input set C, which has been used in many calculations described hereinabove, may have a value of zero, in a place representing content transmitted during a time when the set top box was considered by the application to be off. Alternatively, depending on data representation, the set C may contain no entries corresponding to time intervals, or contents available to the set top box, during which the set top box was determined by the system to be off.
It should be noted that the set top box zapping signature may instead be updated in accordance with a predefined schedule so as to alleviate the need for acquiring and processing a schedule, for updating the set top box zapping signature with each new zapping event. As an example, if the real time targeted rating process is being performed every five minutes, it would be beneficial to have the set top box zapping signature updated at least every five minutes. Of course, the interval at which the set top box signature is updated may have many different values.
It should also be noted that, in accordance with alternative embodiments of the invention, as mentioned above, a broadcast schedule may not be necessary for determination of whether a set top box is on or off. Specifically, a time gap between zapping events may be considered to determine if a set top box is on or off. As an example, if a predefined time period passes without any zapping events occurring, a set top box may be considered to be off. Other methods of determining whether a set top box is on or off may also be used, and such methods are intended to be included within the present description.
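As a non-limiting illustration, the schedule-free alternative may be sketched by labeling each set top box according to the time elapsed since its last zapping event. The name `label_boxes`, the use of timestamps expressed in seconds, and the labeling of a set top box with no recorded zapping event as off are assumptions made for this example only.

```python
def label_boxes(last_zap_by_box, now, period_s):
    """Label each set top box "on" or "off" from the gap since its last zap.

    last_zap_by_box: dict mapping box_id -> timestamp (seconds) of the most
                     recent zapping event, or None if none was recorded.
    now:             current timestamp in seconds.
    period_s:        predefined time period in seconds.
    """
    labels = {}
    for box, last_zap in last_zap_by_box.items():
        # Assumption: a gap exactly equal to the period still counts as on.
        recent = last_zap is not None and (now - last_zap) <= period_s
        labels[box] = "on" if recent else "off"
    return labels


labels = label_boxes({"STB1": 100, "STB2": 700, "STB3": None}, now=1000, period_s=300)
```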
As previously mentioned, with determination that a set top box is on, the real time targeted rating functionality determines what viewer profile or profiles are currently consuming content provided by the set top box. The real time targeted rating functionality applied depends upon whether supervised or unsupervised learning was performed by the present system and method for determining what viewer profiles are usually associated with which set top boxes. Herein, the term usually is used to distinguish between currently (i.e. in real time, or nearly real time), and during a ‘relatively long’ period of time during which data was collected. The data regarding the ‘usual’ (rather than current) association of viewer profiles to a certain set top box, may also be periodically updated, for example every three months (or any other time interval, which is longer than the time interval defined as current).
The following describes the real time targeted rating process used for determining what viewer profile or profiles are currently consuming content provided by the set top box, in accordance with the present system and method, when a determination has been made that a set top box is on. As mentioned above, after determining that a set top box is on, the real time targeted rating process then depends upon whether supervised or unsupervised learning was performed by the present system and method for determining what viewer profiles are associated with which set top boxes in real time, or nearly real time.
The following first illustrates steps taken in real time targeted rating when supervised learning was performed to determine what viewer profiles were associated with which set top boxes. Thereafter, illustration is provided of the steps taken in real time targeted rating when unsupervised learning was performed to determine what viewer profiles were associated with which set top boxes. It should be noted that the following provides examples of processes that may be performed during the real time targeted rating process and the invention is not intended to be limited to the same.
Real Time Targeted Rating with Supervised Learning
Referring to the supervised learning scenario, the Association Rule derived after performing supervised learning, as previously described, is gathered. As previously mentioned, the Association Rule provides knowledge of how to associate a list of profiles within a network to a set top box within the network. A list of one or more of the viewer profiles that are determined to be associated with the set top box, as determined after performing the identification process, are gathered. The identification process is not repeated here since it has been described in detail hereinabove.
To determine which of the list of one or more viewer profiles that were determined to be associated with the set top box are currently consuming content provided by the set top box, the previously obtained association rule, together with the list of one or more of the viewer profiles that were determined to be associated with the set top box, through the identification functionality previously described, are applied to a newly obtained set top box signature for the set top box in question. Specifically, the set top box signature used is one that is current, or one that has been updated at least within a predefined period.
Alternatively, instead of using the association rule, but still using the list of one or more viewer profiles determined to be associated with the set top box, the present system and method may provide real time, or near real time, determination of the one or more viewer profiles that are currently associated with a set top box by applying the content to profile procedure, as previously described, to currently consumed content, as identified by a set top box signature, and to the list of the one or more viewer profiles determined to be associated with the set top box.
To apply the content to profile procedure the process used by the content to profile functionality is performed. Specifically, for example, a set A and a set C are provided, where the set A is a list of one or more viewer profiles associated with a set top box within a network, and set C is a summary of which set top boxes within the network consumed content. To determine which profiles consumed content within the last predefined time period, via use of the content to profile functionality, we start with a summary of which set top boxes within the network provided content, which was consumed, within the predefined period. This summary of which set top boxes within the network provided content, which was consumed, within the predefined period can be obtained by reviewing, and/or processing, the set top box signatures of each set top box in the network.
Having the list of set top boxes that provided content within the last predefined time period, a determination is made as to which profiles are associated with the set top boxes that provided content within the predefined period. As an example, profiles f1 and f3 may be associated with set top box 1 (STB1), and profiles f1 and f2 may be associated with the set top box 2 (STB2). This example may be represented as STB1 has (f1, f3), and STB2 has (f1, f2).
A determination is then made as to the probability that, within the predefined time period, a specific profile consumed content provided by a set top box that provided content within the predefined time period. An example of a method that may be used to determine the probability follows. If there are ten set top boxes in a network that provided content within the predefined period, and five of these set top boxes are associated with profile f1, while four of these set top boxes are associated with the profile f2, the probability that a profile f1 consumed content within the predefined period, wherein the content was provided by a set top box that provided content within the predefined period, can be represented as P(f1)= 5/10. In addition, the probability that a profile f2 consumed content within the predefined period, wherein the content was provided by a set top box that provided content within the predefined period, can be represented as P(f2)= 4/10.
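As a non-limiting illustration, the counting example above may be sketched as follows. The name `profile_probabilities` and the data layout are assumptions made for this example only; exact fractions are used so that the values 5/10 and 4/10 are reproduced without rounding.

```python
from collections import Counter
from fractions import Fraction


def profile_probabilities(active_boxes, profiles_by_box):
    """Estimate P(f) as the share of active set top boxes associated with f.

    active_boxes:    list of box ids that provided content in the period.
    profiles_by_box: dict mapping box_id -> iterable of associated profiles.
    """
    total = len(active_boxes)
    if total == 0:
        return {}
    counts = Counter()
    for box in active_boxes:
        # Each profile is counted at most once per set top box.
        for profile in set(profiles_by_box.get(box, ())):
            counts[profile] += 1
    return {profile: Fraction(n, total) for profile, n in counts.items()}


# Ten active boxes: f1 is associated with five of them, f2 with four.
boxes = list(range(10))
assoc = {b: [] for b in boxes}
for b in range(0, 5):
    assoc[b].append("f1")
for b in range(3, 7):
    assoc[b].append("f2")
probs = profile_probabilities(boxes, assoc)  # P(f1) = 5/10, P(f2) = 4/10
```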
The probability that one or more viewer profiles associated with a specific set top box consumed content within the predefined period, from the specific set top box, is then considered by selecting probabilities having values closest to one (for example, P(f3)=0.93), where the probability is for a profile known to be associated with the specific set top box, and the specific set top box provided content, which was consumed, within the predefined time period. Profiles associated with the probability having a value closest to one (those with the maximal probability per the selected set top box) are selected as the profiles that consumed content from the set top box within the predefined period. It should be noted that this example may be made more accurate if, to the calculation, the probabilities of association of each of the profiles to a specific set top box, and/or the probabilities of the presence of each of the viewer profiles within the network, are added.
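As a non-limiting illustration, the selection of the profile or profiles having the maximal probability for a selected set top box may be sketched as follows. The name `current_profiles`, the treatment of an unknown profile as having a probability of zero, and the example probability assigned to f1 are assumptions made for this example only.

```python
def current_profiles(box, profiles_by_box, prob):
    """Pick, among the profiles associated with `box`, those whose
    probability is maximal (closest to one).

    profiles_by_box: dict mapping box_id -> list of associated profiles.
    prob:            dict mapping profile -> probability of consumption.
    """
    candidates = profiles_by_box.get(box, [])
    if not candidates:
        return []
    # Assumption: a profile with no known probability is scored as zero.
    best = max(prob.get(p, 0) for p in candidates)
    return sorted(p for p in candidates if prob.get(p, 0) == best)


profiles_by_box = {"STB1": ["f1", "f3"], "STB2": ["f1", "f2"]}
prob = {"f1": 0.5, "f2": 0.4, "f3": 0.93}
stb1_now = current_profiles("STB1", profiles_by_box, prob)
stb2_now = current_profiles("STB2", profiles_by_box, prob)
```

Ties are returned together in this sketch, so that more than one profile may be reported as currently consuming content from the same set top box.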
It should be noted that the above is merely an example, and any other method of calculating content to profile assignment, as described herein above, or in any other form, may be used.
Real Time Targeted Rating with Unsupervised Learning
For the unsupervised learning scenario, completion of the unsupervised learning process and the identification process results in a list of one or more viewer profiles associated with a set top box in question. The list of one or more viewer profiles that are determined to be associated with the set top box, are gathered.
To determine which of the list of one or more viewer profiles that were determined to be associated with the set top box are currently consuming content provided by the set top box, the list of one or more of the viewer profiles determined to be associated with the set top box is applied, possibly together with the obtained and gathered association rule (as described in the unsupervised learning portion herein above), to a newly obtained set top box signature for the set top box in question. Specifically, the set top box signature used is one that is current, or one that has been updated at least within a predefined period. For example, such application may include performing all steps described in the unsupervised learning process, but with the input limited to the few resolved viewer profiles that were previously determined to be usually associated with the specific set top box.
Alternatively, instead of applying the list of one or more of the viewer profiles determined to be associated with the set top box in question, and possibly the previously obtained association rule, to a newly obtained set top box signature for the set top box in question, the present system and method may provide real time, or near real time, determination of the one or more viewer profiles that are currently associated with a set top box by applying the content to profile procedure, as described above with regard to the supervised process.
Further Illustrations and Examples of RTTR Method
With the supervised and unsupervised scenarios described above, it should be noted that the present system and method is also capable of determining what viewer profiles, possibly from a pre-defined list, are currently consuming content provided by the set top box, even if there is no previous data or knowledge regarding viewer profiles associated with set top boxes, or previous learning or association rule data. In such a situation, the set A is missing, where the set A is a list of one or more viewer profiles associated with a set top box within a network. The set C can then be obtained for a predefined period, such as, but not limited to, the last five minutes, where the set C is a summary of set top box signature(s); meaning a summary of viewing habits, containing for example a summary of which set top box(es) within the network consumed content, per different identifications of contents (for example, which set top box(es) consumed which programs). With there being a sample, the supervised process mentioned above may be performed, resulting in a viewer profile or profiles that are currently consuming content provided by the set top box. Alternatively, if there is no sample, the unsupervised learning process described above may be performed, resulting in a viewer profile or profiles that are currently consuming content provided by the set top box. As has been previously mentioned, herein, the term currently consuming is intended to be the same as consuming within a predefined period.
It should be noted that, with regard to set top box signatures, the method of real time targeted rating includes the steps of data collection (for example, zapping log and schedule), and modeling, periodically, every pre-defined time interval (for example, every five minutes), to obtain the set top box signatures. Alternatively, to obtain a set top box signature, the collection and the modeling may be performed per each event occurring at the set top box, for example, any interaction of a viewer profile with the set top box, such as pressing the info button.
By performing the real time targeted rating functionality, the present system and method holds the following data, or is ready to obtain the same upon request: either an association of at least one viewer profile to at least one set top box within the network, where the viewer profile had been consuming content using this set top box during the last short pre-defined time interval (for example, five minutes), or a report that the set top box in question was shut off during this time interval. The targeted rating of each of the viewer profiles pre-defined in the system may then be provided for the last time interval (for example, five minutes).
It should be noted that while examples of time intervals for updating set top box signatures and other content are exemplified as being five minutes, the time interval is not limited to five minutes, but instead may be any other time interval.
The following provides an example of a way to calculate and operate the real time targeted rating functionality.
As one example, the real time targeted rating application may receive as an input the results of learning (supervised or unsupervised) from the management application, performed for any period of data collection, in the form of ‘learned sets’, which are sets of parameters providing an association rule between set top boxes within the network and at least one viewer profile (for example demographic or psychographic). In addition, the real time targeted rating application may receive as an input any set top box information, such as the identification of viewer profiles that had been associated to this set top box, via learning and identification procedures, performed, for example, by a management application at any location, or at the real time targeted rating application server, or received from another external source. Other set top box information may include, for example, the region in which the set top box is located, and/or other status information regarding the set top box.
Any additional inputs, obtained for the set top boxes within the network, or for viewer profiles, at an earlier time, such as the description of viewing habits of profiles or set top boxes within the network, or any other relevant information may be used as well.
The output of the real time targeted rating functionality is the identity of the viewer profile(s) (out of a pre-defined list, for example, a list containing a few demographic profile types or a list containing a few psychographic types or any mixture of those, that had been associated to a specific set top box) that is currently watching each of the set top boxes within the network, that are part of the real time targeted rating system, that data was received for, and that are known not to be switched off; and, a targeted rating, or a regional targeted rating for each of the identified profiles, per content provided at the pre-defined time interval (for example, 5 minutes), per which the identification of current viewer profiles was performed. The latter identification is referred to as real time, or nearly real time, or online identification and it may take place in real time, or nearly real time, with a pre-defined time delay needed to receive and process the data, or gather a sufficient amount of data. One way to obtain these outputs may be, for example, using the identification functionality (either for supervised, or unsupervised learning), described above. The use of the identification functionality in this example may be performed via applying the currently obtained (for example, for the last 5 minutes interval) set top box signatures (for example, viewing signatures) to the ‘learned sets’, which provide an association rule of list(s) of profile(s) with set top box(es) within the network. This can be done, for example, by using mathematical, or other operations, such as multiplication (for example, multiplication of a vector and a matrix).
The latter may be done either using the whole ‘learned matrix’ obtained for a ‘relatively long’ pre-defined previous time period (for example, a month), or using just the part of the ‘learned matrix’, which is narrowed, for each specific set top box, only to the list containing at least one viewer profile, which is associated to each specific set top box, for which at least one set top box signature was obtained. Due to the fact that the identification is done on the basis of viewing behavior that occurred in a very short time period (for example, 5 minutes), the identified viewer profile, or profiles, would usually be those consuming content at the specific set top box, in real time, or nearly real time.
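As a non-limiting illustration, the multiplication of a current set top box signature (a vector) by a ‘learned matrix’ may be sketched as follows, either over the whole matrix or narrowed to the profiles associated with a specific set top box. The layout of the matrix (rows indexed by content category, columns indexed by viewer profile), the name `score_profiles`, and all numeric values are assumptions made for this example only; the description does not fix a particular layout.

```python
def score_profiles(signature, learned, profiles, allowed=None):
    """Multiply a signature vector by a learned matrix to score profiles.

    signature: list of per-category viewing amounts for one set top box.
    learned:   list of rows, one per category; learned[i][j] is the weight
               linking category i to profiles[j] (assumed layout).
    profiles:  list of profile names labeling the columns of `learned`.
    allowed:   optional collection of profiles associated with this box;
               when given, only those columns are evaluated.
    """
    scores = {}
    for j, profile in enumerate(profiles):
        if allowed is not None and profile not in allowed:
            continue
        # Dot product of the signature with column j of the matrix.
        scores[profile] = sum(signature[i] * learned[i][j]
                              for i in range(len(signature)))
    return scores


signature = [1, 0, 2]                  # viewing in the last interval
learned = [[1, 4], [2, 2], [5, 0]]     # three categories by two profiles
all_scores = score_profiles(signature, learned, ["f1", "f2"])
narrowed = score_profiles(signature, learned, ["f1", "f2"], allowed={"f2"})
best_profile = max(all_scores, key=all_scores.get)
```

Narrowing the columns to the profiles already associated with the set top box reduces the work per identification, while leaving the score of each retained profile unchanged.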
The ‘learned sets’, together with the list(s) of profile(s) associated with the set top boxes within the network, may be stored within the real time targeted rating server, or downloaded to the set top boxes themselves (where each set top box would contain only the part of the learned data associated with it). In addition, the set top box signatures may be inferred at the set top box level, or, if the ‘learned sets’ are stored at the server level, set top box zapping signature(s) may be uploaded (per a time interval, or per a zapping event) to the real time targeted rating server, and the set top box signature(s) may be updated if during the pre-defined time interval (for example, 5 minutes) a new zapping event occurred, which was not yet included in the set top box zapping signature. In such a case, the identification may be applied once, or again and again after each zapping event within the predefined short time period (for example, 5 minutes); where the newly obtained signature (the set top box signature obtained, for example for the last 5 minutes period) will usually contain the information regarding the latest occurring zapping event(s).
In the case that during the short predefined time interval (for example, 5 minutes) the set top box signature is updated with each occurring zapping event and the identification process is applied repeatedly with each such zapping event, the identification of the current viewer profile consuming content via a specific set top box within the network is expected to be of high accuracy, as the identification accuracy would increase with each such iteration.
It should be noted that if no previous data including supervised/unsupervised ‘learned sets’ is available to the real time targeted rating system, the system may perform supervised/unsupervised learning and identification per each obtained set top box signature.
To summarize, the real time targeted rating system is capable of receiving previously processed data (such as previous results of learning and identification); of continuous real time, or nearly real time, data collection (set top box zapping signatures, and possibly a schedule) for any pre-defined time interval prior to the desired identification, for example, 5 minutes, and in some cases per each zapping event (such as turning the set top box on/off); and of processing/modeling the continuously collected data. After each such short predefined time interval (for example, 5 minutes), the real time targeted rating system outputs a snapshot of the set top boxes within the network, where for each such set top box, the one or more viewer profiles currently consuming content via the set top box are identified and the targeted rating of the identified viewer profiles may be calculated. Reports (business and/or system) may be generated automatically and periodically, or may be generated per user query. Queries may be submitted by users of the real time targeted rating system via a Web interface, or any other interface with the real time targeted rating server.
All collected and calculated data may be stored, for example, within the real time targeted rating server, or other location, and may be made available for use for future identification(s)/calculation(s), for any required time period.
The real time targeted rating server operates so that at each given moment a query might be posed to it regarding what are the current viewer profile(s) using set top boxes within the network, which are part of the real time targeted rating system, and which are inferred by the real time targeted rating system to be currently on. In response to such a query, an output report is prepared regarding the identification of a certain viewer profile using these set top boxes by the last identification, or of a few viewer profiles with the probabilities of each of them using these set top boxes, with respect to the last identification performed. In addition, a targeted rating, or a regional targeted rating, per each of these viewer profiles may be calculated. If no queries are made, periodic automated output reports may be generated by the real time targeted rating system.
The online, real time, or nearly real time identification and the targeted rating calculation may be performed, for example, at the real time targeted rating server, located, for example, at the head end, where the real time targeted rating server continuously receives inputs both from the management application and from the set top boxes, for example those connected to the head end, and automatically performs, per each pre-defined time step and/or per each zapping event occurring at any of the set top boxes, the steps of labeling each of the set top boxes within the real time targeted rating system as being switched on or off and, for those that are on, determining which viewer profile(s) are using each in the current time interval, with or without assigned probabilities, the (regional) targeted rating associated with each of the identified profiles, and the last time interval for which identification took place.
Alternatively, for example, the ‘learned sets’ may be sent by the real time targeted rating server to each of the set top boxes and stored there, the collection of the last occurring zapping events may be performed at the set top box level and the identification of each viewer profile using each of the set top boxes may be performed at the level of each set top box, where the result is sent back to the real time targeted rating server and the (possibly regional) targeted rating is calculated at the real time targeted rating server. In this example, in addition, the list of profiles associated with set top boxes within the network may be sent to be stored at the set top boxes themselves. Then, the identification of which viewer profile is currently consuming content via the set top box in question, for set top boxes within the network, may be performed out of the short profile list, that is, only out of those fewer viewer profiles associated to the set top box in question, or from the whole list of viewer profiles, if such a short list is not provided.
The identification of the viewer profiles may be performed in a more accurate way, where the time interval, referring to ‘current identification’, may be narrowed, so that as few profiles as possible are identified as current viewer profile(s) associated with a set top box within the network, that is part of the real time targeted rating system.
Any of the methods described above, or a combination of them, or other methods, may be used to address different specific situations.
It should be emphasized that the above-described embodiments of the present invention are merely possible examples of implementations, merely set forth for a clear understanding of the principles of the invention. Many variations and modifications may be made to the above-described embodiments of the invention without departing substantially from the spirit and principles of the invention. All such modifications and variations are intended to be included herein within the scope of this disclosure and the present invention and protected by the following claims.
Claims
1. A method of providing real time targeted rating to enable content placement for video audiences, comprising the steps of:
- determining if at least one set top box, located within a network having at least one set top box, is on or off, wherein being on is defined as a set top box having a zapping event occur within a predefined time period;
- determining what one or more viewer profiles are currently consuming content provided by a set top box within the network, wherein currently consuming refers to consuming within the predefined period; and
- determining targeted rating per a viewer profile that had been identified as currently consuming content via at least one of the set top boxes within the network.
2. The method of claim 1, wherein the step of determining if at least one set top box is on or off further comprises the steps of:
- receiving a broadcast schedule for a set top box, wherein the broadcast schedule contains a timetable of content to be provided by the set top box;
- determining when content currently being provided by the set top box is complete; and
- determining if a zapping event has occurred between the completion of content currently being provided by the set top box and the ending of a predefined period.
3. The method of claim 1, wherein the step of determining if at least one set top box is on or off further comprises the step of determining if a zapping event has occurred before a predefined time period has elapsed.
4. The method of claim 2, wherein the step of determining when content currently being provided by the set top box is complete is performed by processing the received broadcast schedule.
5. The method of claim 1, wherein the step of determining what one or more viewer profiles are currently consuming content provided by a set top box within the network, further comprises the step of determining if a supervised or an unsupervised learning process was performed to derive a list of one or more viewer profiles that are associated with which set top boxes.
6. The method of claim 5, wherein if a supervised learning process is performed, the following steps are performed to determine what one or more viewer profiles are currently consuming content provided by a set top box within the network, wherein currently consuming refers to consuming within a predefined period:
- receiving data providing an association of consumer profiles and set top boxes to households within a network;
- recording zapping events created by consumers, also referred to as zapping patterns of the consumers;
- associating the zapping patterns of the consumers with households.
7. The method of claim 6, wherein the data is provided by performing the steps of providing questionnaires to the consumers and receiving at least some of the questionnaires filled out by consumers.
8. The method of claim 6, wherein the step of associating the zapping patterns of the consumers with households further comprises the steps of:
- converting zapping logs into different data models that can be used to provide set top box signatures;
- providing the set top box signatures;
- using the set top box signatures with a list of set top boxes and profiles to provide an association rule; and
- applying the association rule to the set top box signatures to determine a list of profiles of the consumer profiles associated with a specific set top box of the set top boxes.
9. The method of claim 5, wherein if an unsupervised learning process is performed, the following steps are performed to determine what one or more viewer profiles are currently consuming content provided by a set top box within the network, wherein currently consuming refers to consuming within a predefined period:
- receiving a zapping log and a broadcast schedule, wherein the zapping log includes records of set top box zapping signatures for at least a portion of the set top boxes of the network;
- deriving set top box signatures from the zapping log and broadcast schedule;
- clustering viewer profiles into groups of viewer profiles using the set top box signatures; and
- associating at least one set top box within the network with at least one viewer profile.
10. The method of claim 1, wherein the step of determining what one or more viewer profiles are currently consuming content provided by a set top box within the network, further comprises the steps of:
- obtaining a first input set containing data of at least one set top box signature, wherein the data of the at least one set top box signature further comprises a processed zapping log containing information summarizing viewing habits of at least one set top box within the network;
- obtaining a second input set, wherein the second input set contains data showing which of one or more viewer profiles are associated with which one or more set top boxes within the network; and
- processing the first input set and the second input set by performing at least one operation on the at least one set top box signature and the association of one or more viewer profile to one or more set top box, wherein the operation performs the steps of: using set top boxes that are consistent to both the first input set and the second input set, wherein these set top boxes are referred to as consistent set top boxes; and performing calculations to identify which of the profiles associated with each of the consistent set top boxes consumed each content that is included in the set top box signatures of the first set that are associated with the consistent set top boxes.
11. The method of claim 5, wherein if a supervised learning process was performed to derive a list of one or more viewer profiles that are associated with which set top boxes, a select set top box of the set top boxes is chosen and one or more viewer profiles of the list of one or more viewer profiles associated with the select set top box is applied to a newly obtained set top box signature for the select set top box.
12. The method of claim 5, wherein if an unsupervised learning process was performed to derive a list of one or more viewer profiles that are associated with which set top boxes a select set top box of the set top boxes is chosen and one or more viewer profiles of the list of one or more viewer profiles associated with the select set top box is applied to a newly obtained set top box signature for the select set top box.
13. A system for providing real time targeted rating to enable content placement for video audiences, wherein the system comprises a head end having a computer and means for communicating therein, wherein the computer has a management application stored therein, and wherein the management application further comprises:
- logic configured to determine if at least one set top box, located within a network having at least one set top box, is on or off, wherein being on is defined as a set top box having a zapping event occur within a predefined time period;
- logic configured to determine what one or more viewer profiles are currently consuming content provided by a set top box within the network, wherein currently consuming refers to consuming within the predefined period; and
- logic configured to determine targeted rating per a viewer profile that had been identified as currently consuming content via at least one of the set top boxes within the network.
14. The system of claim 13, wherein the logic configured to determine if at least one set top box is on or off further comprises:
- logic configured to receive a broadcast schedule for a set top box, wherein the broadcast schedule contains a timetable of content to be provided by the set top box;
- logic configured to determine when content currently being provided by the set top box is complete; and
- logic configured to determine if a zapping event has occurred between the completion of content currently being provided by the set top box and the ending of a predefined period.
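The on/off determination of claim 14 can be sketched as follows. The sketch assumes times are expressed in seconds, that the broadcast schedule is a list of per-channel entries, and that the moment at which the box tuned to the current content is known. The names `ScheduleEntry`, `content_end`, and `box_is_on` are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass(frozen=True)
class ScheduleEntry:
    channel: str
    start: float  # seconds, inclusive
    end: float    # seconds, exclusive

def content_end(schedule: Iterable[ScheduleEntry], channel: str,
                at_time: float) -> Optional[float]:
    """Find when the content airing on `channel` at `at_time` completes."""
    for entry in schedule:
        if entry.channel == channel and entry.start <= at_time < entry.end:
            return entry.end
    return None

def box_is_on(schedule, channel, tuned_at, zap_times, period) -> bool:
    """Treat the box as on if a zapping event falls between the completion
    of the current content and the end of the predefined period."""
    end = content_end(schedule, channel, tuned_at)
    if end is None:
        return False  # assumption: content missing from the schedule counts as off
    return any(end <= t <= end + period for t in zap_times)

schedule = [ScheduleEntry("news", 0, 1800), ScheduleEntry("news", 1800, 3600)]
```

For example, a box tuned to "news" at second 100 with a zapping event at second 1900 is treated as on for a 300 second period, because the content completes at second 1800.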
15. The system of claim 13, wherein determining if at least one set top box is on or off further comprises determining if a zapping event has occurred before a predefined time period has elapsed.
16. The system of claim 14, wherein determining when content currently being provided by the set top box is complete is performed by processing the received broadcast schedule.
17. The system of claim 13, wherein the logic configured to determine what one or more viewer profiles are currently consuming content provided by a set top box within the network further performs the step of determining if a supervised or an unsupervised learning process was performed to derive a list of one or more viewer profiles that are associated with which set top boxes.
18. The system of claim 17, wherein if a supervised learning process is performed, the following steps are performed to determine what one or more viewer profiles are currently consuming content provided by a set top box within the network, wherein currently consuming refers to consuming within a predefined period:
- receiving data providing an association of consumer profiles and set top boxes to households within a network;
- recording zapping events created by consumers, also referred to as zapping patterns of the consumers; and
- associating the zapping patterns of the consumers with households.
19. The system of claim 18, wherein the data is provided by performing the steps of providing questionnaires to the consumers and receiving at least some of the questionnaires filled out by consumers.
20. The system of claim 18, wherein associating the zapping patterns of the consumers with households is performed by:
- logic configured to convert zapping logs into different data models that can be used to provide set top box signatures;
- logic configured to provide the set top box signatures;
- logic configured to use the set top box signatures with a list of set top boxes and profiles to provide an association rule; and
- logic configured to apply the association rule to the set top box signatures to determine a list of profiles of the consumer profiles associated with a specific set top box of the set top boxes.
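One possible reading of the association rule in claim 20 is sketched below. The claim does not fix a learning technique, so the sketch substitutes a simple nearest-centroid rule: each profile's rule parameter is the mean signature of the set top boxes reported to contain that profile, and a profile is associated with a signature when the signature lies within a chosen distance of that mean. The names `learn_rule` and `apply_rule`, the numeric signature vectors, and the `max_distance` threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist

def learn_rule(signatures, box_profiles):
    """Derive one centroid per profile from labelled set top boxes.

    signatures:   {box_id: [float, ...]} set top box signatures
    box_profiles: {box_id: [profile, ...]} e.g. from questionnaires
    """
    grouped = defaultdict(list)
    for box, signature in signatures.items():
        for profile in box_profiles.get(box, ()):
            grouped[profile].append(signature)
    # The "rule" here is just the per-profile mean signature.
    return {
        profile: [sum(col) / len(sigs) for col in zip(*sigs)]
        for profile, sigs in grouped.items()
    }

def apply_rule(rule, signature, max_distance):
    """List the profiles whose centroid lies within max_distance."""
    return sorted(
        profile for profile, centroid in rule.items()
        if dist(signature, centroid) <= max_distance
    )

sigs = {"A": [0.9, 0.1], "B": [0.1, 0.9], "C": [0.8, 0.2]}
profiles = {"A": ["child"], "B": ["adult"], "C": ["child"]}
rule = learn_rule(sigs, profiles)
```

A distance threshold lets a single signature map to several profiles, which matches the claim's reference to a list of profiles per set top box.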
21. The system of claim 17, wherein if an unsupervised learning process is performed, the following steps are performed to determine what one or more viewer profiles are currently consuming content provided by a set top box within the network, wherein currently consuming refers to consuming within a predefined period:
- receiving a zapping log and a broadcast schedule, wherein the zapping log includes records of set top box zapping signatures for at least a portion of the set top boxes of the network;
- deriving set top box signatures from the zapping log and broadcast schedule;
- clustering viewer profiles into groups of viewer profiles using the set top box signatures; and
- associating at least one set top box within the network with at least one viewer profile.
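The unsupervised steps of claim 21 can be sketched under several assumptions: a signature is taken to be the fraction of viewing time a box spends on each genre, the broadcast schedule is reduced to a lookup from (channel, hour) to genre, and the clustering is a basic k-means with deterministic initialization. None of these choices is stated in the claim, and the names `derive_signatures`, `kmeans`, and `cluster_boxes` are hypothetical.

```python
from collections import defaultdict

def derive_signatures(zap_log, schedule_genre):
    """Build per-box genre-share vectors from (box, channel, hour, seconds) rows."""
    totals = defaultdict(lambda: defaultdict(float))
    for box, channel, hour, seconds in zap_log:
        genre = schedule_genre.get((channel, hour))
        if genre is not None:
            totals[box][genre] += seconds
    genres = sorted({g for per_box in totals.values() for g in per_box})
    sigs = {}
    for box, per_box in totals.items():
        total = sum(per_box.values())
        sigs[box] = [per_box.get(g, 0.0) / total if total else 0.0 for g in genres]
    return genres, sigs

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centers (an assumption)."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_boxes(zap_log, schedule_genre, k):
    """Associate each set top box with a cluster standing in for a viewer profile group."""
    _, sigs = derive_signatures(zap_log, schedule_genre)
    boxes = list(sigs)
    return dict(zip(boxes, kmeans([sigs[b] for b in boxes], k)))

schedule_genre = {("ch1", 8): "kids", ("ch2", 20): "sports"}
log = [("A", "ch1", 8, 3000), ("A", "ch2", 20, 600), ("B", "ch2", 20, 3600),
       ("C", "ch1", 8, 3600), ("B", "ch1", 8, 300)]
clusters = cluster_boxes(log, schedule_genre, k=2)
```

Here boxes A and C, which mostly show children's content, fall into one cluster, while box B falls into the other.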
22. The system of claim 17, wherein the logic configured to determine what one or more viewer profiles are currently consuming content provided by a set top box within the network, further comprises:
- logic configured to obtain a first input set containing data of at least one set top box signature, wherein the data of the at least one set top box signature further comprises a processed zapping log containing information summarizing viewing habits of at least one set top box within the network;
- logic configured to obtain a second input set, wherein the second input set contains data showing which of one or more viewer profiles are associated with which one or more set top boxes within the network; and
- logic configured to process the first input set and the second input set by performing at least one operation on the at least one set top box signature and the association of one or more viewer profiles to one or more set top boxes, wherein the operation performs the steps of: using set top boxes that are consistent to both the first input set and the second input set, wherein these set top boxes are referred to as consistent set top boxes; and performing calculations to identify which of the profiles associated with each of the consistent set top boxes consumed each content that is included in the set top box signatures of the first set that are associated with the consistent set top boxes.
23. The system of claim 17, wherein if a supervised learning process was performed to derive a list of one or more viewer profiles that are associated with which set top boxes, a select set top box of the set top boxes is chosen and one or more viewer profiles of the list of one or more viewer profiles associated with the select set top box is determined after an association rule derived from the supervised learning is applied to a newly obtained set top box signature for the select set top box.
24. The system of claim 17, wherein if an unsupervised learning process was performed to derive a list of one or more viewer profiles that are associated with which set top boxes, a select set top box of the set top boxes is chosen and one or more viewer profiles of the list of one or more viewer profiles associated with the select set top box is determined after an association rule derived from the unsupervised learning is applied to a newly obtained set top box signature for the select set top box.
25. A system for providing real time targeted rating to enable content placement for video audiences, wherein the system comprises a head end having a computer and means for communicating therein, wherein the computer has a first management application stored therein, and the system has a second computer having a second management application, wherein the second management application further comprises:
- logic configured to determine if at least one set top box, located within a network having at least one set top box, is on or off, wherein being on is defined as a set top box having a zapping event occur within a predefined time period;
- logic configured to determine what one or more viewer profiles are currently consuming content provided by a set top box within the network, wherein currently consuming refers to consuming within the predefined period; and
- logic configured to determine targeted rating per a viewer profile that had been identified as currently consuming content via at least one of the set top boxes within the network.
26. The system of claim 25, wherein determining what one or more viewer profiles are currently consuming content provided by a set top box within the network, further comprises determining if a supervised or an unsupervised learning process was performed to derive a list of what one or more viewer profiles are associated with which set top boxes.
27. The system of claim 26, further comprising logic located within at least one specific set top box of set top boxes within the network, wherein the logic is configured to apply an association rule and a list of at least one viewer profile associated with the specific set top box to a newly obtained set top box signature of the specific set top box, wherein the association rule is a set of parameters derived from performing a learning process, applied to a combination of set top box signatures of set top boxes within the network, with a list of set top boxes within the network, and a list of predefined viewer profiles.
28. The system of claim 25, wherein the logic configured to determine what one or more viewer profiles are currently consuming content provided by a set top box within the network is located within a set top box, and wherein currently consuming refers to consuming within a predefined period.
Type: Application
Filed: Aug 20, 2008
Publication Date: Feb 26, 2009
Applicant: Ads-Vantage, Ltd. (Shoham)
Inventors: Raviv Knoller (Shoham), Alex Paker (Modiin), Anna Litvak-Hinenzon (Hod-Ha Sharon), Reuven Cohen (Rehovot)
Application Number: 12/195,259
International Classification: H04N 7/10 (20060101);