SYSTEM AND METHOD FOR DETERMINING PREFERENCES FROM INFORMATION MASHUPS
A system and method for determining preferences from information mashups and, in particular, for determining preferences from cross-modality information based on a social welfare function is disclosed. An exemplary embodiment of the invention uses a social welfare function (SWF) to identify a vote computing method from among a group of vote computing methods. The SWF embodies subjective values, e.g. business objectives. The embodiment uses the SWF to identify the vote computing method that combines cross-modality information into a single information mashup in a manner that is most congruent with the subjective values relative to the other vote computing methods. The information mashup may be in the form of a single, merged ranked list.
This application claims the benefit of and incorporates by reference in its entirety U.S. provisional application No. 61/041,128, which was filed on Mar. 31, 2008.
FIELD OF INVENTION
The present invention relates to information mashups, and in particular to a system and method for determining preferences from cross-modality information mashups.
BACKGROUND
Through the advances of technology, today's world has become inundated with information. One continuing technological and societal challenge is finding methods and systems to extract and combine useful data, knowledge, and understanding from a pool of information that is constantly growing in quantity and increasing in granularity.
Even when we narrow our analysis to one domain of interest, e.g., ranking wines, how do we combine all the information indicating preferences within the domain when the information is available from multiple sources and the sources differ in modality? For example, how do we combine multiple lists of preferences, e.g., from different online communities, sales numbers from different stores, etc.? How do we combine the information in a manner that will reveal the aspects of that information that are important, valuable, or significant to an entity (e.g., a machine, business, customer, end-user, etc.) requesting the results? And how do we enable tuning of the outcome, e.g., at the touch of a button, to target certain characteristics and elevate those characteristics to the forefront?
SUMMARY OF THE INVENTION
A computer-implemented method for determining preferences from cross-modality information mashups is provided. The method includes receiving a social welfare function (SWF) and identifying two or more vote computing methods. For each of the two or more vote computing methods, the method uses the vote computing method to combine information on preferences into a combined list ranking the preferences. The information is from a set of two or more sources. The set is heterogeneous in modality. For each combined list, the method inputs the combined list into the SWF to compute a score. The method outputs the combined list of the vote computing method associated with the highest score. The set of two or more sources may include data from websites indicating preferences within a certain domain of interest. The information from the set of two or more sources may include structured data from a first source and unstructured data from a second source. The number of preferences being ranked may be at least an order of magnitude greater than the number of sources.
A computer program product for determining preferences from cross-modality information is also provided. The computer program product includes a computer readable medium and program instructions. The program instructions include first program instructions to identify two or more vote computing methods and second program instructions to, for each of the two or more vote computing methods, use the vote computing method to combine information on preferences into a combined list ranking the preferences. The information is from a set of two or more sources. The set is heterogeneous in modality. The program instructions further include third program instructions to compute a score, and fourth program instructions to output the vote computing method associated with the highest score. The program instructions may also include fifth program instructions to output the combined list of the vote computing method associated with the highest score. The two or more sources may include an online blog, an online forum, and/or an online social networking website. The social welfare function may be selected from the group consisting of: Bergson-Samuelson, Precision Optimal Aggregation, and Spearman Footrule.
A system for determining preferences from cross-modality information is further provided. The system includes a communications interface, memory storing computer usable program code; and a processor coupled to the communications interface to receive information on preferences from an external device and coupled to the memory to execute the computer usable program code stored on the memory. The computer usable program code includes computer usable program code configured to identify two or more vote computing methods; computer usable program code configured to, for each of the two or more vote computing methods, use the vote computing method to combine the information on preferences into a combined list ranking the preferences, wherein the information is from a set of two or more sources, and wherein the set is heterogeneous in modality; computer usable program code configured to, for each combined list, input the combined list into a social welfare function to compute a score; and computer usable program code configured to identify the vote computing method associated with the highest score. The computer usable program code may further include computer usable program code configured to output the combined list of the vote computing method associated with the highest score.
The present invention provides a system and method for determining preferences from information mashups. An information mashup combines or mixes information or data from a multitude of often-conflicting sources into a single representation. For example, for any given domain of interest, opinions can be expressed in many places and collected by many sources. Online sources for people's opinion on a wide range of topics include, for example, blogs, discussion forums and social networking sites. Embodiments of the invention combine information gathered from across different sources, including in one application various online sources, to form a unified, focused view of a community's interests regarding that domain.
An exemplary embodiment of the invention determines preferences from cross-modality information mashups. In more traditional information integration scenarios, systems compare things with identical modalities, such as number of sales from different sources. However, there are many domains of interest (e.g., patient preferences, drugs for certain medical conditions, cars, wine, financial products (stocks, bonds, etc.), consumer goods, cameras, computers, books, etc.) where information is available from many different modalities (e.g., comments, passive listens, sales, hits on a website, creation of new website, views on television, etc.). In the domain of books, the following information may be available for determining book preferences: book sales and returns, lists of books read, library checkouts, comments on books read (e.g., online, in newspapers, in magazines, on television or radio), etc. An exemplary embodiment of the invention determines preferences from information mashups constructed from information and data from a set of sources heterogeneous in modality. For example, say we want to combine different on-line data to generate a list of wines. One source of preferences may be generated from sales numbers of wines. Another source may be a list generated from wine tasters. Yet another may be generated by professionals at a wine magazine. Yet another may be generated from counts of comments users post on a wine aficionado site. There are many more sales of wines than posts on a website. Many people buy wines, whereas composing a review takes more time and may indicate more interest in a particular vintage. Ultimately, a good cross-modality mashup combines these multiple sources, which all indicate interest in the same underlying subject matter, without allowing one source to unduly influence the combined/consensus list.
Yet, how can one combine the data from the various sources when they are heterogeneous in modality? Comparing different modalities is akin to comparing apples and oranges. How does one determine overall rankings for certain wines, for example, based on the combination of data on sales, written reviews, returns, website polls, etc.? How do you combine data indicating that the reviewer loves a certain wine (glowing reviews), but the public hates it (e.g., by ranking it low on wine.com or low sales)? Do we decide that ten times as many posts on a website reflect ten times as much interest in an event or item? There is a fair amount of subjectivity in how these combinations occur and it is not typically clear how to combine all these sources.
In systems that compare things with identical modalities, using a plurality type voting system makes sense. Plurality type voting systems are those that add together the number of votes from each source and simply adjudicate the winner based on whomever or whichever candidate has the most votes. Plurality type voting systems include systems in which votes are weighted. However, plurality type voting systems have deficiencies when combining information gathered from multiple sources with differing modalities. This can occur, for example, when there are large differences in the numbers returned by sources or when the values measured to derive those numbers indicate very different things.
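The deficiency noted above can be illustrated with a minimal sketch. The data, the source names, and the function name plurality are hypothetical and chosen only for illustration; the sketch assumes a plain unweighted sum of raw counts.

```python
from collections import Counter

def plurality(sources):
    """Add the raw counts from every source and rank items by the total."""
    totals = Counter()
    for counts in sources.values():
        totals.update(counts)  # Counter.update adds counts, it does not overwrite
    # Highest total first; ties broken alphabetically for determinism.
    return [item for item, _ in sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))]

# Hypothetical data: sales are counted in thousands, written reviews in tens.
sources = {
    "sales":   {"wine_a": 9000, "wine_b": 8500, "wine_c": 100},
    "reviews": {"wine_a": 2,    "wine_b": 5,    "wine_c": 40},
}

combined = plurality(sources)
sales_only = plurality({"sales": sources["sales"]})
```

In this illustrative data the combined ranking is identical to the ranking from sales alone, because the review counts are too small to change any total. The source with the larger numbers dominates regardless of what the smaller numbers indicate.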
To identify which of a multitude of combination techniques (including plurality type voting techniques) is optimal for combining data from various sources in a certain instance, embodiments of the invention use a construct known as a social welfare function (SWF). A SWF is a mapping from allocations of goods or rights among people to real numbers. The SWF construct was a tool introduced by Abram Bergson in 1938. The SWF construct allows for the determination of a society's taste for different economic states. There are two features to the SWF construct: first, it imposes a structure and second, it devises a single constitutional/voting system that changes the rankings of the individual into a single society ranking. A SWF might describe, for example, the preferences of an individual over social states, or might describe, as another example, outcomes of an allocation process, whether or not individuals had preferences over those outcomes. Examples of SWFs are the Bergson-Samuelson, Precision Optimal Aggregation, and Spearman Footrule. Thus, using an SWF, a method is supplied for embodying subjectiveness, such as that described above, into one function. Using a custom constructed or selected SWF, embodiments of the invention can capture, for example, business goals in a semi-heuristic way, objectively evaluate various preference combination techniques, and identify which of the combination techniques to use in a specific instance.
In one exemplary application, the combination techniques include techniques that originate from vote computing or vote counting systems, such as a Borda count method or the Nauru method. Embodiments of the invention may supplement or modify a vote computing or vote counting technique depending on whether the original information expressing the preferences is, for example, structured or unstructured, numerical or textual, etc.
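As a minimal sketch of two such techniques, the following assumes that each source supplies a ranked list and that the points are summed across sources. The Borda variant shown awards n points for first place down to 1 for last; the second function uses the harmonic weights 1, 1/2, 1/3, and so on, which is the scheme associated with the Nauru method. The function names and the sample lists are illustrative assumptions and are not taken from the disclosure.

```python
def borda(ranked_lists):
    """Borda count: position p (0-based) in a list of n items earns n - p points."""
    scores = {}
    for ranking in ranked_lists:
        n = len(ranking)
        for pos, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (n - pos)
    return sorted(scores, key=lambda item: (-scores[item], item))

def nauru(ranked_lists):
    """Harmonic variant: position p (0-based) earns 1 / (p + 1) points."""
    scores = {}
    for ranking in ranked_lists:
        for pos, item in enumerate(ranking):
            scores[item] = scores.get(item, 0.0) + 1.0 / (pos + 1)
    return sorted(scores, key=lambda item: (-scores[item], item))

# Three hypothetical sources ranking the same three items.
lists = [["a", "b", "c"], ["b", "a", "c"], ["b", "c", "a"]]
borda_result = borda(lists)
nauru_result = nauru(lists)
```

Both functions return a single merged ranked list, which is the form of output that an SWF can then score.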
In one exemplary embodiment, the combination technique used is as described in co-pending U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2), and filed on ______. Accordingly, the system and method for determining preferences from information mashups described in detail herein complements the system and method described in the co-pending U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2). In one use, the system or method described in detail herein may be used in conjunction with the system or method described in detail in the co-pending application. In another use, the first system and method may be used separately from the latter.
In embodiments of the invention, the SWF takes as input a “final” ranked list generated from each of the various vote counting/computing methods and/or systems, and the preferences of each source. The “final” ranked list may be generated using, for example, weighted voting systems, semi-proportional methods, delegates, Borda Count, inverted rank, run off, round robin, and/or a ranking method described in U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2). The SWF outputs a number that indicates how happy or satisfied the “society” of sources is with the results. Thus, in one application, multiple methods of combining are examined and evaluated, and the combining method that returns the highest SWF value is considered the “best” method. That combining method is then established as the combining method that application will use when determining preferences from future mashups combining information for those sources for those business purposes, for example. As discussed below, reevaluation of the combining method may be done periodically to optimize the quality of the results.
The present disclosure differs from traditional work in the field in several ways. For example, the disclosure addresses situations where, as noted, people are providing preferences in non-uniform ways (complaints, purchase, opinion posted, time, etc.). In such situations, ad hoc weights don't work well because ad hoc weights can only adjust for the deficiencies that exist at a single point in time. Consider the use of ad hoc weights in combining top-10 lists from Amazon.com and Barnes & Noble in 1995. In that year, Amazon.com ranks should have been weighted lower (having less weight in the overall scheme of the analysis) than Barnes & Noble ranks because Amazon.com opened its online store in July 1995. If the top-10 lists were compared today, the weights would differ. Thus, although ad hoc weights are useful when combining lists of preferences at one point in time, they need adjusting each time new data from the sources are recombined to account for, e.g., changes in the market, business cycles, seasons, time of day, new product releases (which could, for example, skew statistics for a few days), blitz marketing campaigns, events (e.g., Olympics® or Super Bowl®), etc. These real world changes have the potential of causing dramatic shifts in the rankings being reported. The ad hoc weight adjustments are time-dependent. If we calculate the rankings at a different point in time, the weights must be reconsidered, changed, and tuned each time we calculate the rankings. This can be particularly onerous depending on how often the combined rankings are calculated (in real time, daily, weekly, monthly, quarterly, annually, etc.), particularly if the tuning is done without the assistance of any computer-implemented algorithms.
In contrast, embodiments of the invention identify the most appropriate method to combine preferences from sources of different modalities by using a SWF appropriate for predefined objectives (e.g., business requirements). Thus, in analyzing and combining information on preferences, exemplary embodiments take into account, for example, business requirements to a level of granularity that ad hoc weights cannot.
Additionally, embodiments of the invention examine domains with orders of magnitude more “candidates” than “voters”, the reverse of most elections. Conventional voting techniques do not examine scenarios in which the number of “candidates” is orders of magnitude more than the number of “voters.” For example, the Borda function is intended for use in situations when there are a large number of voters and a small number of candidates, such as in a presidential election. Accordingly, embodiments of the invention examine vote computing techniques that are intended for use in scenarios in which the number of items being ranked (or “candidates”) is orders of magnitude more than the number of sources ranking the items (“voters”), the opposite of conventional elections. Thus, such a vote computing technique may combine the information on preferences into a combined list ranking the preferences in an application in which the number of preferences being ranked is at least an order of magnitude greater than the number of sources.
FEATURES OF EXEMPLARY EMBODIMENT
Exemplary embodiments of the invention determine preferences from cross-modality information mashups based on a constructed or selected social welfare function (SWF).
The information 1010 includes information from multiple sources (e.g., Source1, Source2, Source3, etc.). The multiple sources are of varying modalities. Modalities may be expressed as having two major dimensions: intentional versus unintentional, and consuming (passive) versus producing (creative). Intentional activities are those where a user, for example, has had to take steps to “make their mark.” Examples in the online arena would be navigating to a particular page or typing a name into a search bar. Intentional activities are stronger indicators of interest than unintentional activities. Creative, producing activities are, for example, those where the user takes the time to author a post or compose a response. Passive, consuming activities may involve watching or reading something created by someone else. Creative activities, taking more time and attention, indicate more interest than passive activities.
The information 1010 is communicated to each of the vote computing methods (e.g., Vote computing method1, Vote computing method2, etc.).
Accordingly, the SWF may be selected or custom constructed to fit the situation. In one embodiment, the SWF is selected from among a set of SWFs, e.g., a set including the Precision Optimal Aggregation SWF (Pswf) and the Spearman Footrule SWF (Sswf). The Pswf measures how many items from each source's top-n list are in the “final” ranked list (the single list which merges ranked items from each source). For example, in one application, the Pswf measures how many artists from each source's top-10 list are in an overall top-10 list created using the Borda count technique. One exemplary embodiment uses a Precision Optimal Aggregation SWF defined as:
$$P_{swf} = \sum_{S} \min\left(2\,|T_S \cap T|,\ 10\right),$$
for top-10 lists $T_S$ for each source $S$ and top-10 list $T$ overall.
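The Precision Optimal Aggregation score can be sketched in a few lines. The function name pswf and the parameters n and cap are assumptions introduced for illustration; with the default values they correspond to the top-10 lists and the constant 10 in the definition above. The sample lists are hypothetical.

```python
def pswf(source_lists, combined, n=10, cap=10):
    """Sum over sources of min(2 * |T_S intersect T|, cap) for top-n lists."""
    top_overall = set(combined[:n])
    total = 0
    for source in source_lists:
        overlap = len(set(source[:n]) & top_overall)
        total += min(2 * overlap, cap)
    return total

# Hypothetical top-10 lists; each character stands for one ranked item.
combined = list("abcdefghij")
source_1 = list("abcdefghij")  # all 10 items shared: min(20, 10) = 10
source_2 = list("abcxyzuvwt")  # 3 items shared:      min(6, 10)  = 6
source_3 = list("klmnopqrst")  # no items shared:     min(0, 10)  = 0

score = pswf([source_1, source_2, source_3], combined)
```

Because of the cap, a source stops contributing additional points once half of its top-10 items appear in the overall list, so the score favors combined lists that represent every source at least moderately over lists that reproduce one source exactly.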
The Spearman Footrule SWF (Sswf) emphasizes preservation of position in the rankings. The Sswf is an approximation of a related SWF Kendall tau distance. The Sswf is less computationally intensive (minutes versus days) relative to the Kendall tau distance. One exemplary embodiment uses a Spearman Footrule SWF defined as:
$$S_{swf} = \sum_{S} \sum_{a=1}^{10} \max\left(10 - |r_a - r_{as}|,\ 0\right).$$
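A minimal sketch of the Spearman Footrule score follows. It rests on assumptions that the definition does not state: the inner sum is taken over the items in the overall top-10 list, the first rank term is the item's 1-based position in the overall list, the second is its 1-based position in the list of a given source, and an item missing from a source's list contributes nothing for that source. The function name sswf and the sample lists are hypothetical.

```python
def sswf(source_lists, combined, n=10):
    """Sum over sources and top-n items of max(n - |rank difference|, 0)."""
    total = 0
    for source in source_lists:
        rank_in_source = {item: i for i, item in enumerate(source, start=1)}
        for rank_overall, item in enumerate(combined[:n], start=1):
            r_as = rank_in_source.get(item)
            if r_as is None:
                continue  # assumption: an unranked item earns no points
            total += max(n - abs(rank_overall - r_as), 0)
    return total

combined = list("abcdefghij")
same_order = list("abcdefghij")  # every item in place: 10 points each
reversed_order = list("jihgfedcba")

score_same = sswf([same_order], combined)
score_reversed = sswf([reversed_order], combined)
```

A source whose order matches the overall list exactly earns the maximum of 100 points, while a fully reversed list still earns 50, which shows that this score rewards proximity of position rather than mere membership.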
In use, the SWF takes as input a “final” ranked list and the preferences of each source. The outcome is a score where points are awarded for increased social welfare of a ranking system. In this way, embodiments quantitatively measure the “happiness” of each contributing source with the overall “final” ranking.
The graphs in the CS column express the contribution to the combined ranking for the artist from each source. In the columns labeled CS, the bars from left to right in each cell correspond to the sources.
Accordingly, using the Precision Optimal Aggregation SWF, the Semi-Proportional vote counting method is identified among the four as the combining technique that produces a combined list most congruent with the subjective values embodied by the Precision Optimal Aggregation SWF.
Accordingly, using the Spearman Footrule SWF, the Weighted Votes vote counting method is identified as the combining technique among the four that produces a combined list most congruent with the subjective values embodied by the Spearman Footrule SWF.
Accordingly, embodiments of the invention include a method that includes identifying a vote computing method that produces the highest SWF score. The evaluation of which vote computing method is most appropriate for a given set of objectives (e.g., business objectives) is performed by the SWF. The SWF takes the lists of the voters' preferences (the lists from the various sources), along with the outcome of the vote (the combined/consensus list), and produces for each vote computing method a “score” indicating the “satisfaction” in the outcome. The highest score indicates the highest satisfaction. That is, the vote computing method that elevates/accounts for/values those characteristics that are valued by the business objectives (as modeled using the SWF) in an optimal fashion is the vote computing method that gets the highest score from the SWF. Since an embodiment of the invention will output a final combined list based on the SWF, the quality of the output is affected by how well the SWF embodies the subjective values driving the undertaking and incorporates those values into an objective function.
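The selection step can be sketched as follows. Everything named here is an illustrative assumption: the two candidate methods (a Borda count and a trivial method that returns the last source's list unchanged), the footrule-style scoring function, and the four-item sample data stand in for whatever vote computing methods and SWF a given embodiment uses.

```python
def borda(source_lists):
    """Borda count over ranked lists: n points for first place, 1 for last."""
    scores = {}
    for ranking in source_lists:
        n = len(ranking)
        for pos, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (n - pos)
    return sorted(scores, key=lambda item: (-scores[item], item))

def last_source(source_lists):
    """Toy method: adopt the final source's ranking as the combined list."""
    return list(source_lists[-1])

def footrule_swf(source_lists, combined):
    """Footrule-style score: closer positions earn more points."""
    n = len(combined)
    total = 0
    for source in source_lists:
        rank = {item: i for i, item in enumerate(source, start=1)}
        for r, item in enumerate(combined, start=1):
            if item in rank:
                total += max(n - abs(r - rank[item]), 0)
    return total

def select_best(methods, swf, source_lists):
    """Run every method, score each combined list, return the top scorer."""
    combined = {name: m(source_lists) for name, m in methods.items()}
    scores = {name: swf(source_lists, lst) for name, lst in combined.items()}
    best = max(scores, key=scores.get)
    return best, combined[best], scores

sources = [["a", "b", "c", "d"], ["b", "a", "d", "c"], ["c", "b", "a", "d"]]
methods = {"borda": borda, "last_source": last_source}
best_name, best_list, scores = select_best(methods, footrule_swf, sources)
```

With this sample data the Borda list scores 40 and the toy method scores 38, so the Borda method and its combined list are returned. Swapping in a different SWF can change which method wins, which is the point of keeping the scoring function separate from the methods being compared.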
In an exemplary embodiment, to improve the quality of the system or method's output, a large collection of voting methods or combination techniques is enumerated and examined, over multiple time periods of sample data, to identify which voting method or combination technique produces the highest SWF score. An exemplary embodiment examines the results of various “voting” methods using several weeks' or months' worth of data.
For parameterized voting techniques (such as the technique described in U.S. patent application Ser. No. ______ (having attorney docket number ARC920080029US2)), the parameter(s) may also be optimized to improve the quality of the system. For example, in use, embodiments may determine (e.g., by searching for or computing) a parameter value that is most congruent with enabling the parameterized voting method to output a combined list that reflects the values, e.g., the business objectives.
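One way to search for such a parameter value is a simple grid search, sketched below. The parameterized technique of the co-pending application is not described here, so a weighted Borda count with a single tunable weight on the first source is used purely as a stand-in. The scoring function, the candidate values, and the sample data are likewise hypothetical.

```python
def weighted_borda(source_lists, weights):
    """Borda count in which each source's points are scaled by its weight."""
    scores = {}
    for weight, ranking in zip(weights, source_lists):
        n = len(ranking)
        for pos, item in enumerate(ranking):
            scores[item] = scores.get(item, 0.0) + weight * (n - pos)
    return sorted(scores, key=lambda item: (-scores[item], item))

def overlap_swf(source_lists, combined, n=2):
    """Count how many of each source's top-n items reach the overall top-n."""
    top = set(combined[:n])
    return sum(len(set(source[:n]) & top) for source in source_lists)

def tune_first_weight(source_lists, swf, candidates):
    """Try each candidate weight for the first source; keep the best scorer."""
    best_weight, best_score = None, float("-inf")
    for w in candidates:
        weights = [w] + [1.0] * (len(source_lists) - 1)
        score = swf(source_lists, weighted_borda(source_lists, weights))
        if score > best_score:  # strict comparison keeps the earliest winner
            best_weight, best_score = w, score
    return best_weight, best_score

sources = [["a", "b", "c", "d"], ["c", "d", "a", "b"], ["c", "d", "b", "a"]]
best_weight, best_score = tune_first_weight(sources, overlap_swf, [0.5, 1.0, 5.0])
```

In this sample the lowest candidate weight scores highest because the first source disagrees with the other two, and down-weighting it lets the overall top-2 match the majority. A finer grid or a numerical optimizer could replace the candidate list where the parameter is continuous.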
Moreover, the characteristics of many sources change over time. Thus, in an exemplary embodiment, even after a vote counting method is established, the congruency of the method to the business objectives (as embodied by the SWF) is revisited periodically (e.g., quarterly) to make sure that changes in the underlying data sources have not reduced the quality of the results. In certain applications, the additional optimization techniques described are also repeated periodically.
Thus, an exemplary embodiment of the invention applies voting theory to cross-modality information mashups to construct a combined list ranking preferences. An SWF is used to select from various voting methods based on data from various cross-modality sources. In use, the sources and associated data are dependent on the domain. For example, in the domain of interest of wine, the source or associated data may be results of wine tasting parties, professional reviews (e.g., scores from 1-10 in different categories), sales, change in sales, comments posted by average users, and mentions in mass media.
At 4060, for each combined list, the combined list is inputted into the SWF to compute a score. The score indicates congruency between the combined list and value(s) embodied by the SWF, e.g., a business objective. At 4070, the combined list of the vote computing method associated with the highest score is outputted. In one embodiment, additionally or alternatively, the vote computing method associated with the highest score is outputted.
Although labeled with the numbers above, it should be understood that embodiments of this invention may execute the method 4000 and/or the method 5000 in a non-sequential order as appropriate and still remain in accordance with the invention. For example, although numbered 4010, embodiments of the present invention may receive the social welfare function before, during, or after 4020, 4030, 4040, and/or 4050. Similarly, although numbered 5010, embodiments of the present invention may create the social welfare function before, during, or after 5020 and/or 5030.
Moreover, while ranked lists are described in detail herein, in other embodiments, the sources provide additional information on preference (such as total numbers) for input into voting systems that can make use of such additional information.
Embodiments of the invention can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment containing both hardware and software elements. In a preferred embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, and microcode.
Furthermore, the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer-readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The medium can be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device). Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk, or an optical disk. Current examples of optical disks include compact disk-read-only memory (CD-ROM), compact disk-read/write (CD-R/W), and DVD.
A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices) can be coupled to the system either directly or through intervening I/O controllers.
Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems and Ethernet cards are just a few of the currently available types of network adapters.
The computer system can include a display interface 48 that forwards graphics, text, and other data from the communication infrastructure 46 (or from a frame buffer not shown) for display on a display unit 50. The computer system also includes a main memory 52, preferably random access memory (RAM), and may also include a secondary memory 54. The secondary memory 54 may include, for example, a hard disk drive 56 and/or a removable storage drive 58, representing, for example, a floppy disk drive, a magnetic tape drive, or an optical disk drive. The removable storage drive 58 reads from and/or writes to a removable storage unit 60 in a manner well known to those having ordinary skill in the art. Removable storage unit 60 represents, for example, a floppy disk, a compact disc, a magnetic tape, or an optical disk, etc. which is read by and written to by removable storage drive 58. As will be appreciated, the removable storage unit 60 includes a computer readable medium having stored therein computer software and/or data.
In alternative embodiments, the secondary memory 54 may include other similar means for allowing computer programs or other instructions to be loaded into the computer system. Such means may include, for example, a removable storage unit 62 and an interface 64. Examples of such means may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units 62 and interfaces 64 which allow software and data to be transferred from the removable storage unit 62 to the computer system.
The computer system may also include a communications interface 66. Communications interface 66 allows software and data to be transferred between the computer system and external devices. Examples of communications interface 66 may include a modem, a network interface (such as an Ethernet card), a communications port, or a PCMCIA slot and card, etc. Software and data transferred via communications interface 66 are in the form of signals which may be, for example, electronic, electromagnetic, optical, or other signals capable of being received by communications interface 66. These signals are provided to communications interface 66 via a communications path (i.e., channel) 68. This channel 68 carries signals and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link, and/or other communications channels.
In this document, the terms “computer program medium,” “computer usable medium,” and “computer readable medium” are used to generally refer to media such as main memory 52 and secondary memory 54, removable storage drive 58, and a hard disk installed in hard disk drive 56.
Computer programs (also called computer control logic) are stored in main memory 52 and/or secondary memory 54. Computer programs may also be received via communications interface 66. Such computer programs, when executed, enable the computer system to perform the features of the present invention as discussed herein. In particular, the computer programs, when executed, enable the processor 44 to perform the features of the computer system. Accordingly, such computer programs represent controllers of the computer system.
In the depicted example, server 104 and server 106 are connected to network 102 along with storage unit 108. In addition, clients 110, 112, and 114 are also connected to network 102. These clients 110, 112, and 114 may be, for example, personal computers, network computers, or the like. In the depicted example, server 104 provides data, such as boot files, operating system images, and applications to clients 110, 112, and 114. Clients 110, 112, and 114 are clients to server 104 in the depicted example. Distributed data processing system 100 may include additional servers, clients, and other devices not shown.
In the depicted example, distributed data processing system 100 is the Internet with network 102 representing a worldwide collection of networks and gateways that use the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of protocols to communicate with one another. At the heart of the Internet is a backbone of high-speed data communication lines between major nodes or host computers, consisting of thousands of commercial, governmental, educational and other computer systems that route data and messages. Of course, the distributed data processing system 100 may also be implemented to include a number of different types of networks, such as for example, an intranet, a local area network (LAN), a wide area network (WAN), or the like.
In one use, as an example, clients 110 and 112 collect information (e.g., from user input) and provide it to server 104. Server 104 stores the information in storage 108. Server 106 contains hardware devices and software tools to combine the information (e.g., into information mashups and/or combined/consensus lists) according to the present invention. Server 106 transmits the combined information to server 104 and/or clients 110, 112, and/or 114, for example.
In use, client 114 may provide the server with business requirements embodied in an SWF. The server determines the best vote counting method to use for that particular application based on the SWF and the information stored, e.g., in storage 108. The server may transmit an identification of the vote counting method to the client.
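The selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: it assumes every source has already been converted into a ranked list over the same set of items, it uses Borda count and a first-place (plurality) count as two example vote computing methods, and it uses a negated total Spearman footrule distance as an example SWF. The names `borda`, `plurality`, `footrule_swf`, and `select_method` are hypothetical and introduced only for this sketch.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Ranking = List[str]
VoteMethod = Callable[[Sequence[Ranking]], Ranking]
SWF = Callable[[Ranking], float]


def borda(rankings: Sequence[Ranking]) -> Ranking:
    """Borda count: an item at position p in a list of n items earns n - 1 - p points."""
    points: Dict[str, int] = {}
    for ranking in rankings:
        n = len(ranking)
        for pos, item in enumerate(ranking):
            points[item] = points.get(item, 0) + (n - 1 - pos)
    # Highest point total first; ties broken alphabetically for determinism.
    return sorted(points, key=lambda item: (-points[item], item))


def plurality(rankings: Sequence[Ranking]) -> Ranking:
    """Order items by the number of sources that rank them first."""
    firsts: Dict[str, int] = {item: 0 for ranking in rankings for item in ranking}
    for ranking in rankings:
        firsts[ranking[0]] += 1
    return sorted(firsts, key=lambda item: (-firsts[item], item))


def footrule_swf(sources: Sequence[Ranking]) -> SWF:
    """Example SWF (assumption): negated sum of Spearman footrule distances
    between the combined list and each source, so a higher score is better."""
    def swf(combined: Ranking) -> float:
        position = {item: i for i, item in enumerate(combined)}
        return -float(sum(
            abs(position[item] - i)
            for ranking in sources
            for i, item in enumerate(ranking)
        ))
    return swf


def select_method(
    methods: Dict[str, VoteMethod],
    sources: Sequence[Ranking],
    swf: SWF,
) -> Tuple[str, Ranking, float]:
    """Combine the sources with each method, score each combined list with
    the SWF, and return (method name, combined list, score) for the best one."""
    if not methods:
        raise ValueError("at least one vote computing method is required")
    results = [(name, method(sources)) for name, method in methods.items()]
    scored = [(name, combined, swf(combined)) for name, combined in results]
    # max() keeps the first entry on ties, so dictionary order breaks ties.
    return max(scored, key=lambda entry: entry[2])


if __name__ == "__main__":
    # Hypothetical preference lists, e.g. already derived from four sources.
    sources = [["A", "B", "C"], ["A", "B", "C"], ["B", "C", "A"], ["C", "B", "A"]]
    methods = {"borda": borda, "plurality": plurality}
    name, combined, score = select_method(methods, sources, footrule_swf(sources))
    print(name, combined, score)
```

In this sketch the SWF is passed in as an ordinary function, so a custom SWF built from business objectives could be substituted without changing `select_method`.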
References in the claims to an element in the singular are not intended to mean “one and only” unless explicitly so stated, but rather “one or more.” All structural and functional equivalents to the elements of the above-described exemplary embodiment that are currently known or later come to be known to those of ordinary skill in the art are intended to be encompassed by the present claims. No claim element herein is to be construed under the provisions of 35 U.S.C. section 112, sixth paragraph, unless the element is expressly recited using the phrase “means for” or “step for.”
Thus, a system and method for determining preferences from information mashups and, in particular, for determining preferences from cross-modality information mashups based on a social welfare function is disclosed. While the preferred embodiments of the present invention have been described, it will be understood that modifications and adaptations to the embodiments shown may occur to one of ordinary skill in the art without departing from the scope of the present invention as set forth in the claims. Thus, the scope of this invention is to be construed according to the claims and not limited by the specific details disclosed in the exemplary embodiments.
Claims
1. A computer-implemented method for determining preferences from cross-modality information mashups, the method comprising:
- receiving a social welfare function (SWF);
- identifying two or more vote computing methods;
- for each of the two or more vote computing methods, using the vote computing method to combine information on preferences into a combined list ranking the preferences, wherein the information is from a set of two or more sources, and wherein the set is heterogeneous in modality;
- for each combined list, inputting the combined list into the SWF to compute a score; and
- outputting the combined list of the vote computing method associated with the highest score.
2. The method of claim 1, wherein the set of two or more sources includes data from websites indicating preferences within a certain domain of interest.
3. The method of claim 1, wherein the information from the set of two or more sources includes structured data from a first source and unstructured data from a second source.
4. The method of claim 3, further comprising processing the unstructured data using natural language mining.
5. The method of claim 1, wherein receiving the SWF comprises receiving a custom constructed SWF based on a set of business objectives.
6. The method of claim 1, wherein the two or more vote computing methods include a parameterized vote computing method.
7. The method of claim 1, wherein the number of preferences being ranked is at least an order of magnitude more in number than the number of sources.
8. A computer program product for determining preferences from cross-modality information, said computer program product comprising:
- a computer readable medium;
- first program instructions stored on the computer readable medium, the first program instructions to identify two or more vote computing methods;
- second program instructions stored on the computer readable medium, the second program instructions to, for each of the two or more vote computing methods, use the vote computing method to combine information on preferences into a combined list ranking the preferences, wherein the information is from a set of two or more sources, and wherein the set is heterogeneous in modality;
- third program instructions stored on the computer readable medium, the third program instructions to, for each combined list, input the combined list into a social welfare function to compute a score; and
- fourth program instructions stored on the computer readable medium, the fourth program instructions to output the vote computing method associated with the highest score.
9. The computer program product of claim 8, further comprising:
- fifth program instructions stored on the computer readable medium, the fifth program instructions to output the combined list of the vote computing method associated with the highest score.
10. The computer program product of claim 8, wherein the information from the set of two or more sources includes structured data from a first source and unstructured data from a second source.
11. The computer program product of claim 10, wherein the second source is selected from the group consisting of: an online blog, an online forum, and an online social networking website.
12. The computer program product of claim 8, wherein the social welfare function is selected from the group consisting of: Bergson-Samuelson, Precision Optimal Aggregation, and Spearman Footrule.
13. The computer program product of claim 8, wherein the social welfare function is a custom constructed social welfare function.
14. The computer program product of claim 8, wherein the two or more vote computing methods include a parameterized vote computing method.
15. The computer program product of claim 8, wherein the number of preferences being ranked is at least an order of magnitude more in number than the number of sources.
16. A system for determining preferences from cross-modality information, the system comprising:
- a communications interface;
- memory storing computer usable program code; and
- a processor coupled to the communications interface to receive information on preferences from an external device and coupled to the memory to execute the computer usable program code stored on the memory; wherein the computer usable program code comprises: computer usable program code configured to identify two or more vote computing methods; computer usable program code configured to, for each of the two or more vote computing methods, use the vote computing method to combine the information on preferences into a combined list ranking the preferences, wherein the information is from a set of two or more sources, and wherein the set is heterogeneous in modality; computer usable program code configured to, for each combined list, input the combined list into a social welfare function to compute a score; and computer usable program code configured to identify the vote computing method associated with the highest score.
17. The system of claim 16, wherein the computer usable program code further comprises:
- computer usable program code configured to output the combined list of the vote computing method associated with the highest score.
18. The system of claim 16, wherein the set of two or more sources includes data from websites indicating preferences within a certain domain of interest.
19. The system of claim 16, wherein the information from the set of two or more sources includes structured data from a first source and unstructured data from a second source.
20. The system of claim 16, wherein the processor is coupled to the communications interface to receive the information on preferences from a server.
Type: Application
Filed: Aug 20, 2008
Publication Date: Oct 1, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION (Armonk, NY)
Inventors: Varun Bhagwan (San Jose, CA), Tyrone Wilberforce Andre Grandison (San Jose, CA), Daniel Frederick Gruhl (San Jose, CA), Jan Hendrik Pieper (San Jose, CA)
Application Number: 12/195,126
International Classification: G06F 17/30 (20060101);