Automatic Mail Rejection Feature
A spam-defining system defines rules about emails based on a user's reactions to those emails. A user can delete an email as spam, as not spam, or without committing to whether the email is spam or not. If the user indicates whether the email is spam or not spam, characteristics of the email are used to update a database. Incoming emails are compared against the database to determine the likelihood that they are spam.
This application is a divisional of Ser. No. 09/690,002, filed Oct. 16, 2000; which claims priority from provisional application No. 60/203,729, filed May 12, 2000, now lapsed.
BACKGROUND
This invention relates to an automatic mail rejection feature in an e-mail program.
E-mail can be an inexpensive and effective way of sending information. Because of this, a recurrent problem is “spam”, or the sending of unwanted email to a certain person. Once an e-mail address gets on a spammer's list, the person can be barraged with junk email. Various attempts have been made to combat this problem.
For example, some web e-mail programs include the ability to block further mail from a specified sender. When junk mail is received from a specified address, the control is actuated. Further mail from that specified sender is then blocked, presumably automatically deleted or sent to the trash.
Certain laws also cover spamming, and require that each e-mail that is sent unsolicited have a way of unsubscribing from the list. Spammers combat both of these measures by continually changing their name and/or changing their return address.
Some e-mail programs allow a user to manually set criteria for rejection of incoming email. For example, if an incoming e-mail is from a domain that has many known spammers, many people may simply set their program to delete it. However, this has the unintended extra effect of also removing desired email, at times.
In addition, the automatic rejection feature does nothing to resolve the traffic caused by junk e-mail.
SUMMARY
The present application teaches an automatic system which automatically recognizes certain aspects of undesired messages such as junk email and undesired Internet content. The system automatically produces recommendations of criteria to use in automatically removing undesired information.
In an email embodiment described herein, these criteria can be automatically enforced or can be presented to the user as a table of options. In addition, the system can look for keywords in the e-mail, and can automatically postulate strategies for rules based on these keywords.
These and other aspects will be described in detail with reference to the accompanying drawings, wherein:
A first embodiment describes an e-mail program which allows automatic rejection of unwanted messages. The embodiment runs on a computer shown in
The likelihood of spam quotient can be displayed as a number as shown in
One of the buttons 106 on the toolbar requests removal of the high spam likelihood messages from the inbox. This enables, in a single click, removing all high likelihood of spam messages. Another button 120 is an options button which brings up the options menu of
The function buttons in
Another button 112 is also provided indicating “delete the message; not spam”. Therefore, the user is presented with three different options: delete the message without indicating whether it is spam or not, delete the message while indicating that it is spam, or delete the message indicating that it is not spam.
The latter two options are used to update the rules in the rules database as described in further detail herein. Hence, this option allows adding an incoming e-mail message to the spam list, when it is determined to be likely to be spam.
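As a minimal sketch of the three delete options described above, the handling could look like the following. The names `DeleteAction`, `handle_delete` and the dictionary layout of `rules_db` are hypothetical and chosen only for illustration; the application does not specify a data structure.

```python
from enum import Enum


class DeleteAction(Enum):
    DELETE_ONLY = "delete"        # delete without committing to spam or not spam
    DELETE_AS_SPAM = "spam"       # delete while indicating the message is spam
    DELETE_NOT_SPAM = "not_spam"  # button 112: "delete the message; not spam"


def handle_delete(action, message_fields, rules_db):
    """Record the message's fields in the matching list, if the user committed."""
    if action is DeleteAction.DELETE_AS_SPAM:
        rules_db["spam"].extend(message_fields)
    elif action is DeleteAction.DELETE_NOT_SPAM:
        rules_db["not_spam"].extend(message_fields)
    # DELETE_ONLY leaves the rules database unchanged.
    return rules_db


# Hypothetical in-memory rules database with one list per category.
rules_db = {"spam": [], "not_spam": []}
handle_delete(DeleteAction.DELETE_AS_SPAM, ["sender@getrichquick.com"], rules_db)
handle_delete(DeleteAction.DELETE_ONLY, ["friend@example.com"], rules_db)
```

Only the latter two actions change the database, which matches the statement that those two options are the ones used to update the rules.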
Those messages which are likely not spam are shown in a neutral color such as green or black. The messages which are questionable are shown in a cautionary color, such as yellow highlight. Finally, the messages which are likely to be spam are shown in an alert color such as red.
A second display option displays only those messages which are likely to represent desired messages. Hence, only the green and yellow messages are displayed. According to one embodiment, the messages are sorted by date and time received. Within each day, the messages are sorted by likelihood of being spam. The spam-likely messages, which are determined to be likely to represent spam, may be put into a separate folder; here shown as “spam-likely messages”.
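The sort order described in this embodiment (by day received, then by likelihood of being spam within each day) could be sketched as below. The field names `received` and `spam_likelihood` are hypothetical, and two details are assumptions because the text does not state them: newer days are listed first, and within a day the least spam-like messages come first.

```python
from datetime import datetime


def sort_inbox(messages):
    """Order by calendar day (newest first), then by ascending spam likelihood."""
    return sorted(
        messages,
        # toordinal() collapses a datetime to its calendar day; negating it
        # puts newer days first. Time of day is ignored within a day.
        key=lambda m: (-m["received"].toordinal(), m["spam_likelihood"]),
    )


inbox = [
    {"id": "a", "received": datetime(2000, 10, 16, 9, 0), "spam_likelihood": 0.9},
    {"id": "b", "received": datetime(2000, 10, 16, 15, 0), "spam_likelihood": 0.1},
    {"id": "c", "received": datetime(2000, 10, 15, 12, 0), "spam_likelihood": 0.5},
]
ordered_ids = [m["id"] for m in sort_inbox(inbox)]
```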
The messages which are likely to represent undesired information can be read by the user. If not read by the user, they are kept in the folder for a specified period of time, e.g. thirty days, before being deleted.
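The retention behavior for the spam-likely folder could be sketched as follows. This assumes a thirty-day period, as in the example above, and assumes that messages the user has read are kept; the text does not say what happens to read messages, so that part is a guess. The function and field names are hypothetical.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # "thirty days" is the example period in the text


def purge_spam_folder(folder, now, retention=RETENTION):
    """Keep messages that were read, or that are younger than the retention period."""
    return [m for m in folder if m["read"] or now - m["received"] < retention]


now = datetime(2000, 12, 1)
folder = [
    {"id": 1, "received": datetime(2000, 10, 1), "read": False},   # old and unread: deleted
    {"id": 2, "received": datetime(2000, 11, 20), "read": False},  # recent: kept
    {"id": 3, "received": datetime(2000, 10, 1), "read": True},    # read: kept (assumption)
]
kept_ids = [m["id"] for m in purge_spam_folder(folder, now)]
```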
The incoming messages are processed based on rules. For example, if one does not want to be on a mailing list about XXX type items, then messages that include the text “free xxx pictures” may be likely to be spam. However, other people may find those messages to be highly desirable. Similarly, messages about get rich quick schemes may be trash to one person, treasure to another.
The present system allows customization of which emails to remove as spam, by defining rules. Each time a message is deleted as spam, a number of aspects about that message are stored. A database is used to store the message. This database may include relative weighting of different aspects.
The sender of the message is often a highly determinative factor. For example, if a specific sender sends one spam message, the same sender is very likely to be sending another spam message later on. Therefore, a first item in the database is the “received from” field 202. In addition to the specific sender, however, the domain of the sender often gives information. This domain is reviewed at 204. If the domain is a common domain such as Yahoo.com or Hotmail.com, then the sender's domain may not be probative. If, however, the domain name is uncommon, such as getrichquick.com or the like, then it is more likely that other messages from that domain would be spam. Further, many messages from a common domain may themselves be probative. The domain information is weighted accordingly.
The domain name from an item is added to the rules database from field 204. Another field 206 stores an indication of whether the domain is a common domain or an uncommon/specific domain. This determination is initially zero, and is changed depending on the number of hits of domains that become present in the database. For example, when two different addresses from the same domain become spam, then the value presumptively becomes H (likely to be spam). When two different addresses from the same domain are received, one spam, the other not, then the value presumptively becomes L.
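The update of field 206 could be sketched as below. The function name `domain_flag` and the `(address, is_spam)` record format are hypothetical; the values 0, "H" and "L" follow the text. How further hits beyond the two-address examples should shift the value is not specified, so this sketch only covers the stated cases.

```python
def domain_flag(records):
    """Derive the field 206 value for one domain.

    records: iterable of (address, is_spam) pairs seen for that domain.
    """
    spam_addrs = {addr for addr, is_spam in records if is_spam}
    ham_addrs = {addr for addr, is_spam in records if not is_spam}
    # Different addresses from the domain, some spam and some not: presumptively L.
    if spam_addrs and (ham_addrs - spam_addrs):
        return "L"
    # Two or more different addresses from the domain, all spam: presumptively H.
    if len(spam_addrs) >= 2:
        return "H"
    # Not enough hits yet: the determination stays at its initial value.
    return 0
```

For instance, two spam senders at the same uncommon domain yield "H", while one spam sender and one legitimate sender at the same domain yield "L".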
Each sentence and field in the e-mail, including the subject, the text of the body, links in the email, and any others, is then stored as a separate field.
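One way to break a message into separately stored fields is sketched below, using Python's standard `re` module. The function name `extract_fields`, the returned key names, and the simple sentence and link patterns are assumptions for illustration; real messages would need more robust parsing.

```python
import re


def extract_fields(sender, subject, body):
    """Split a message into the separately stored fields described above."""
    # Domain is everything after the last "@" in the sender address.
    domain = sender.rsplit("@", 1)[-1].lower()
    # Naive link pattern: runs of non-whitespace starting with http:// or https://.
    # Trailing punctuation directly attached to a link would be captured too.
    links = re.findall(r"https?://\S+", body)
    # Naive sentence split: break on whitespace that follows ., ! or ?.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", body) if s.strip()]
    return {
        "from": sender.lower(),
        "domain": domain,
        "subject": subject,
        "sentences": sentences,
        "links": links,
    }


fields = extract_fields(
    "Bob@GetRichQuick.com",
    "Make money",
    "Earn cash fast! Visit http://getrichquick.com/now today.",
)
```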
Analogous information may also be categorized from emails that are deleted as “not spam”. This provides a database of characteristics that are likely to represent spam messages, and other characteristics that are less likely to represent spam messages. Matching with the databases changes the scoring of the message accordingly.
Once the database becomes sufficiently large, it may become time-consuming to compare incoming messages with the database. Indexing approaches can be used to increase the speed of the comparison. The detailed comparison may also be done in the background; the message may be displayed, and its classification displayed only some time later.
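The text does not name a particular indexing approach. As one possible illustration, a hash-based index keyed on the stored field value lets each incoming field be checked with a single lookup instead of a scan over the whole database. The names `build_index` and `lookup` and the `(value, label)` entry format are hypothetical.

```python
from collections import defaultdict


def build_index(db_entries):
    """Map each stored field value to the labels ("spam", "not_spam") it carries.

    db_entries: iterable of (field_value, label) pairs.
    """
    index = defaultdict(set)
    for value, label in db_entries:
        index[value.lower()].add(label)
    return index


def lookup(index, message_fields):
    """Return the labels for every message field that appears in the index."""
    # The membership test avoids creating empty entries in the defaultdict.
    return {f: index[f.lower()] for f in message_fields if f.lower() in index}


index = build_index([
    ("free xxx pictures", "spam"),
    ("getrichquick.com", "spam"),
    ("friend@example.com", "not_spam"),
])
hits = lookup(index, ["Free XXX Pictures", "hello there"])
```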
Similarly, the e-mail and its fields can be compared with non-spam indicative email. An e-mail which is not spam can carry negative scores, for example. Finding the e-mail address to be on the non-spam list, for example, can carry a score of negative 100, or can immediately abort the process with an indication of non-spam.
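The scoring described here could be sketched as follows. The score of negative 100 for a sender on the non-spam list comes from the text; the positive weights in `spam_weights`, the function name `score_message`, and the returned tuple are assumptions made for the example.

```python
NON_SPAM_SENDER_SCORE = -100  # example value given in the text


def score_message(fields, spam_weights, non_spam_senders, abort_on_known_sender=False):
    """Return (score, decided_non_spam) for one incoming message."""
    score = 0
    if fields["from"] in non_spam_senders:
        if abort_on_known_sender:
            # Immediately abort the process with an indication of non-spam.
            return score, True
        score += NON_SPAM_SENDER_SCORE
    # Each item that matches the spam database adds its (hypothetical) weight.
    for item in fields["items"]:
        score += spam_weights.get(item, 0)
    return score, False


spam_weights = {"free xxx pictures": 40}
non_spam_senders = {"friend@example.com"}
message = {"from": "friend@example.com", "items": ["free xxx pictures"]}

scored = score_message(message, spam_weights, non_spam_senders)
aborted = score_message(message, spam_weights, non_spam_senders, abort_on_known_sender=True)
```

Both variants mentioned in the text are covered: the known sender either contributes a large negative score or ends the comparison outright.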
If a message has few matches to the database, it may be characterized as unknown or cautionary (yellow). Similarly, mixed signals (some matches to the spam database and some to the non-spam database) may result in an unknown result.
The total score for an e-mail is assessed, and this total score is used to assess if the e-mail is spam or not. If the e-mail is determined to be spam, then it is appropriately processed.
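Turning the total score into one of the three display categories (green, yellow, red) could be sketched as below. The thresholds and the minimum match count are hypothetical values chosen for the example; the application does not give numbers.

```python
def classify(total_score, match_count, min_matches=2, spam_threshold=50, ham_threshold=0):
    """Map a total score to a display color.

    All threshold defaults are illustrative assumptions.
    """
    # Few matches to the database: unknown or cautionary.
    if match_count < min_matches:
        return "yellow"
    if total_score >= spam_threshold:
        return "red"      # likely to be spam
    if total_score <= ham_threshold:
        return "green"    # likely not spam
    # Mixed signals land between the thresholds and stay cautionary.
    return "yellow"
```

With these defaults, a score of 80 backed by five matches is red, a score of -100 is green, and a score of 80 backed by only one match remains yellow.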
Many different rules databases can be used.
Such modifications are intended to be encompassed.
Claims
1. A method, comprising:
- using a computer for obtaining a message;
- monitoring a user's actions on said computer with respect to said message;
- using the user's actions for automatically forming rules indicating the desirability of said message;
- analyzing parts of said message, where said parts include at least one hyperlink within said message, and at least information within said message indicative of an internet address within said message; and
- said computer using said rules to assess desirability of other messages.
2. A method as in claim 1, wherein said other messages are electronic mail messages.
3. A method as in claim 1 wherein said user's actions include a specific way that the user deletes said message using the computer.
4. A method as in claim 1 wherein said desirability comprises whether said message is a spam e-mail message.
5. A method, comprising:
- using a computer for determining a plurality of characteristics of an unwanted electronic message;
- using the computer for forming a list with said plurality of characteristics;
- using the computer for receiving an incoming electronic message different than said unwanted electronic message, and forming a score of the incoming message by comparing said incoming message with said list and determining commonalities between said incoming message and said list, wherein said comparing comprises determining a domain of the sender, and comparing said domain of the sender with information about spam messages in the list, to obtain a higher probability of spam when information about all senders from a specific domain in said database represent spam, and to represent a lower probability of spam when some senders from said domain in said database represent spam and other senders from said domain in said database do not represent spam;
- using the computer for defining said incoming message as likely being spam if said score is within a predetermined range; and
- using the computer for taking an action to restrict said message based on said defining.
6. A method as in claim 5, wherein said comparing also comprises determining hyperlinks within said electronic message, and comprises comparing said hyperlinks with hyperlinks within said list.
7. A computer product, comprising a processor and memory, and executable instructions that are adapted to be executed to implement a filter for an e-mail program, said product comprising:
- an email receiving part that receives information indicative of emails that have been received and determines domain information from a sender of one of said emails;
- a storage part that stores information indicative of emails that are known to represent spam, said storage part storing domain information for multiple of said emails that are known to be spam; and
- a comparing part, that compares said received emails to said known spam emails, and determines that a received email represents spam responsive to said domain information in said storage part matching to said domain information from said email receiving part.
8. A product as in claim 7, wherein said comparing part determines a received email as being more likely to represent spam when said domain information matches to an uncommon domain, and as being less likely to represent spam when said domain information matches to a less common domain.
9. A product as in claim 8, where said comparing part determines a number of hits to the domain to determine whether the domain is a common domain.
10. A product as in claim 7, wherein said comparing part also determines hyperlinks within said electronic message, and compares said hyperlinks with hyperlinks within said list.
11. A program as in claim 7, further comprising a display output which displays a likelihood of spam coefficient which indicates a numerical percentage likelihood that message specific email represents spam.
12. A computer product, comprising a processor and memory, and executable instructions that are adapted to be executed to implement a filter for an e-mail program, said product comprising:
- an email receiving part that receives information indicative of emails that have been received and determines information about said emails, one part of said information including information about hyperlinks within said emails that have been received;
- a storage part that stores information indicative of emails that are known to represent spam, said storage part storing information about hyperlinks that are within at least one of said emails that are known to be spam; and
- a comparing part, that compares said received emails to said known spam emails, and determines that a received email represents spam responsive to said hyperlinks in said storage part matching to said hyperlinks from said email receiving part.
13. A product as in claim 12, wherein said comparing part also determines that a received email represents spam responsive to said domain information in said storage part matching to said domain information from said email receiving part.
14. A product as in claim 13, wherein said comparing part determines a received email as being more likely to represent spam when said domain information matches to an uncommon domain, and as being less likely to represent spam when said domain information matches to a less common domain.
15. A product as in claim 13, where said comparing part determines a number of hits to the domain to determine whether the domain is a common domain.
16. A program as in claim 12, further comprising a display output which displays a likelihood of spam coefficient which indicates a numerical percentage likelihood that message specific email represents spam.
Type: Application
Filed: Feb 19, 2010
Publication Date: Jun 17, 2010
Applicant: HARRIS TECHNOLOGY, LLC (Rancho Santa Fe, CA)
Inventor: Scott C. Harris (Rancho Santa Fe, CA)
Application Number: 12/708,681
International Classification: G06F 15/16 (20060101); G06F 17/30 (20060101);