Patents by Inventor Joshua Goodman

Joshua Goodman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Exponential priors for maximum entropy models

Publication number: 20050165580

Abstract: The subject invention provides for systems and methods that facilitate optimizing one or more sets of training data by utilizing an Exponential distribution as the prior on one or more parameters in connection with a maximum entropy (maxent) model to mitigate overfitting. Maxent is also known as logistic regression. More specifically, the systems and methods can facilitate optimizing probabilities that are assigned to the training data for later use in machine learning processes, for example. In practice, training data can be assigned their respective weights and then a probability distribution can be assigned to those weights.

Type: Application

Filed: January 28, 2004

Publication date: July 28, 2005

Inventor: Joshua Goodman
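
The abstract above does not give the training objective, so the following is only a sketch under standard assumptions: a conditional maxent model with feature functions f_i and weights λ_i, and an Exponential prior with a hypothetical rate α_i on each weight. Because the Exponential density α_i·exp(-α_i·λ_i) is defined only for λ_i ≥ 0, maximum a posteriori training would then amount to maximizing the log-likelihood minus a linear penalty on the weights:

```latex
P_\lambda(y \mid x) = \frac{\exp\left(\sum_i \lambda_i f_i(x, y)\right)}{\sum_{y'} \exp\left(\sum_i \lambda_i f_i(x, y')\right)}
\qquad
\hat{\lambda} = \arg\max_{\lambda \ge 0} \; \sum_{j} \log P_\lambda(y_j \mid x_j) \; - \; \sum_i \alpha_i \lambda_i
```

The symbols f_i, α_i, and the training pairs (x_j, y_j) are illustrative names and do not appear in the listing.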

Order-based human interactive proofs (HIPs) and automatic difficulty rating of HIPs

Publication number: 20050066201

Abstract: The present invention involves a system and method that facilitate identifying human interaction by utilizing HIPs such as order-based HIPs and determining a difficulty rating of any type of HIPs in an automated fashion. Order-based HIPs require a user to identify elements in the sequence as well as to identify a correct order of the elements in the sequence. The invention involves presenting a user with at least two HIPs such that the HIP can be of known and/or unknown difficulty. A user that correctly answers the HIP of known difficulty gains access to the HIP-controlled resource, action or application. The user's response to the HIP of unknown difficulty can then be examined and employed to determine whether that HIP is too difficult for humans to solve. Alternatively, at least one HIP can be presented. Difficulty of individual HIP parameters can also be determined.

Type: Application

Filed: September 23, 2003

Publication date: March 24, 2005

Inventors: Joshua Goodman, Robert Rounthwaite

Prevention of outgoing spam

Publication number: 20050021649

Abstract: The subject invention provides for a system and method that facilitates detecting and preventing spam in a variety of networked communication environments. In particular, the invention provides several techniques for monitoring outgoing communications to identify potential spammers. Identification of potential spammers can be accomplished at least in part by a detection component that monitors per sender at least one of volume of outgoing messages, volume of recipients, and/or rate of outgoing messages. In addition, outgoing messages can be scored based at least in part on their content. The scores can be added per message per sender and if the total score(s) per message or per sender exceeds some threshold, then further action can be taken to verify whether the potential spammer is a spammer. Such actions include human-inspecting a sample of the messages, sending challenges to the account, sending a legal notice to warn potential spammers and/or shutting down the account.

Type: Application

Filed: June 20, 2003

Publication date: January 27, 2005

Inventors: Joshua Goodman, Robert Rounthwaite, Eliot Gillum

Origination/destination features and lists for spam prevention

Publication number: 20050022008

Abstract: The present invention involves a system and method that facilitate extracting data from messages for spam filtering. The extracted data can be in the form of features, which can be employed in connection with machine learning systems to build improved filters. Data associated with origination information as well as other information embedded in the body of the message that allows a recipient of the message to contact and/or respond to the sender of the message can be extracted as features. The features, or a subset thereof, can be normalized and/or deobfuscated prior to being employed as features of the machine learning systems. The (deobfuscated) features can be employed to populate a plurality of feature lists that facilitate spam detection and prevention. Exemplary features include an email address, an IP address, a URL, an embedded image pointing to a URL, and/or portions thereof.

Type: Application

Filed: June 4, 2003

Publication date: January 27, 2005

Inventors: Joshua Goodman, Robert Rounthwaite, Daniel Gwozdz, John Mehr, Nathan Howell, Micah Rupersburg, Bryan Starbuck

Advanced URL and IP features

Publication number: 20050022031

Abstract: Disclosed are systems and methods that facilitate spam detection and prevention at least in part by building or training filters using advanced IP address and/or URL features in connection with machine learning techniques. A variety of advanced IP address related features can be generated from performing a reverse IP lookup. Similarly, many different advanced URL based features can be created from analyzing at least a portion of any one URL detected in a message.

Type: Application

Filed: May 28, 2004

Publication date: January 27, 2005

Applicant: Microsoft Corporation

Inventors: Joshua Goodman, Robert Rounthwaite, Geoffrey Hulten, John Deurbrouck, Manav Mishra, Anthony Penta

Obfuscation of spam filter

Publication number: 20050015454

Abstract: The subject invention provides systems and methods that facilitate obfuscating a spam filtering system to hinder reverse engineering of the spam filters and/or to mitigate spammers from finding a message that consistently gets through the spam filters almost every time. The system includes a randomization component that randomizes a message score before the message is classified as spam or non-spam so as to obscure the functionality of the spam filter. Randomizing the message score can be accomplished in part by adding a random number or pseudo-random number to the message score before it is classified as spam or non-spam. The number added thereto can vary depending on at least one of several types of input such as time, user, message content, hash of message content, and hash of particularly important features of the message, for example. Alternatively, multiple spam filters can be deployed rather than a single best spam filter.

Type: Application

Filed: June 20, 2003

Publication date: January 20, 2005

Inventors: Joshua Goodman, Robert Rounthwaite, John Platt
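
The score randomization described in the abstract above might look roughly like the following Python sketch. It takes one of the listed inputs, a hash of the message content, and turns it into a small pseudo-random offset that is added to the score before classification. The function names, the `spread` width, and the 0.5 threshold are hypothetical choices for illustration and are not taken from the application.

```python
import hashlib


def randomized_score(score: float, message: str, spread: float = 0.05) -> float:
    """Add a pseudo-random offset, derived from a hash of the message, to a score."""
    digest = hashlib.sha256(message.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a fraction in [0, 1).
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    # Center the offset on zero so it lies in [-spread, +spread).
    offset = (2.0 * fraction - 1.0) * spread
    return score + offset


def classify(score: float, message: str, threshold: float = 0.5) -> str:
    """Classify using the randomized score instead of the raw filter score."""
    if randomized_score(score, message) >= threshold:
        return "spam"
    return "non-spam"
```

Because the offset depends on the message content, resubmitting the same message gives the same result, while small edits to a borderline message shift the effective threshold unpredictably. That is one plausible reading of how the randomization obscures the filter's decision boundary.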

System and method for providing an audio challenge to distinguish a human from a computer

Publication number: 20040254793

Abstract: An “audio challenger” operates by first defining a library of a finite number of discrete audio objects including spoken sounds, such as, for example, individual digits, letters, numbers, words, etc., or combinations of two or more digits, letters, numbers, or words. The spoken sounds are either automatically generated by a computer, or recorded from one or more actual spoken voices. Given this library of audio objects, the audio challenger automatically selects one or more audio objects from the library and concatenates the objects into an audio string that is then automatically processed to add one or more distortions to create a “challenge string.” The distorted challenge string is then presented to an unknown party for identification. If the unknown party correctly identifies the challenge string, then the unknown party is deemed to be a human operator. Otherwise, the unknown party is deemed to be another computer.

Type: Application

Filed: June 12, 2003

Publication date: December 16, 2004

Inventors: Cormac Herley, James Garnet Droppo, Joshua Goodman, Josh Benaloh, Iulian Calinov, Jeff Steinbok

Cluster and pruning-based language model compression

Patent number: 6782357

Abstract: Cluster- and pruning-based language model compression is disclosed. In one embodiment, a language model is first clustered, such as by using predictive clustering. The language model after clustering has a larger size than it did before clustering. The language model is then pruned, such as by using entropy-based techniques, such as Rosenfeld pruning, or by using Stolcke pruning or count-cutoff techniques. In one particular embodiment, a word language model is first predictively clustered by a technique described as P(Z|xy)×P(z|xyZ), where a lower-case letter refers to a word, and an upper-case letter refers to a cluster in which the word resides.

Type: Grant

Filed: May 4, 2000

Date of Patent: August 24, 2004

Assignee: Microsoft Corporation

Inventors: Joshua Goodman, Jianfeng Gao

Method and apparatus for fast machine training

Patent number: 6697769

Abstract: A method and apparatus are provided that reduce the training time associated with machine learning systems whose training time is proportional to the number of outputs being trained. Under embodiments of the invention, the number of outputs to be trained is reduced by dividing the objects to be modeled into classes. This produces at least two sets of model parameters. At least one set describes some aspect of the classes given some context, and at least one other set of parameters describes some aspect of the objects given a class and the context. Thus, instead of training a system with a large number of outputs, corresponding to all of the objects, the present invention trains at least two models, each of which has a much smaller number of outputs.

Type: Grant

Filed: January 21, 2000

Date of Patent: February 24, 2004

Assignee: Microsoft Corporation

Inventors: Joshua Goodman, Robert Moore
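
The two-model split described in the abstract above can be illustrated with a toy Python sketch. It assumes, as one common way to combine the two parameter sets, that the probability of an object given a context is the product of the class probability given the context and the object probability given the class and context. The product rule, the lookup tables, and all names here are illustrative assumptions, not details from the patent.

```python
def object_probability(obj, context, class_of, p_class, p_obj_in_class):
    """P(obj | context) = P(class | context) * P(obj | class, context)."""
    cls = class_of[obj]
    return p_class[context][cls] * p_obj_in_class[(context, cls)][obj]


# Toy tables: four objects divided into two classes (all values hypothetical).
class_of = {"monday": "DAY", "tuesday": "DAY", "january": "MONTH", "march": "MONTH"}
p_class = {"on": {"DAY": 0.75, "MONTH": 0.25}}
p_obj_in_class = {
    ("on", "DAY"): {"monday": 0.5, "tuesday": 0.5},
    ("on", "MONTH"): {"january": 0.5, "march": 0.5},
}

# The class model has 2 outputs and each within-class model has 2 outputs,
# rather than a single model with 4 outputs.
total = sum(
    object_probability(obj, "on", class_of, p_class, p_obj_in_class)
    for obj in class_of
)
```

With N objects split into C classes of roughly equal size, the class model has C outputs and each within-class model has about N/C outputs, which is where the reduction in outputs per model comes from.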

Fuzzy keyboard

Patent number: 6654733

Abstract: Fuzzy keyboards, to determine a most-likely-to-be-intended keystroke or keystrokes, are disclosed. In one embodiment, a method adds each of one or more keys to each of a current list of key sequence hypotheses, to create a new list of key sequence hypotheses. The method determines a likelihood probability for each hypothesis in the new list, and removes any hypothesis failing to satisfy any of one or more thresholds. The most likely key sequence of the new list may then be displayed. Some embodiments of the invention relate specifically to soft keyboards, while other embodiments relate specifically to real, physical and hard keyboards.

Type: Grant

Filed: January 18, 2000

Date of Patent: November 25, 2003

Assignee: Microsoft Corporation

Inventors: Joshua Goodman, Daniel Venolia, Xuedong Huang
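
The hypothesis-list update described in the abstract above can be sketched in Python as follows. Each candidate key for the latest keystroke is appended to every current hypothesis, each new hypothesis is scored, and hypotheses that fail a threshold are removed. Scoring a hypothesis as the product of per-key probabilities, and the two thresholds used here (a minimum probability and a maximum list size), are simplifying assumptions for illustration.

```python
def extend_hypotheses(hypotheses, key_likelihoods, min_prob=0.1, max_size=5):
    """Extend each (sequence, probability) hypothesis with each candidate key.

    key_likelihoods maps each candidate key for the latest keystroke to the
    probability that it was the intended key.
    """
    extended = [
        (sequence + key, probability * key_prob)
        for sequence, probability in hypotheses
        for key, key_prob in key_likelihoods.items()
    ]
    # Remove hypotheses that fail the probability threshold.
    kept = [h for h in extended if h[1] >= min_prob]
    # Keep only the most likely hypotheses, best first.
    kept.sort(key=lambda h: h[1], reverse=True)
    return kept[:max_size]


# Two ambiguous keystrokes, each with two candidate keys.
hypotheses = [("", 1.0)]
for candidates in ({"t": 0.8, "r": 0.2}, {"h": 0.6, "g": 0.4}):
    hypotheses = extend_hypotheses(hypotheses, candidates)

most_likely = hypotheses[0][0]  # "th"
```

A fuller version would presumably also weigh in a language model when scoring each sequence, but the abstract does not say how the likelihood is computed, so that part is left out.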

Caching techniques for streaming media

Publication number: 20030217113

Abstract: A streaming media caching mechanism and cache manager efficiently establish and maintain the contents of a streaming media cache for use in serving streaming media requests from cache rather than from an original data source when appropriate. The cost of caching is incurred only when the benefits of caching are likely to be experienced. The caching mechanism and cache manager evaluate the request count for each requested URL to determine whether the URL represents a cache candidate, and further analyze the URL request rate to determine whether the content associated with the URL will be cached. In an embodiment, the streaming media cache is maintained with a predetermined amount of reserve capacity rather than being filled to capacity whenever possible.

Type: Application

Filed: April 8, 2002

Publication date: November 20, 2003

Applicant: Microsoft Corporation

Inventors: Ariel Katz, Yifat Sagiv, Guy Friedel, David E. Heckerman, John R. Douceur, Joshua Goodman
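
The admission logic described in the abstract above can be approximated by a small Python function. A URL first has to reach a request count to become a cache candidate, then its request rate decides whether it is cached, and the cache is never filled past a reserve margin. The parameter names, default thresholds, and the simple "requests divided by elapsed time" rate estimate are all hypothetical.

```python
def should_cache(request_times, size, used, capacity,
                 min_requests=3, min_rate=0.5, reserve_fraction=0.1):
    """Decide whether content for one URL should be admitted to the cache.

    request_times: ascending timestamps (seconds) of requests for the URL.
    size, used, capacity: content size, bytes in use, and total cache bytes.
    """
    # Step 1: the URL is a cache candidate only after enough requests.
    if len(request_times) < min_requests:
        return False
    # Step 2: the request rate decides whether the candidate is cached.
    elapsed = request_times[-1] - request_times[0]
    rate = len(request_times) / elapsed if elapsed > 0 else float("inf")
    if rate < min_rate:
        return False
    # Step 3: keep a reserve instead of filling the cache to capacity.
    usable = capacity * (1.0 - reserve_fraction)
    return used + size <= usable
```

For example, `should_cache([0.0, 1.0, 2.0], size=10, used=0, capacity=100)` admits the content, while the same request pattern with `used=85` is rejected because admitting it would eat into the reserve.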

Predictive keyboard

Patent number: 6573844

Abstract: Predictive keyboards, such as predictive soft keyboards, are disclosed. In one embodiment, a computer-implemented method predicts at least one key to be entered next within a sequence of keys. The method displays a soft keyboard where the predicted keys are displayed on the soft keyboard differently than the other keys on the keyboard. For example, the predicted keys may be larger in size on the soft keyboard as compared to the other keys. This makes the predicted keys more easily typed by a user as compared to the other keys.

Type: Grant

Filed: January 18, 2000

Date of Patent: June 3, 2003

Assignee: Microsoft Corporation

Inventors: Daniel Venolia, Joshua Goodman, Xuedong Huang, Hsiao-Wuen Hon
