Patents by Inventor Kumar Chellapilla

Kumar Chellapilla has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

WEB SEARCHING

Publication number: 20090248657

Abstract: Mislabeled URLs are identified and corrected based upon a click relevance ranking computed from user data comprising user click information. The click relevance ranking is formed by applying a set of relevance ordering rules to user log data aggregated by query and URL and by mapping the results of the relevance ordering rules into a linear ordering. For a given query, the aggregated user log data comprises a relative total number of impressions, a relative total number of clicks received, and a rank associated with the query/URL pair at the time of the total number of impressions and total number of clicks received. The click relevance ranking is used to identify and correct mislabeled query/URL pairs of other rankings according to a number of disclosed methods.

Type: Application

Filed: March 27, 2008

Publication date: October 1, 2009

Applicant: MICROSOFT CORPORATION

Inventors: Kumar Chellapilla, Anton Mityagin, Xuanhui Wang
Data partitioning via bucketing bloom filters

Publication number: 20080307189

Abstract: Multiple Bloom filters are generated to partition data between first and second disjoint data sets of elements. Each element in the first data set is assigned to a bucket of a first set of buckets, and each element in the second data set is assigned to a bucket of a second set of buckets. A Bloom filter is generated for each bucket of the first set of buckets. The Bloom filter generated for a bucket indicates that each element assigned to that bucket is part of the first data set, and that each element assigned to a corresponding bucket of the second set of buckets is not part of the first data set. Additionally, a Bloom filter corresponding to a subsequently received element can be determined and used to identify whether that subsequently received element is part of the first data set or the second data set.
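
The bucketing scheme in the abstract above can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the method claimed in the application: the names `BloomFilter`, `bucket_of`, `build_filters`, and `in_first_set`, the use of SHA-256 as the hash, and the strategy of retrying seeds until a bucket's filter rejects every second-set element in the corresponding bucket.

```python
import hashlib


def _hash(item: str, salt: str) -> int:
    """Deterministic 64-bit hash of item under a salt (illustrative choice)."""
    digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


class BloomFilter:
    """Small Bloom filter backed by a Python int used as a bit array."""

    def __init__(self, num_bits: int, num_hashes: int, seed: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.seed = seed
        self.bits = 0

    def _positions(self, item: str):
        return [_hash(item, f"{self.seed}-{i}") % self.num_bits
                for i in range(self.num_hashes)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def bucket_of(item: str, num_buckets: int) -> int:
    """Assign an element to a bucket; both data sets use the same rule."""
    return _hash(item, "bucket") % num_buckets


def build_filters(first, second, num_buckets=4, num_bits=256,
                  num_hashes=3, max_seeds=1000):
    """Build one filter per bucket that accepts every first-set element in
    the bucket and rejects every second-set element in the same bucket."""
    first_buckets = [[] for _ in range(num_buckets)]
    second_buckets = [[] for _ in range(num_buckets)]
    for x in first:
        first_buckets[bucket_of(x, num_buckets)].append(x)
    for y in second:
        second_buckets[bucket_of(y, num_buckets)].append(y)

    filters = []
    for members, non_members in zip(first_buckets, second_buckets):
        # Assumed strategy: try seeds until no non-member is a false positive.
        for seed in range(max_seeds):
            bf = BloomFilter(num_bits, num_hashes, seed)
            for x in members:
                bf.add(x)
            if not any(y in bf for y in non_members):
                filters.append(bf)
                break
        else:
            raise RuntimeError("no collision-free filter found; increase num_bits")
    return filters


def in_first_set(item: str, filters) -> bool:
    """Pick the filter for the item's bucket and test membership."""
    return item in filters[bucket_of(item, len(filters))]
```

Under these assumptions, an element known to belong to one of the two sets is classified exactly: `in_first_set` returns True for first-set elements and False for second-set elements. Elements outside both sets can still produce false positives, as with any Bloom filter.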

Type: Application

Filed: June 11, 2007

Publication date: December 11, 2008

Applicant: Microsoft Corporation

Inventors: Anton Mityagin, Kumar Chellapilla, Denis Charles
Extracting link spam using random walks and spam seeds

Publication number: 20080270549

Abstract: Architecture for extracting link spam communities when given one or more members of the community. A link spam extraction algorithm is provided that takes as input link spam seeds and extracts other nearby link spam through a biased local random walk around the seed(s). The seed set is provided by a user (or an automated algorithm scrubbed by a human) which the algorithm uses to simulate a random walk on a web graph. The random walk can be biased to explore a local neighborhood around the seed set through use of decay probabilities. Truncation can be used to retain only the most frequently visited nodes. After termination, the nodes are sorted in decreasing order of final probabilities and presented to the user. Human judges need only make decisions at the spam community level, thereby limiting involvement, and human input can be scaled by several orders of magnitude.
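
A rough Python sketch of the kind of seed-biased local walk the abstract above describes follows. It is only one possible reading: the function name `extract_spam_community`, the adjacency-dict graph format, and the interpretation of "decay probabilities" as a per-step chance of jumping back to the seed set are assumptions, not details taken from the application.

```python
def extract_spam_community(graph, seeds, decay=0.85, steps=20, keep=100):
    """Simulate a seed-biased random walk and rank nodes by visit probability.

    graph: dict mapping a node to the list of nodes it links to (assumed format).
    seeds: known spam nodes where the walk starts and restarts.
    decay: probability of following a link; the rest returns to the seeds.
    keep:  truncation size, only the most probable nodes survive each step.
    """
    restart = {s: 1.0 / len(seeds) for s in seeds}
    prob = dict(restart)
    for _ in range(steps):
        nxt = {}
        for node, p in prob.items():
            out_links = graph.get(node, [])
            for target in out_links:
                nxt[target] = nxt.get(target, 0.0) + decay * p / len(out_links)
        # Mass not passed along links (decay loss, dead ends) returns to the seeds.
        leaked = 1.0 - sum(nxt.values())
        for s, w in restart.items():
            nxt[s] = nxt.get(s, 0.0) + leaked * w
        # Truncation: retain only the most frequently visited nodes, then renormalize.
        top = sorted(nxt.items(), key=lambda kv: kv[1], reverse=True)[:keep]
        total = sum(p for _, p in top)
        prob = {n: p / total for n, p in top}
    # Present nodes in decreasing order of final probability.
    return sorted(prob.items(), key=lambda kv: kv[1], reverse=True)
```

The returned list is what a human judge would review in this sketch: nodes close to the seeds in the link graph rank first, and a small `keep` value bounds the work per step.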

Type: Application

Filed: April 26, 2007

Publication date: October 30, 2008

Applicant: Microsoft Corporation

Inventors: Kumar Chellapilla, Baoning Wu
ROBUST PERSONALIZATION THROUGH BIASED REGULARIZATION

Publication number: 20070239450

Abstract: The subject disclosure pertains to systems and methods for personalization of a recognizer. In general, recognizers can be used to classify input data. During personalization, a recognizer is provided with samples specific to a user, entity or format to improve performance for the specific user, entity or format. Biased regularization can be utilized during personalization to maintain recognizer performance for non-user specific input. In one aspect, regularization can be biased to the original parameters of the recognizer, such that the recognizer is not modified excessively during personalization.
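
To make the idea of biasing regularization toward the original parameters concrete, here is a small Python sketch. It assumes a logistic-regression recognizer trained by gradient descent, with a penalty of (lam / 2) * ||w - w0||^2 pulling the personalized weights `w` back toward the original weights `w0`. The model, the function name `personalize`, and the hyperparameters are illustrative assumptions; the abstract does not specify them.

```python
import math


def personalize(w0, samples, lam=1.0, lr=0.05, epochs=500):
    """Fit user-specific samples while staying close to the original weights.

    w0:      original recognizer weights (list of floats).
    samples: list of (features, label) pairs with label 0 or 1.
    lam:     regularization strength; larger values keep w closer to w0.
    """
    w = list(w0)
    for _ in range(epochs):
        # Gradient of the biased penalty (lam / 2) * ||w - w0||^2.
        grad = [lam * (wi - w0i) for wi, w0i in zip(w, w0)]
        # Gradient of the mean logistic loss on the user's samples.
        for x, y in samples:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for j, xj in enumerate(x):
                grad[j] += (p - y) * xj / len(samples)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w
```

With ordinary L2 regularization the penalty would pull weights toward zero. Centering it on `w0` instead means that a handful of user samples can only move the recognizer a limited distance from its original behavior.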

Type: Application

Filed: April 6, 2006

Publication date: October 11, 2007

Applicant: Microsoft Corporation

Inventors: Wolf Kienzle, Kumar Chellapilla
Robust indexing and retrieval of electronic ink

Publication number: 20070230791

Abstract: A unique system and method that facilitates indexing and retrieving electronic ink objects with improved efficiency and accuracy is provided. Handwritten words or characters are mapped to a low dimension through a process of segmentation, stroke classification using a neural network, and projection along directions found using OPCA, for example. The employment of OPCA makes these low dimensional representations robust to handwriting variations or noise. Each handwritten word or set of characters is stored along with a neighborhood hyperrectangle that represents word variations. Redundant bit vectors are used to index the hyperrectangles for efficient storage and retrieval. Ink-based queries can be submitted in order to retrieve at least one ink object. To do so, the ink query is processed to determine its query point which is represented by a (query) hyperrectangle. A data store can be searched for any hyperrectangles that match the query hyperrectangle.

Type: Application

Filed: April 4, 2006

Publication date: October 4, 2007

Applicant: Microsoft Corporation

Inventors: Kumar Chellapilla, John Platt
Allograph based writer adaptation for handwritten character recognition

Publication number: 20070140561

Abstract: The claimed subject matter provides a system and/or a method that facilitates analyzing and/or recognizing a handwritten character. An interface component can receive at least one handwritten character. A personalization component can train a classifier based on an allograph related to a handwriting style to provide handwriting recognition for the at least one handwritten character. In addition, the personalization component can employ any suitable combiner to provide optimized recognition.

Type: Application

Filed: December 19, 2005

Publication date: June 21, 2007

Applicant: Microsoft Corporation

Inventors: Ahmad Abdulkader, Kumar Chellapilla, Patrice Simard
Logical structure and layout based offline character recognition

Publication number: 20070133883

Abstract: A method and system for implementing character recognition is described herein. An input character is received. The input character is composed of one or more logical structures in a particular layout. The layout of the one or more logical structures is identified. One or more of a plurality of classifiers are selected based on the layout of the one or more logical structures in the input character. The entire character is input into the selected classifiers. The selected classifiers classify the logical structures. The outputs from the selected classifiers are then combined to form an output character vector.

Type: Application

Filed: December 12, 2005

Publication date: June 14, 2007

Applicant: Microsoft Corporation

Inventors: Kumar Chellapilla, Patrice Simard
Optimization of cascaded classifiers

Publication number: 20070112701

Abstract: An optimization system comprises a reception component that receives a cascade of classifiers. The system further includes an optimization component communicatively coupled to the reception component, the optimization component receives input relating to one of speed and accuracy of the cascade of classifiers and optimizes the cascade of classifiers based at least in part upon the received input and confidence scores associated with each classifier within the cascade of classifiers. The optimization component can utilize at least one of a steepest descent algorithm, a dynamic programming algorithm, a simulated annealing algorithm, and a branch and bound variant of a depth first search algorithm in connection with optimizing the cascade of classifiers.

Type: Application

Filed: August 15, 2005

Publication date: May 17, 2007

Applicant: Microsoft Corporation

Inventors: Kumar Chellapilla, Patrice Simard, Michael Shilman
Unfolded convolution for fast feature extraction

Publication number: 20070086655

Abstract: Systems and methods are described that facilitate performing feature extraction across multiple received input features to reduce computational overhead associated with feature processing related to, for instance, optical character recognition. Input feature information can be unfolded and concatenated to generate an aggregated input matrix, which can be convolved with a kernel matrix to produce output feature information for multiple output features concurrently.
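
The unfold-and-multiply idea in the abstract above resembles what is now commonly called im2col. The pure-Python sketch below is an illustration under stated assumptions, not the claimed implementation: the function names are hypothetical, inputs are small nested lists, there is no padding or stride, and the kernel is applied without flipping (cross-correlation, as is usual in convolutional networks).

```python
def unfold(feature, k):
    """Turn each k x k patch of a 2D feature map into one flat row."""
    h, w = len(feature), len(feature[0])
    return [[feature[i + di][j + dj] for di in range(k) for dj in range(k)]
            for i in range(h - k + 1) for j in range(w - k + 1)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def unfolded_convolution(features, kernels, k):
    """Compute all output features with a single matrix product.

    features: list of 2D input feature maps, all the same size.
    kernels:  kernels[o][c] is the k x k kernel linking input c to output o.
    Returns a matrix with one row per output position and one column per
    output feature.
    """
    # Unfold every input feature, then concatenate the patches position by position.
    unfolded = [unfold(f, k) for f in features]
    input_matrix = [sum(rows, []) for rows in zip(*unfolded)]
    # Each column of the kernel matrix holds the flattened kernels of one output.
    kernel_cols = [[v for c in range(len(features))
                    for row in kernels[o][c] for v in row]
                   for o in range(len(kernels))]
    kernel_matrix = [list(r) for r in zip(*kernel_cols)]
    return matmul(input_matrix, kernel_matrix)
```

The unfolding duplicates overlapping input values, trading memory for a single large matrix product that computes every output feature at once. In practice that product would be handed to an optimized matrix-multiply routine rather than the naive `matmul` shown here.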

Type: Application

Filed: October 14, 2005

Publication date: April 19, 2007

Applicant: Microsoft Corporation

Inventors: Patrice Simard, David Steinkraus, Kumar Chellapilla
Systems and methods that facilitate improved display of electronic documents

Publication number: 20060271846

Abstract: A computer-implemented word processing system comprises an interface component that receives a features vector associated with an electronic document. An analysis component communicatively coupled to the interface component analyzes the features vector and determines a viewing mode in which to display the electronic document. In accordance with one aspect of the subject invention, the viewing mode can be one of a conventional viewing mode and a viewing mode associated with enhanced readability.

Type: Application

Filed: May 24, 2005

Publication date: November 30, 2006

Applicant: Microsoft Corporation

Inventors: Radoslav Nickolov, Kumar Chellapilla, David Bargeron, Patrice Simard, Paul Viola
Scalable hash-based character recognition

Publication number: 20060171588

Abstract: The subject invention leverages a scalable character glyph hash table to provide an efficient means to identify print characters where the character glyphs are identical over independent presentation. The hash table allows for quick determinations of glyph meta data as, for example, a pre-filter to traditional OCR techniques. The hash table can be trained for a particular environment, user, language, character set (e.g., alphabet), document type, and/or specific document and the like. This permits substantial flexibility and increases in speed in identifying unknown glyphs. The hash table itself can be composed of single or multiple tables that have a specific optimization purpose. In one instance of the subject invention, traditional OCR techniques can be utilized to update the hash tables as needed based on glyph frequency. This keeps the hash tables from growing by limiting updates that reduce its performance, while adding frequently determined glyphs to increase the pre-filter performance.
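
As a hedged illustration of a hash table acting as a pre-filter in front of slower OCR, consider the following Python sketch. The class name `GlyphHashTable`, the `slow_ocr` callback, the SHA-1 key, and the `min_count` admission rule are all assumptions chosen for the example; the abstract does not describe the table layout or update policy at this level of detail.

```python
import hashlib
from collections import Counter


class GlyphHashTable:
    """Exact-match glyph lookup with fallback to a slower recognizer."""

    def __init__(self, slow_ocr, min_count=2):
        self.slow_ocr = slow_ocr      # hypothetical callable: bytes -> label
        self.min_count = min_count    # misses required before caching a glyph
        self.table = {}               # glyph hash -> label
        self.miss_counts = Counter()  # glyph hash -> number of table misses

    def recognize(self, bitmap: bytes) -> str:
        key = hashlib.sha1(bitmap).hexdigest()
        if key in self.table:
            return self.table[key]    # fast path: identical glyph seen before
        label = self.slow_ocr(bitmap)
        self.miss_counts[key] += 1
        # Only frequently seen glyphs are added, which limits table growth.
        if self.miss_counts[key] >= self.min_count:
            self.table[key] = label
        return label
```

Because the key is a hash of the exact bitmap, this only helps when glyphs are pixel-identical across occurrences, which matches the abstract's premise about print characters. Rare glyphs never enter the table and always go to the fallback recognizer.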

Type: Application

Filed: January 28, 2005

Publication date: August 3, 2006

Applicant: Microsoft Corporation

Inventors: Kumar Chellapilla, Patrice Simard, Radoslav Nickolov
Spatial recognition and grouping of text and graphics

Publication number: 20060045337

Abstract: The present invention leverages spatial relationships to provide a systematic means to recognize text and/or graphics. This allows augmentation of a sketched shape with its symbolic meaning, enabling numerous features including smart editing, beautification, and interactive simulation of visual languages. The spatial recognition method obtains a search-based optimization over a large space of possible groupings from simultaneously grouped and recognized sketched shapes. The optimization utilizes a classifier that assigns a class label to a collection of strokes. The overall grouping optimization assumes the properties of the classifier so that if the classifier is scale and rotation invariant the optimization will be as well. Instances of the present invention employ a variant of AdaBoost to facilitate in recognizing/classifying symbols. Instances of the present invention employ dynamic programming and/or A-star search to perform optimization.

Type: Application

Filed: August 26, 2004

Publication date: March 2, 2006

Applicant: Microsoft Corporation

Inventors: Michael Shilman, Paul Viola, Kumar Chellapilla
Segmentation based content alteration techniques

Publication number: 20050246775

Abstract: The subject invention provides a unique system and method that facilitates creating HIP challenges (HIPs) that can be readily segmented and solved by human users but that are too difficult for non-human users. More specifically, the system and method utilize a variety of unique alteration techniques that are segmentation-based. For example, the system and method employ thicker arcs or occlusions that do not intersect characters already placed in the HIP. The thickness of the arc can be measured or determined by the thickness of the characters in the HIP. In addition to increasing the thickness, the arcs can be lengthened because longer arcs tend to resemble pieces of characters and may be harder to erode. Usability maps can be generated and used to selectively place clutter or occlusions and to selectively warp characters or the character sequence to facilitate human recognition of the characters.

Type: Application

Filed: January 31, 2005

Publication date: November 3, 2005

Applicant: Microsoft Corporation

Inventors: Kumar Chellapilla, Patrice Simard
High performance content alteration architecture and techniques

Publication number: 20050229251

Abstract: The present invention provides a unique system and method that facilitates obtaining high performance and more secure HIPs. More specifically, the HIPs can be generated in part by caching pre-rendered characters and/or pre-rendered arcs as bitmaps in binary form and then selecting any number of the characters and/or arcs randomly to form a HIP sequence. The warp field can be pre-computed and converted to integers in binary form and can include a plurality of sub-regions. The warp field can be cached as well. Any one sub-region can be retrieved from the warp field cache and mapped to the HIP sequence to warp the HIP. Thus, the pre-computed warp field can be used to warp multiple HIP sequences. The warping can occur in binary form and at a high resolution to mitigate reverse engineering. Following this, the warped HIP sequence can be down-sampled, and texture and/or color can be added to improve its appearance.

Type: Application

Filed: March 31, 2004

Publication date: October 13, 2005

Inventors: Kumar Chellapilla, Patrice Simard
