Patents by Inventor Christian Konig

Christian Konig has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Determination of landmarks

Patent number: 9189488

Abstract: Hash values corresponding to a file are processed in windows to determine a minimum hash value for each window. Each window may begin at a minimum hash value determined for a previous window and end after a fixed number of hash values. If a hash value is less than a threshold hash value, it is added to a buffer that is used to store the hash values in sorted order for a current window. If a hash value is greater than the threshold, it is added to another buffer whose hash values are not stored in sorted order. At the end of the current window, the minimum hash value in the first buffer is selected as the landmark for the window. If the first buffer is empty, then the hash values in the other buffer are sorted and the minimum hash value is selected as the landmark for the window.

Type: Grant

Filed: April 7, 2011

Date of Patent: November 17, 2015

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Mark S. Manasse, Arnd Christian König, Paul Adrian Oltean
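The two-buffer landmark selection described in the abstract above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `select_landmarks`, its parameters, and the choice to start the next window one position after the previous landmark are assumptions made for the example.

```python
import bisect

def select_landmarks(hashes, window_size, threshold):
    """Return the index of the minimum hash value in each window.

    Hypothetical sketch: values below `threshold` go into a buffer kept
    in sorted order; all other values go into an unsorted buffer that is
    sorted only if the first buffer ends up empty.
    """
    landmarks = []
    start = 0
    while start < len(hashes):
        end = min(start + window_size, len(hashes))
        small = []  # (hash, index) pairs below the threshold, kept sorted
        large = []  # remaining pairs, left unsorted
        for i in range(start, end):
            item = (hashes[i], i)
            if hashes[i] < threshold:
                bisect.insort(small, item)
            else:
                large.append(item)
        if small:
            _, idx = small[0]
        else:
            large.sort()  # fallback: sort only when nothing was small
            _, idx = large[0]
        landmarks.append(idx)
        # Assumption: the next window begins just after this landmark.
        start = idx + 1
    return landmarks
```

With a well-chosen threshold most windows contain at least one small value, so the unsorted buffer rarely needs sorting.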
Incremental visualization for structured data in an enterprise-level data store

Patent number: 8983936

Abstract: The subject disclosure is directed towards simulating query execution to provide incremental visualization for a global data set. A data store may be configured for searching at least a portion of a global data set being stored at an enterprise-level data store. In response to a user-issued query, partial query results are provided to a front-end interface for display to the user. The front-end interface also provides statistical information corresponding to the partial query results in relation to the global data set, which may be used to determine when a current set of query results becomes acceptable as a true/accurate estimate.

Type: Grant

Filed: April 4, 2012

Date of Patent: March 17, 2015

Assignee: Microsoft Corporation

Inventors: Danyel A. Fisher, Arnd Christian König, Steven M. Drucker
PREDICTING DATA COMPRESSIBILITY USING DATA ENTROPY ESTIMATION

Publication number: 20140244604

Abstract: The subject disclosure is directed towards predicting compressibility of a data block, and using the predicted compressibility in determining whether a data block, if compressed, will be sufficiently compressible to justify compression. In one aspect, data of the data block is processed to obtain an entropy estimate of the data block, e.g., based upon distinct value estimation. The compressibility prediction may be used in conjunction with a chunking mechanism of a data deduplication system.

Type: Application

Filed: February 28, 2013

Publication date: August 28, 2014

Applicant: MICROSOFT CORPORATION

Inventors: Paul Adrian Oltean, Cosmin A. Rusu, Arnd Christian König, Mark Steven Manasse, Jin Li, Sudipta Sengupta, Sanjeev Mehrotra
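As a rough illustration of entropy-based compressibility prediction, the sketch below computes the exact Shannon entropy of a block's byte histogram. Note that this is a swapped-in, simpler technique: the abstract mentions an estimate based on distinct value estimation, which is not reproduced here. The function names and the 6.0 bits-per-byte cutoff are assumptions.

```python
import math
from collections import Counter

def entropy_bits_per_byte(block: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def worth_compressing(block: bytes, max_entropy: float = 6.0) -> bool:
    """Hypothetical policy: compress only blocks whose entropy is low enough.

    The 6.0 cutoff is an illustrative assumption, not a value from the source.
    """
    return entropy_bits_per_byte(block) <= max_entropy
```

A block of uniformly distributed bytes scores close to 8 bits per byte and would be skipped, while repetitive data scores near 0.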
Click-through prediction for news queries

Patent number: 8719298

Abstract: Described is estimating whether an online search query is a news-related query, and if so, outputting news-related results in association with other search results returned in response to the query. The query is processed into features, including by accessing corpora that correspond to relatively current events, e.g., recently crawled from news and blog articles. A corpus of static reference data, such as an online encyclopedia, may be used to help determine whether the query is less likely to be about current events. Features include frequency-related data and context-related data corresponding to frequency and context information maintained in the corpora. Additional features may be obtained by processing text of the query itself, e.g., “query-only” features.

Type: Grant

Filed: May 21, 2009

Date of Patent: May 6, 2014

Assignee: Microsoft Corporation

Inventors: Arnd Christian Konig, Michael Gamon, Qiang Wu, Roger P. Menezes, Monwhea Jeng
Sponsored search data structure

Patent number: 8606627

Abstract: A system that facilitates selecting advertisements that match a search query is described herein. The system includes a search query receiver component that receives a search query including keywords. The system also includes a match component that uses an associative data structure to identify in the associative data structure one or more data nodes that are associated in the associative data structure with respective unique keys corresponding to respective one or more hashes of combinations of the keywords in the search query. For each identified data node, the match component selects advertisements associated with bid phrases stored in the identified data node that respectively only include keywords included in the search query.

Type: Grant

Filed: June 12, 2008

Date of Patent: December 10, 2013

Assignee: Microsoft Corporation

Inventors: Arnd Christian König, Martin Miroslavov Markov, Kenneth Ward Church
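The matching step in the abstract above (look up nodes by hashes of keyword combinations, then keep only bid phrases whose keywords all appear in the query) might look roughly like this. It is a simplified sketch under stated assumptions: each bid phrase is keyed by the hash of its full keyword set, every non-empty combination of query keywords is probed, and the names `build_ad_index` and `match_ads` are hypothetical.

```python
from itertools import combinations

def build_ad_index(ads):
    """Map hash(keyword set) -> list of (keyword set, ad id) entries.

    `ads` is a dict from bid phrase to ad id (an assumed input shape).
    """
    index = {}
    for phrase, ad_id in ads.items():
        words = frozenset(phrase.lower().split())
        index.setdefault(hash(words), []).append((words, ad_id))
    return index

def match_ads(query, index):
    """Return ads whose bid phrases contain only keywords from the query."""
    keywords = sorted(set(query.lower().split()))
    keyword_set = set(keywords)
    matched = set()
    # Probe the index with every non-empty combination of query keywords.
    # This is exponential in query length, so it only suits short queries.
    for size in range(1, len(keywords) + 1):
        for combo in combinations(keywords, size):
            for words, ad_id in index.get(hash(frozenset(combo)), []):
                if words <= keyword_set:  # re-check in case of hash collisions
                    matched.add(ad_id)
    return matched
```

For example, with bid phrases "cheap flights", "flights paris", and "hotel rome", the query "cheap flights to paris" matches the first two but not the third.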
Estimating document similarity using bit-strings

Patent number: 8594239

Abstract: Each of a plurality of documents is divided into samples. Small bit-strings are generated for selected samples from each of the documents and used to create a sketch for each document. Because the bit-strings are small (e.g., only one, two, or three bits in length), the generated sketches are smaller than the sketches generated using previous methods for generating sketches, and therefore use less storage space. The generated sketches are compared to determine documents that are near-duplicates of one another.

Type: Grant

Filed: February 21, 2011

Date of Patent: November 26, 2013

Assignee: Microsoft Corporation

Inventors: Mark S. Manasse, Arnd Christian König
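The idea of building sketches from very short bit-strings can be illustrated with a min-hash variant that keeps only the lowest few bits of each minimum. This is a sketch under assumptions, not the patented method: the hashing scheme, the function names, and the simple chance-collision correction in `estimate_similarity` are all choices made for the example.

```python
import hashlib

def _hash(seed: int, sample: str) -> int:
    """64-bit hash of a sample, salted with the seed."""
    digest = hashlib.blake2b(
        sample.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")

def bit_sketch(samples, num_hashes=128, bits=2):
    """One short bit-string per hash function: low `bits` of the minimum hash.

    `samples` must be a non-empty collection of strings.
    """
    mask = (1 << bits) - 1
    return [min(_hash(seed, s) for s in samples) & mask
            for seed in range(num_hashes)]

def estimate_similarity(sketch_a, sketch_b, bits=2):
    """Estimate set resemblance from the fraction of matching bit-strings.

    Unrelated minima still agree by chance with probability 2**-bits,
    so that rate is subtracted out (an approximate correction).
    """
    match_rate = sum(x == y for x, y in zip(sketch_a, sketch_b)) / len(sketch_a)
    chance = 1 / (1 << bits)
    return max(0.0, (match_rate - chance) / (1 - chance))
```

With 128 hash functions and 2 bits each, a sketch occupies 256 bits, compared with 8,192 bits if full 64-bit minima were stored.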
Incremental Visualization for Structured Data in an Enterprise-level Data Store

Publication number: 20130268520

Abstract: The subject disclosure is directed towards simulating query execution to provide incremental visualization for a global data set. A data store may be configured for searching at least a portion of a global data set being stored at an enterprise-level data store. In response to a user-issued query, partial query results are provided to a front-end interface for display to the user. The front-end interface also provides statistical information corresponding to the partial query results in relation to the global data set, which may be used to determine when a current set of query results becomes acceptable as a true/accurate estimate.

Type: Application

Filed: April 4, 2012

Publication date: October 10, 2013

Applicant: MICROSOFT CORPORATION

Inventors: Danyel A. Fisher, Arnd Christian König, Steven M. Drucker
QUERY PROGRESS ESTIMATION

Publication number: 20130151504

Abstract: The claimed subject matter provides a method for providing a progress estimate for a database query. The method includes determining static features of a query plan for the database query. The method also includes selecting an initial progress estimator based on the static features and a trained machine learning model. The model is trained using static features of a plurality of query plans, and dynamic features of the plurality of query plans. Further, the method includes determining dynamic features of the query plan for each of a plurality of candidate estimators. Additionally, the method includes selecting a revised progress estimator based on the static features, the dynamic features and a trained machine learning model for each of the candidate estimators. The method further includes producing the progress estimate based on the revised progress estimator.

Type: Application

Filed: December 9, 2011

Publication date: June 13, 2013

Applicant: MICROSOFT CORPORATION

Inventors: Christian Konig, Bolin Ding, Surajit Chaudhuri, Vivek Narasayya
DETERMINATION OF LANDMARKS

Publication number: 20120259897

Abstract: Hash values corresponding to a file are processed in windows to determine a minimum hash value for each window. Each window may begin at a minimum hash value determined for a previous window and end after a fixed number of hash values. If a hash value is less than a threshold hash value, it is added to a buffer that is used to store the hash values in sorted order for a current window. If a hash value is greater than the threshold, it is added to another buffer whose hash values are not stored in sorted order. At the end of the current window, the minimum hash value in the first buffer is selected as the landmark for the window. If the first buffer is empty, then the hash values in the other buffer are sorted and the minimum hash value is selected as the landmark for the window.

Type: Application

Filed: April 7, 2011

Publication date: October 11, 2012

Applicant: Microsoft Corporation

Inventors: Mark S. Manasse, Arnd Christian König, Paul Adrian Oltean
ESTIMATING DOCUMENT SIMILARITY USING BIT-STRINGS

Publication number: 20120213313

Abstract: Each of a plurality of documents is divided into samples. Small bit-strings are generated for selected samples from each of the documents and used to create a sketch for each document. Because the bit-strings are small (e.g., only one, two, or three bits in length), the generated sketches are smaller than the sketches generated using previous methods for generating sketches, and therefore use less storage space. The generated sketches are compared to determine documents that are near-duplicates of one another.

Type: Application

Filed: February 21, 2011

Publication date: August 23, 2012

Applicant: Microsoft Corporation

Inventors: Mark S. Manasse, Arnd Christian König
LOCAL SEARCH USING FEATURE BACKOFF

Publication number: 20120158705

Abstract: A local search system is described herein that provides a framework for the integration of various external sources to improve local search ranking. The framework provided by the local search system described herein uses a notion of backoff. The system uses a generalization of the concept of backoff to improve local search results that incorporate a variety of data features. The system can apply backoff in multiple dimensions at the same time to generate features for local search ranking. The system integrates various additional data sources, such as web access logs, driving direction request logs, reviews, and so forth, to quantify popularity and distance (or distance sensitivity) into a framework for local search ranking. Thus, the system provides search results that are more relevant by incorporating a number of data sources into the ranking in a manner that handles abnormalities in the data well.

Type: Application

Filed: December 16, 2010

Publication date: June 21, 2012

Applicant: MICROSOFT CORPORATION

Inventors: Arnd Christian Konig, Klaus L. Berberich, Dimitrios Lymberopoulos
GENERAL PURPOSE CORRECTION OF GRAMMATICAL AND WORD USAGE ERRORS

Publication number: 20120089387

Abstract: Architecture that detects and corrects writing errors in a human language based on the utilization of three different stages: error detection, correction candidate generation, and correction candidate ranking. The architecture is a generic framework for generating fluent alternatives to non-grammatical word sequences in a written sample. Error detection is addressed by a suite of language model related scores and other scores such as parse scores that can identify a particularly unlikely sequence of words. Correction candidate generation is addressed by a lookup in a very large corpus of “correct” English that looks for alternative arrangements of the same or similar words or subsequences of these words in the same context. Correction candidate ranking is addressed by a language model ranker.

Type: Application

Filed: December 7, 2010

Publication date: April 12, 2012

Applicant: Microsoft Corporation

Inventors: Michael Gamon, Christian König
CONTAINMENT COEFFICIENT FOR IDENTIFYING TEXTUAL SUBSETS

Publication number: 20120051657

Abstract: Similarity is determined between documents based on a method for identifying documents that are likely to be based on another document. The method can include the determination of a containment coefficient, which can indicate when a template document is a subset or substantially a subset of another document. Based on this determination, an appropriate document management action can be taken, such as implementing a security policy or modifying the display of messages from a user interface.

Type: Application

Filed: August 30, 2010

Publication date: March 1, 2012

Applicant: MICROSOFT CORPORATION

Inventors: Charles Lamanna, Raja Charu Vikram Kakumani, Vidyaraman Sankaranarayanan, Arnd Christian König
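A containment coefficient of the kind mentioned in the abstract above can be sketched as the fraction of a template's word shingles that also occur in another document. The use of 3-word shingles and the function names are assumptions for illustration; the abstract does not say how the documents are represented.

```python
def shingles(text, k=3):
    """Set of k-word shingles (as tuples) from lowercased, whitespace-split text."""
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def containment(template, document, k=3):
    """|S(template) & S(document)| / |S(template)|, in the range 0 to 1.

    A value near 1 suggests the template is (substantially) a subset of
    the document. The measure is asymmetric, unlike Jaccard resemblance.
    """
    template_shingles = shingles(template, k)
    if not template_shingles:
        return 0.0
    return len(template_shingles & shingles(document, k)) / len(template_shingles)
```

The asymmetry is the point: a short template embedded in a long message scores 1.0 in one direction and much lower in the other.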
FAST SET INTERSECTION

Publication number: 20110314045

Abstract: Described is a fast set intersection technology by which sets of elements to be intersected are maintained as partitioned subsets (small groups) in data structures, along with representative values (e.g., one or more hash signatures) representing those subsets. A mathematical operation (e.g., bitwise-AND) on the representative values indicates whether an intersection of range-overlapping subsets will be empty, without having to perform the intersection operation. If so, the intersection operation on those subsets may be skipped, with intersection operations (possibly guided by inverted mappings or using a linear scan) performed only on overlapping subsets that may have one or more intersecting elements.

Type: Application

Filed: June 21, 2010

Publication date: December 22, 2011

Applicant: Microsoft Corporation

Inventors: Arnd Christian König, Bolin Ding
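The skip-on-empty-signature idea from the abstract above can be sketched as follows. This is a simplified variant under assumptions: sets hold non-negative integers, both sets are partitioned into small groups by the same modulus, and each group carries a 64-bit signature. The names `build_set_index` and `intersect` are hypothetical, and the partitioning scheme in the patent may differ.

```python
from collections import defaultdict

WORD_BITS = 64  # width of each group's signature

def build_set_index(elements, num_groups=16):
    """Partition non-negative integers into groups with a bit signature each.

    Returns a mapping: group id -> [signature, members].
    """
    groups = defaultdict(lambda: [0, set()])
    for x in elements:
        entry = groups[x % num_groups]
        entry[0] |= 1 << ((x // num_groups) % WORD_BITS)  # set this element's bit
        entry[1].add(x)
    return groups

def intersect(index_a, index_b):
    """Intersect two indexed sets, skipping groups whose signatures share no bit."""
    result = set()
    for group_id, (sig_a, members_a) in index_a.items():
        if group_id not in index_b:
            continue
        sig_b, members_b = index_b[group_id]
        if sig_a & sig_b == 0:
            continue  # bitwise-AND proves the group intersection is empty
        result |= members_a & members_b
    return result
```

A common element always lands in the same group and sets the same signature bit in both indexes, so a zero AND can never hide a real match; a non-zero AND may be a false positive and is resolved by the actual intersection.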
Leveraging cross-document context to label entity

Patent number: 7970808

Abstract: Entities, such as people, places and things, are labeled based on information collected across a possibly large number of documents. One or more documents are scanned to recognize the entities, and features are extracted from the context in which those entities occur in the documents. Observed entity-feature pairs are stored either in an in-memory store or an external store. A store manager optimizes use of the limited amount of space for an in-memory store by determining which store to put an entity-feature pair in, and when to evict features from the in-memory store to make room for new pairs. Features that may be observed in an entity's context may take forms such as specific word sequences or membership in a particular list.

Type: Grant

Filed: May 5, 2008

Date of Patent: June 28, 2011

Assignee: Microsoft Corporation

Inventors: Arnd Christian Konig, Venkatesh Ganti
QUERY CLASSIFICATION USING SEARCH RESULT TAG RATIOS

Publication number: 20110125791

Abstract: Techniques are described herein for classifying a search query with respect to query intent using search result tag ratios. A tag is a character or a combination of characters (e.g., one or more words) that indicates a property of a document, such as a topic of the document, a type of entity (i.e., subject matter) the document references, etc. A search result tag ratio is defined as a fraction (e.g., a proportion, a percentage, etc.) of the documents in a search result that includes a respective tag. A search query may be classified based on back-off ratios, which are tag ratios of search queries that are related to the search query to be classified. Tag ratios may be pre-computed (i.e., calculated before the corresponding search queries are received from users).

Type: Application

Filed: November 25, 2009

Publication date: May 26, 2011

Applicant: Microsoft Corporation

Inventors: Arnd Christian Konig, Venkatesh Ganti, Xiao Li
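The tag ratio defined in the abstract above (the fraction of documents in a search result that include a given tag) is straightforward to compute. In the sketch below, the input shape and the function names are assumptions, and averaging the ratios of related queries is only one plausible way to form a back-off ratio.

```python
from collections import Counter

def tag_ratios(result_tags):
    """Fraction of result documents carrying each tag.

    `result_tags` is a list with one collection of tags per document.
    """
    if not result_tags:
        return {}
    counts = Counter(tag for tags in result_tags for tag in set(tags))
    total = len(result_tags)
    return {tag: count / total for tag, count in counts.items()}

def backoff_ratio(tag, related_ratios):
    """Hypothetical back-off: mean ratio of `tag` over related queries."""
    if not related_ratios:
        return 0.0
    return sum(r.get(tag, 0.0) for r in related_ratios) / len(related_ratios)
```

Because the ratios depend only on the query and its results, they can be pre-computed and looked up when the query arrives.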
Reducing human overhead in text categorization

Patent number: 7894677

Abstract: A unique multi-stage classification system and method that facilitates reducing human resources or costs associated with text classification while still obtaining a desired level of accuracy is provided. The multi-stage classification system and method involve a pattern-based classifier and a machine learning classifier. The pattern-based classifier is trained on discriminative patterns as identified by humans rather than machines, which allows a smaller training set to be employed. Given humans' superior abilities to reason over text, discriminative patterns can be more accurately and more readily identified by them. Unlabeled items can be initially processed by the pattern-based classifier and if no pattern match exists, then the unlabeled data can be processed by the machine learning classifier. By employing the classifiers in this manner, less human involvement is required in the classification process. Even more, classification accuracy is maintained and/or improved.

Type: Grant

Filed: February 9, 2006

Date of Patent: February 22, 2011

Assignee: Microsoft Corporation

Inventors: Arnd Christian König, Eric D. Brill
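The two-stage flow in the abstract above (try human-identified patterns first, then fall back to a learned classifier when nothing matches) can be sketched like this. The function name, the use of regular expressions for patterns, and the `fallback` callable standing in for a trained model are all assumptions.

```python
import re

def classify_text(text, patterns, fallback):
    """Return (label, stage) for `text`.

    `patterns` is a list of (regex, label) pairs supplied by a human;
    `fallback` is any callable mapping text to a label, standing in for
    a machine learning classifier.
    """
    for pattern, label in patterns:
        if re.search(pattern, text, flags=re.IGNORECASE):
            return label, "pattern"
    # No pattern matched, so defer to the learned classifier.
    return fallback(text), "model"
```

Returning the stage alongside the label makes it easy to measure how much traffic the hand-written patterns absorb.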
CLICK-THROUGH PREDICTION FOR NEWS QUERIES

Publication number: 20100299350

Abstract: Described is estimating whether an online search query is a news-related query, and if so, outputting news-related results in association with other search results returned in response to the query. The query is processed into features, including by accessing corpora that correspond to relatively current events, e.g., recently crawled from news and blog articles. A corpus of static reference data, such as an online encyclopedia, may be used to help determine whether the query is less likely to be about current events. Features include frequency-related data and context-related data corresponding to frequency and context information maintained in the corpora. Additional features may be obtained by processing text of the query itself, e.g., “query-only” features.

Type: Application

Filed: May 21, 2009

Publication date: November 25, 2010

Applicant: Microsoft Corporation

Inventors: Arnd Christian Konig, Michael Gamon, Qiang Wu, Roger P. Menezes, Monwhea Jeng
Database configuration analysis

Patent number: 7805443

Abstract: To determine a configuration for a database system, a plurality of queries may be sampled from a representative workload using statistical inference to compute the probability of correctly selecting one of a plurality of evaluation configurations. The probability of correctly selecting may determine which and/or how many queries to sample, and/or may be compared to a target probability threshold to determine if more queries must be sampled. The configuration from the plurality of configurations with the lowest estimated cost of executing the representative workload may be determined based on the probability of selecting correctly. Estimator variance may be reduced through a stratified sampling scheme that leverages commonality, such as an average cost of execution, between queries based on query templates. The applicability of the Central Limit Theorem may be verified and used to determine which and/or how many queries to sample.

Type: Grant

Filed: January 20, 2006

Date of Patent: September 28, 2010

Assignee: Microsoft Corporation

Inventors: Arnd Christian Konig, Shubha Umesh Nabar
Air humidification for fuel cell applications

Patent number: 7651799

Abstract: A system and method for improving air humidification for fuel cell applications includes a fuel cell stack having a cathode inlet and a cathode outlet. The cathode inlet receives an oxidant. A humidifier humidifies the oxidant prior to delivery of the oxidant to the cathode inlet. An injection nozzle is provided, and a volume of water substantially vaporized by the injection nozzle reduces a temperature of the oxidant and increases a water transfer rate of the humidifier. The injection nozzle can be positioned either directly upstream of the humidifier in the oxidant inlet line or in a stack cathode outlet line which is directed into the humidifier.

Type: Grant

Filed: December 20, 2004

Date of Patent: January 26, 2010

Inventors: Detlef Günther, Christian König, John Ruhl
