Patents by Inventor Evgeny Matusov

Evgeny Matusov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

TEXT SEGMENTATION AND LABEL ASSIGNMENT WITH USER INTERACTION BY MEANS OF TOPIC SPECIFIC LANGUAGE MODELS AND TOPIC-SPECIFIC LABEL STATISTICS

Publication number: 20120095751

Abstract: The invention relates to a method, a computer program product, a segmentation system and a user interface for structuring an unstructured text by making use of statistical models trained on annotated training data. The method performs text segmentation into text sections and assigns labels to text sections as section headings. The performed segmentation and assignment is provided to a user for general review. Additionally, alternative segmentations and label assignments are provided to the user being capable to select alternative segmentations and alternative labels as well as to enter a user defined segmentation and user defined label. In response to the modifications introduced by the user, a plurality of different actions are initiated incorporating the re-segmentation and re-labeling of successive parts of the document or the entire document.

Type: Application

Filed: August 15, 2011

Publication date: April 19, 2012

Applicant: Nuance Communications Austria GmbH

Inventors: Jochen PETERS, Evgeny MATUSOV, Carsten MEYER, Dietrich KLAKOW

Topic specific models for text formatting and speech recognition

Patent number: 8041566

Abstract: The present invention relates to a method, a computer system and a computer program product for speech recognition and/or text formatting by making use of topic specific statistical models. A text document which may be obtained from a first speech recognition pass is subject to segmentation and to an assignment of topic specific models for each obtained section. Each model of the set of models provides statistic information about language model probabilities, about text processing or formatting rules, as e.g. the interpretation of commands for punctuation, formatting, text highlighting or of ambiguous text portions requiring specific formatting, as well as a specific vocabulary being characteristic for each section of the recognized text. Furthermore, other properties of a speech recognition and/or formatting system (such as e.g. settings for the speaking rate) may be encoded in the statistical models. The models themselves are generated on the basis of annotated training data and/or by manual coding.

Type: Grant

Filed: November 12, 2004

Date of Patent: October 18, 2011

Assignee: Nuance Communications Austria GmbH

Inventors: Jochen Peters, Evgeny Matusov, Carsten Meyer, Dietrich Klakow

Text Segmentation and Label Assignment with User Interaction by Means of Topic Specific Language Models and Topic-Specific Label Statistics

Publication number: 20080201130

Abstract: The invention relates to a method, a computer program product, a segmentation system and a user interface for structuring an unstructured text by making use of statistical models trained on annotated training data. The method performs text segmentation into text sections and assigns labels to text sections as section headings. The performed segmentation and assignment is provided to a user for general review. Additionally, alternative segmentations and label assignments are provided to the user being capable to select alternative segmentations and alternative labels as well as to enter a user defined segmentation and user defined label. In response to the modifications introduced by the user, a plurality of different actions are initiated incorporating the re-segmentation and re-labelling of successive parts of the document or the entire document.

Type: Application

Filed: November 12, 2004

Publication date: August 21, 2008

Applicant: KONINKLIJKE PHILIPS ELECTRONICS, N.V.

Inventors: Jochen Peters, Evgeny Matusov, Carsten Meyer, Dietrich Klakow

Automatic Text Correction

Publication number: 20070299664

Abstract: The present invention provides a method of generating text transformation rules for speech to text transcription systems. The text transformation rules are generated by means of comparing an erroneous text generated by a speech to text transcription system with a correct reference text. Comparison of erroneous and reference text allows to derive a set of text transformation rules that are evaluated by means of a strict application to the training text and successive comparison with the reference text. Evaluation of text transformation rules provides a sufficient approach to determine which of the automatically generated text transformation rules provide an enhancement or degradation of the erroneous text. In this way only those text transformation rules of the set of text transformation rules are selected that guarantee an enhancement of the erroneous text. In this way systematic errors of an automatic speech recognition or natural language process system can be effectively compensated.

Type: Application

Filed: September 28, 2005

Publication date: December 27, 2007

Applicant: KONINKLIJKE PHILIPS ELECTRONICS, N.V.

Inventors: Jochen Peters, Evgeny Matusov
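
The rule generation and selection described in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the word-level alignment via Python's `difflib`, the word edit distance used as the error measure, and the toy sentences are all assumptions made for illustration, and every function name is hypothetical.

```python
from difflib import SequenceMatcher

# Toy training data (hypothetical): recognizer output and the correct reference.
ERRONEOUS = [
    "patient has new monia",
    "no new monia today",
    "he takes to tablets daily",
    "refer to the chart",
    "return to the clinic",
]
REFERENCE = [
    "patient has pneumonia",
    "no pneumonia today",
    "he takes two tablets daily",
    "refer to the chart",
    "return to the clinic",
]

def candidate_rules(erroneous, reference):
    """Derive (wrong phrase, right phrase) rules from aligned word sequences."""
    rules = set()
    for err, ref in zip(erroneous, reference):
        e, r = err.split(), ref.split()
        matcher = SequenceMatcher(a=e, b=r, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "replace":
                rules.add((" ".join(e[i1:i2]), " ".join(r[j1:j2])))
    return rules

def apply_rule(text, rule):
    """Strictly apply a rule: rewrite every occurrence of the wrong phrase."""
    src, tgt = rule[0].split(), rule[1].split()
    words, out, i = text.split(), [], 0
    while i < len(words):
        if words[i:i + len(src)] == src:
            out.extend(tgt)
            i += len(src)
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)

def word_errors(hyp, ref):
    """Word-level edit distance between a hypothesis and its reference."""
    h, r = hyp.split(), ref.split()
    prev = list(range(len(r) + 1))
    for i, hw in enumerate(h, 1):
        cur = [i]
        for j, rw in enumerate(r, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (hw != rw)))
        prev = cur
    return prev[-1]

def select_rules(rules, erroneous, reference):
    """Keep only rules whose strict application lowers the total error count."""
    def total_errors(texts):
        return sum(word_errors(t, r) for t, r in zip(texts, reference))

    baseline = total_errors(erroneous)
    kept = [
        rule for rule in rules
        if total_errors([apply_rule(t, rule) for t in erroneous]) < baseline
    ]
    return sorted(kept)

rules = candidate_rules(ERRONEOUS, REFERENCE)
selected = select_rules(rules, ERRONEOUS, REFERENCE)
print(selected)
```

On this toy data the rule that rewrites "to" as "two" fixes one sentence but damages two others, so it is discarded, while the "new monia" rule is kept. A real system would presumably evaluate rules on far more text and might condition them on context.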

TOPIC SPECIFIC MODELS FOR TEXT FORMATTING AND SPEECH RECOGNITION

Publication number: 20070271086

Abstract: The present invention relates to a method, a computer system and a computer program product for speech recognition and/or text formatting by making use of topic specific statistical models. A text document which may be obtained from a first speech recognition pass is subject to segmentation and to an assignment of topic specific models for each obtained section. Each model of the set of models provides statistic information about language model probabilities, about text processing or formatting rules, as e.g. the interpretation of commands for punctuation, formatting, text highlighting or of ambiguous text portions requiring specific formatting, as well as a specific vocabulary being characteristic for each section of the recognized text. Furthermore, other properties of a speech recognition and/or formatting system (such as e.g. settings for the speaking rate) may be encoded in the statistical models. The models themselves are generated on the basis of annotated training data and/or by manual coding.

Type: Application

Filed: November 12, 2004

Publication date: November 22, 2007

Applicant: KONINKLIJKE PHILIPS ELECTRONICS, N.V.

Inventors: Jochen PETERS, Evgeny MATUSOV, Carsten MEYER, Dietrich KLAKOW

Text Segmentation and Topic Annotation for Document Structuring

Publication number: 20070260564

Abstract: The invention relates to a method, a computer program product and a computer system for structuring an unstructured text by making use of statistical models trained on annotated training data. Each section of text in which the text is segmented is further assigned to a topic which is associated to a set of labels. The statistical models for the segmentation of the text and for the assignment of a topic and its associated labels to a section of text explicitly accounts for: correlations between a section of text and a topic, a topic transition between sections, a topic position within the document and a (topic-dependent) section length. Hence structural information of the training data is exploited in order to perform segmentation and annotation of unknown text.

Type: Application

Filed: November 12, 2004

Publication date: November 8, 2007

Applicant: Koninklijke Philips Electronics N.V.

Inventors: Jochen Peters, Carsten Meyer, Dietrich Klakow, Evgeny Matusov
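
As a rough illustration of the abstract above, the sketch below segments a short text and labels each section with a topic by combining three of the named factors: how well the section text fits a topic, the transition between topics, and a topic-dependent section length. It is a sketch under stated assumptions, not the patented method: the unigram word probabilities, the Poisson length model, the transition table, and all names are hypothetical, and the topic-position factor is left out for brevity.

```python
import math

# Hypothetical toy models; a real system would estimate these from annotated data.
TOPICS = ["history", "findings"]
WORD_P = {
    "history": {"patient": 0.3, "reports": 0.3, "cough": 0.2},
    "findings": {"lungs": 0.3, "clear": 0.3, "xray": 0.2},
}
TRANS = {
    ("<start>", "history"): 0.9, ("<start>", "findings"): 0.1,
    ("history", "findings"): 0.9, ("history", "history"): 0.1,
    ("findings", "history"): 0.1, ("findings", "findings"): 0.9,
}
MEAN_LEN = {"history": 2, "findings": 2}  # expected sentences per section

def length_logp(k, mean):
    """Log Poisson probability of a section with k sentences (an assumption)."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)

def emission_logp(sentence, topic, floor=0.01):
    """Unigram log-probability of a sentence under a topic, with a floor."""
    return sum(math.log(WORD_P[topic].get(w, floor)) for w in sentence.split())

def segment(sentences):
    """Return the best-scoring list of (start, end, topic) sections."""
    if not sentences:
        return []
    n = len(sentences)
    # best[(j, t)] = (score, back-pointer) for sentences[:j] ending in topic t
    best = {(0, "<start>"): (0.0, None)}
    for j in range(1, n + 1):
        for t in TOPICS:
            candidates = []
            for i in range(j):
                emit = sum(emission_logp(s, t) for s in sentences[i:j])
                prevs = ["<start>"] if i == 0 else TOPICS
                for prev in prevs:
                    score = (best[(i, prev)][0]
                             + math.log(TRANS[(prev, t)])
                             + length_logp(j - i, MEAN_LEN[t])
                             + emit)
                    candidates.append((score, (i, prev)))
            best[(j, t)] = max(candidates, key=lambda c: c[0])
    # Follow back-pointers from the best final state.
    state = max(((n, t) for t in TOPICS), key=lambda s: best[s][0])
    sections = []
    while state[0] > 0:
        back = best[state][1]
        sections.append((back[0], state[0], state[1]))
        state = back
    return sections[::-1]

SENTENCES = ["patient reports cough", "cough reports", "lungs clear", "xray clear"]
print(segment(SENTENCES))
```

With these toy numbers the search groups the first two sentences under "history" and the last two under "findings". The dynamic program considers every split point and topic, so its cost grows quadratically with the number of sentences; longer documents would likely need a cap on section length.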
