Patents by Inventor Siegfried Kunzmann
Siegfried Kunzmann has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11915690
Abstract: A multi-channel transformer acoustic model that processes a plurality of audio signals output by microphones of a microphone array and outputs probabilities for acoustic units of an utterance represented in the audio signals. The audio signals represent the individual microphones' respective capturing of the utterance. The multi-channel model may perform self-attention on embeddings of the audio signals and then cross-channel attention across the attended audio signals. The cross-channel attention may involve processing of signals relative to each other to model the relationships across channels within and across time frames. The multi-channel model may include a transducer to perform processing frame-by-frame.
Type: Grant
Filed: September 29, 2021
Date of Patent: February 27, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Feng-Ju Chang, Martin Radfar, Athanasios Mouchtaris, Brian King, Siegfried Kunzmann, Maurizio Omologo
-
Patent number: 8412528
Abstract: The present invention relates to computer-generated text-to-speech conversion. It relates in particular to a method and system for updating a Concatenative Text-To-Speech (CTTS) system with a speech database from a base version to a new version. The present invention performs an application-specific re-organization of a synthesizer's speech database by means of certain decision tree modifications. By that reorganization, certain synthesis units are made available for the new application, which are not available in prior art without a new speech session. This allows the creation of application-specific synthesizers with improved output speech quality for arbitrary domains and applications at very low cost.
Type: Grant
Filed: May 2, 2006
Date of Patent: April 2, 2013
Assignee: Nuance Communications, Inc.
Inventors: Volker Fischer, Siegfried Kunzmann
-
Patent number: 7302393
Abstract: A method and respective system for operating a speech recognition system, in which a plurality of recognizer programs are accessible to be activated for speech recognition, and are combined on a per need basis in order to efficiently improve the results of speech recognition done by a single recognizer. In order to adapt such system to the dynamically changing acoustic conditions of various operating environments and to the particular requirements of running in embedded systems having only a limited computing power available, it is proposed to a) collect selection base data characterizing speech recognition boundary conditions, e.g. the speaker person and the environmental noise, etc., with sensor means, and b) using program-controlled arbiter means for evaluating the collected data, e.g., a decision engine including software mechanism and a physical sensor, to select the best suited recognizer or a combination thereof out of the plurality of available recognizers.
Type: Grant
Filed: October 31, 2003
Date of Patent: November 27, 2007
Assignee: International Business Machines Corporation
Inventors: Volker Fischer, Siegfried Kunzmann
-
Patent number: 7213151
Abstract: The present invention relates to a computer system and to a method for encoding of information into a representation comprising a plurality of segments, the order of the segments in the representation being irrelevant for a rendering of the representation, the method comprising the steps of: identification of the segments, permutation of the segments to encode the information.
Type: Grant
Filed: June 27, 2002
Date of Patent: May 1, 2007
Assignee: International Business Machines Corporation
Inventors: Carsten Guenther, Werner Kriechbaum, Siegfried Kunzmann, Bernhard Hubert Zeller
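Illustration (not part of the patent record): the abstract above describes hiding information in the order of segments whose order does not affect rendering. One way such a scheme could work is sketched below, under the assumption that the segments are distinct and sortable, by mapping an integer to a permutation through the factorial number system. The function names `encode` and `decode` are hypothetical, and the patent may well use a different mapping.

```python
from math import factorial


def encode(segments, value):
    """Return an ordering of `segments` that encodes the integer `value`.

    Assumes the segments are distinct and sortable, so that encoder and
    decoder can agree on a canonical (sorted) reference order.
    """
    pool = sorted(segments)
    n = len(pool)
    if not 0 <= value < factorial(n):
        raise ValueError("value must be in the range 0 .. n! - 1")
    ordering = []
    for remaining in range(n, 0, -1):
        # Each digit of the factorial number system picks one remaining segment.
        index, value = divmod(value, factorial(remaining - 1))
        ordering.append(pool.pop(index))
    return ordering


def decode(ordering):
    """Recover the integer hidden in the order of `ordering`."""
    pool = sorted(ordering)
    value = 0
    for position, segment in enumerate(ordering):
        index = pool.index(segment)
        value += index * factorial(len(ordering) - 1 - position)
        pool.pop(index)
    return value


if __name__ == "__main__":
    hidden = encode(["body", "footer", "header", "title"], 17)
    print(hidden, decode(hidden))
```

With n distinct segments this sketch can carry any integer from 0 to n! - 1, which is roughly log2(n!) bits, without altering the segments themselves.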
-
Publication number: 20060287861
Abstract: The present invention relates to computer-generated text-to-speech conversion. It relates in particular to a method and system for updating a Concatenative Text-To-Speech (CTTS) system with a speech database from a base version to a new version. The present invention performs an application-specific re-organization of a synthesizer's speech database by means of certain decision tree modifications. By that reorganization, certain synthesis units are made available for the new application, which are not available in prior art without a new speech session. This allows the creation of application-specific synthesizers with improved output speech quality for arbitrary domains and applications at very low cost.
Type: Application
Filed: May 2, 2006
Publication date: December 21, 2006
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Volker Fischer, Siegfried Kunzmann
-
Publication number: 20060173684
Abstract: A method and respective system for operating a speech recognition system, in which a plurality of recognizer programs are accessible to be activated for speech recognition, and are combined on a per need basis in order to efficiently improve the results of speech recognition done by a single recognizer. In order to adapt such system to the dynamically changing acoustic conditions of various operating environments and to the particular requirements of running in embedded systems having only a limited computing power available, it is proposed to a) collect selection base data characterizing speech recognition boundary conditions, e.g. the speaker person and the environmental noise, etc., with sensor means, and b) using program-controlled arbiter means for evaluating the collected data, e.g., a decision engine including software mechanism and a physical sensor, to select the best suited recognizer or a combination thereof out of the plurality of available recognizers.
Type: Application
Filed: October 31, 2003
Publication date: August 3, 2006
Applicant: International Business Machines Corporation
Inventors: Volker Fischer, Siegfried Kunzmann
-
Patent number: 6999925
Abstract: The present invention provides a computerized method and apparatus for automatically generating from a first speech recognizer a second speech recognizer which can be adapted to a specific domain. The first speech recognizer can include a first acoustic model with a first decision network and corresponding first phonetic contexts. The first acoustic model can be used as a starting point for the adaptation process. A second acoustic model with a second decision network and corresponding second phonetic contexts for the second speech recognizer can be generated by re-estimating the first decision network and the corresponding first phonetic contexts based on domain-specific training data.
Type: Grant
Filed: November 13, 2001
Date of Patent: February 14, 2006
Assignee: International Business Machines Corporation
Inventors: Volker Fischer, Siegfried Kunzmann, Eric-W. Janke, A. Jon Tyrrell
-
Patent number: 6789061
Abstract: Computer-based methods and systems are provided for automatically generating, from a first speech recognizer, a second speech recognizer such that the second speech recognizer is tailored to a certain application and requires reduced resources compared to the first speech recognizer. The invention exploits the first speech recognizer's set of states s_i and set of probability density functions (pdfs) assembling output probabilities for an observation of a speech frame in said states s_i. The invention teaches a first step of generating a set of states of the second speech recognizer reduced to a subset of states of the first speech recognizer being distinctive of the certain application. The invention teaches a second step of generating a set of probability density functions of the second speech recognizer reduced to a subset of probability density functions of the first speech recognizer being distinctive of the certain application.
Type: Grant
Filed: August 14, 2000
Date of Patent: September 7, 2004
Assignee: International Business Machines Corporation
Inventors: Volker Fischer, Siegfried Kunzmann, Claire Waast-Ricard
-
Patent number: 6738741
Abstract: A speech recognition system and a method executed by a speech recognition system focusing on the vocabulary of the speech recognition system and its usage during the speech recognition process is provided. A segmented vocabulary and its exploitation is provided comprising a multitude of entries wherein an entry is either identical to a legal word or a constituent of a legal word of the language, and the constituent is an arbitrary sub-component of the legal word according to the orthography. A constituent can comprise any number of characters not limited to a syllable of a legal word or a recognition unit of the speech recognition system. The vocabulary is used to recognize constituents of the vocabulary for recombination of the constituents into legal words if a constituent combination table indicates that the recognized constituents are a legal concatenation in the language.
Type: Grant
Filed: November 18, 2002
Date of Patent: May 18, 2004
Assignee: International Business Machines Corporation
Inventors: Ossama Emam, Siegfried Kunzmann
-
Publication number: 20030078778
Abstract: A speech recognition system and a method executed by a speech recognition system focusing on the vocabulary of the speech recognition system and its usage during the speech recognition process is provided. A segmented vocabulary and its exploitation is provided comprising a multitude of entries wherein an entry is either identical to a legal word or a constituent of a legal word of the language, and the constituent is an arbitrary sub-component of the legal word according to the orthography. A constituent can comprise any number of characters not limited to a syllable of a legal word or a recognition unit of the speech recognition system. The vocabulary is used to recognize constituents of the vocabulary for recombination of the constituents into legal words if a constituent combination table indicates that the recognized constituents are a legal concatenation in the language.
Type: Application
Filed: November 18, 2002
Publication date: April 24, 2003
Applicant: International Business Machines Corporation
Inventors: Ossama Emam, Siegfried Kunzmann
-
Publication number: 20030074561
Abstract: The present invention relates to a computer system and to a method for encoding of information into a representation comprising a plurality of segments, the order of the segments in the representation being irrelevant for a rendering of the representation, the method comprising the steps of:
Type: Application
Filed: June 27, 2002
Publication date: April 17, 2003
Applicant: International Business Machines Corporation
Inventors: Carsten Guenther, Werner Kriechbaum, Siegfried Kunzmann, Bernhard Hubert Zeller
-
Publication number: 20020168089
Abstract: Disclosed are a method, apparatus, and program for providing authentication of a rendered multimedia realization. A renderer and a watermark generator are integrated wherein the renderer receives a symbolic stream, e.g. in the case of a text-to-speech system a text, and generates a realization, e.g. an audio signal representing a spoken version of the text. An identification is embedded into the signal by the watermark generator using standard steganographic methods. Such a serial integration of renderer and watermark generator is applicable to all known renderers and watermarking techniques. The mechanism enables inheritance of originality of the original representation or realization to the rendered realization.
Type: Application
Filed: May 9, 2002
Publication date: November 14, 2002
Applicant: International Business Machines Corporation
Inventors: Carsten Guenther, Werner Kriechbaum, Siegfried Kunzmann, Bernhard Hubert Zeller
-
Publication number: 20020099543
Abstract: A speech recognition system and a method executed by a speech recognition system focusing on the vocabulary of the speech recognition system and its usage during the speech recognition process is provided. A segmented vocabulary and its exploitation is provided comprising a multitude of entries wherein an entry is either identical to a legal word or a constituent of a legal word of the language, and the constituent is an arbitrary sub-component of the legal word according to the orthography. A constituent can comprise any number of characters not limited to a syllable of a legal word or a recognition unit of the speech recognition system. The vocabulary is used to recognize constituents of the vocabulary for recombination of the constituents into legal words if a constituent combination table indicates that the recognized constituents are a legal concatenation in the language.
Type: Application
Filed: August 25, 1999
Publication date: July 25, 2002
Inventors: OSSAMA EMAN, SIEGFRIED KUNZMANN
-
Publication number: 20020087314
Abstract: The present invention provides a computerized method and apparatus for automatically generating from a first speech recognizer a second speech recognizer which can be adapted to a specific domain. The first speech recognizer can include a first acoustic model with a first decision network and corresponding first phonetic contexts. The first acoustic model can be used as a starting point for the adaptation process. A second acoustic model with a second decision network and corresponding second phonetic contexts for the second speech recognizer can be generated by re-estimating the first decision network and the corresponding first phonetic contexts based on domain-specific training data.
Type: Application
Filed: November 13, 2001
Publication date: July 4, 2002
Applicant: International Business Machines Corporation
Inventors: Volker Fischer, Siegfried Kunzmann, Eric-W. Janke, A. Jon Tyrrell
-
Patent number: 5899973
Abstract: In this speech recognition system, the size of the language model is reduced by discarding those n-grams that the acoustic part of the system can recognize most accurately without support from a language model. The n-grams can be discarded dynamically during the running of the system or during the build or setup-time of the system. Trigrams occurring infrequently in the text corpora are substituted for the discarded n-grams to increase the accuracy of the word recognitions.
Type: Grant
Filed: September 25, 1997
Date of Patent: May 4, 1999
Assignee: International Business Machines Corporation
Inventors: Upali Bandara, Siegfried Kunzmann, Karlheinz Mohr, Burn L. Lewis