Patents Assigned to Multimodal Technologies, Inc.

Distributed speech recognition using one way communication

Patent number: 8019608

Abstract: A speech recognition client sends a speech stream and control stream in parallel to a server-side speech recognizer over a network. The network may be an unreliable, low-latency network. The server-side speech recognizer recognizes the speech stream continuously. The speech recognition client receives recognition results from the server-side recognizer in response to requests from the client. The client may remotely reconfigure the state of the server-side recognizer during recognition if a first speech recognition result satisfies a predetermined criterion specified by the control stream.

Type: Grant

Filed: August 30, 2009

Date of Patent: September 13, 2011

Assignee: Multimodal Technologies, Inc.

Inventors: Eric Carraux, Detlef Koll

Hybrid speech recognition

Patent number: 7933777

Abstract: A hybrid speech recognition system uses a client-side speech recognition engine and a server-side speech recognition engine to produce speech recognition results for the same speech. An arbitration engine produces speech recognition output based on one or both of the client-side and server-side speech recognition results.
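
The arbitration step can be illustrated with a minimal sketch. The abstract does not say how the arbitration engine chooses between results, so the confidence-based policy below is an assumption, and `RecognitionResult` and `arbitrate` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecognitionResult:
    """One engine's hypothesis for an utterance (hypothetical structure)."""
    text: str
    confidence: float  # assumed score in [0, 1]; not specified by the abstract


def arbitrate(client: Optional[RecognitionResult],
              server: Optional[RecognitionResult]) -> Optional[str]:
    """Produce output from one or both results.

    Assumed policy: if only one result is available, use it; otherwise
    prefer the result with the higher confidence (ties go to the server).
    """
    if client is None and server is None:
        return None
    if client is None:
        return server.text
    if server is None:
        return client.text
    return client.text if client.confidence > server.confidence else server.text
```

Falling back to whichever result is available lets such a system keep producing output when one engine is slow or unreachable, which is one plausible reason for combining the two engines.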

Type: Grant

Filed: August 30, 2009

Date of Patent: April 26, 2011

Assignee: Multimodal Technologies, Inc.

Inventor: Detlef Koll

Recognition of speech in editable audio streams

Patent number: 7869996

Abstract: A speech processing system divides a spoken audio stream into partial audio streams, referred to as “snippets.” The system may divide a portion of the audio stream into two snippets at a position at which the speaker performed an editing operation, such as pausing and then resuming recording, or rewinding and then resuming recording. The snippets may be transmitted sequentially to a consumer, such as an automatic speech recognizer or a playback device, as the snippets are generated. The consumer may process (e.g., recognize or play back) the snippets as they are received. The consumer may modify its output in response to editing operations reflected in the snippets. The consumer may process the audio stream while it is being created and transmitted even if the audio stream includes editing operations that invalidate previously-transmitted partial audio streams, thereby enabling shorter turnaround time between dictation and consumption of the complete audio stream.

Type: Grant

Filed: November 23, 2007

Date of Patent: January 11, 2011

Assignee: Multimodal Technologies, Inc.

Inventors: Eric Carraux, Detlef Koll

Content-based audio playback emphasis

Patent number: 7844464

Abstract: Techniques are disclosed for facilitating the process of proofreading draft transcripts of spoken audio streams. In general, proofreading of a draft transcript is facilitated by playing back the corresponding spoken audio stream with an emphasis on those regions in the audio stream that are highly relevant or likely to have been transcribed incorrectly. Regions may be emphasized by, for example, playing them back more slowly than regions that are of low relevance and likely to have been transcribed correctly. Emphasizing those regions of the audio stream that are most important to transcribe correctly and those regions that are most likely to have been transcribed incorrectly increases the likelihood that the proofreader will accurately correct any errors in those regions, thereby improving the overall accuracy of the transcript.
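
As a rough illustration of the emphasis idea, the sketch below picks a playback rate per region. The thresholds, the rate values, and the function name `playback_rate` are assumptions made for this example; the abstract only states that emphasized regions may be played back more slowly.

```python
def playback_rate(relevance: float,
                  p_correct: float,
                  slow_rate: float = 0.75,
                  normal_rate: float = 1.0,
                  relevance_threshold: float = 0.7,
                  correct_threshold: float = 0.8) -> float:
    """Return a playback speed multiplier for one audio region.

    A region is emphasized (played slowly) when it is highly relevant or
    when it is likely to have been transcribed incorrectly. All numeric
    defaults are hypothetical.
    """
    emphasize = relevance >= relevance_threshold or p_correct < correct_threshold
    return slow_rate if emphasize else normal_rate


# Example regions as (relevance, estimated probability of correct transcription).
regions = [(0.9, 0.95), (0.2, 0.50), (0.2, 0.95)]
rates = [playback_rate(r, p) for r, p in regions]
```

Only the last region, which is both low in relevance and likely correct, is played at normal speed under these assumed thresholds.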

Type: Grant

Filed: July 22, 2005

Date of Patent: November 30, 2010

Assignee: Multimodal Technologies, Inc.

Inventors: Kjell Schubert, Juergen Fritsch, Michael Finke, Detlef Koll

Replacing text representing a concept with an alternate written form of the concept

Patent number: 7831423

Abstract: A system enables a transcriptionist to replace a first written form (such as an abbreviation) of a concept with a second written form (such as an expanded form) of the same concept. For example, the system may display to the transcriptionist a draft document produced from speech by an automatic speech recognizer. If the transcriptionist recognizes a first written form of a concept that should be replaced with a second written form of the same concept, the transcriptionist may provide the system with a replacement command. In response, the system may identify the second written form of the concept and replace the first written form with the second written form in the draft document.

Type: Grant

Filed: May 25, 2006

Date of Patent: November 9, 2010

Assignee: Multimodal Technologies, Inc.

Inventor: Kjell Schubert

Verification of extracted data

Patent number: 7716040

Abstract: Facts are extracted from speech and recorded in a document using codings. Each coding represents an extracted fact and includes a code and a datum. The code may represent a type of the extracted fact and the datum may represent a value of the extracted fact. The datum in a coding is rendered based on a specified feature of the coding. For example, the datum may be rendered as boldface text to indicate that the coding has been designated as an “allergy.” In this way, the specified feature of the coding (e.g., “allergy”-ness) is used to modify the manner in which the datum is rendered. A user inspects the rendering and provides, based on the rendering, an indication of whether the coding was accurately designated as having the specified feature. A record of the user's indication may be stored, such as within the coding itself.

Type: Grant

Filed: June 21, 2007

Date of Patent: May 11, 2010

Assignee: Multimodal Technologies, Inc.

Inventors: Detlef Koll, Michael Finke

Automatic detection and application of editing patterns in draft documents

Patent number: 7640158

Abstract: An error detection and correction system extracts editing patterns and derives correction rules from them by observing differences between draft documents and corresponding edited documents, and/or by observing editing operations performed on the draft documents to produce the edited documents. The system develops classifiers that partition the space of all possible contexts into equivalence classes and assigns one or more correction rules to each such class. Once the system has been trained, it may be used to detect and (optionally) correct errors in new draft documents. When presented with a draft document, the system identifies first content (e.g., text) in the draft document and identifies a context of the first content. The system identifies a correction rule based on the first content and the first context. The system may use a classifier to identify the correction rule. The system applies the correction rule to the first content to produce second content.

Type: Grant

Filed: November 8, 2005

Date of Patent: December 29, 2009

Assignee: Multimodal Technologies, Inc.

Inventors: Detlef Koll, Juergen Fritsch, Michael Finke

Automated extraction of semantic content and generation of a structured document from speech

Patent number: 7584103

Abstract: Techniques are disclosed for automatically generating structured documents based on speech, including identification of relevant concepts and their interpretation. In one embodiment, a structured document generator uses an integrated process to generate a structured textual document (such as a structured textual medical report) based on a spoken audio stream. The spoken audio stream may be recognized using a language model which includes a plurality of sub-models arranged in a hierarchical structure. Each of the sub-models may correspond to a concept that is expected to appear in the spoken audio stream. Different portions of the spoken audio stream may be recognized using different sub-models. The resulting structured textual document may have a hierarchical structure that corresponds to the hierarchical structure of the language sub-models that were used to generate the structured textual document.

Type: Grant

Filed: August 20, 2004

Date of Patent: September 1, 2009

Assignee: Multimodal Technologies, Inc.

Inventors: Juergen Fritsch, Michael Finke, Detlef Koll, Monika Woszczyna, Girija Yegnanarayanan

Document editing using anchors

Publication number: 20090113293

Abstract: A user edits text in a draft document by providing input including left and right “anchor” text and replacement text. In response, a document editing system identifies an instance of the left anchor text followed by the right anchor text in the draft document, and replaces text between these instances with the replacement text specified by the user. For example, the user may type a string containing the left anchor text followed by the replacement text followed by the right anchor text, in response to which the system may perform the replacement just described. As a result, the user may specify both the location of, and a correction for, text in the draft document without using cursor keys or other navigation commands to navigate to the location of the text to be corrected, thereby increasing correction efficiency by avoiding the delay associated with such manual navigation.
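
The anchor-based replacement described above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function name `apply_anchor_edit` is hypothetical, and matching the first occurrence of the left anchor and the nearest following right anchor is an assumption.

```python
def apply_anchor_edit(document: str, left: str, replacement: str, right: str) -> str:
    """Replace the text between a left anchor and the following right anchor.

    Assumed matching policy: use the first occurrence of `left`, then the
    first occurrence of `right` that starts after it. If either anchor is
    missing, the document is returned unchanged.
    """
    left_start = document.find(left)
    if left_start == -1:
        return document
    left_end = left_start + len(left)

    right_start = document.find(right, left_end)
    if right_start == -1:
        return document

    # Keep both anchors; swap only the text between them.
    return document[:left_end] + replacement + document[right_start:]


draft = "The patient has a history of hypertension and diabetes."
edited = apply_anchor_edit(draft, "history of ", "hypotension", " and")
# edited == "The patient has a history of hypotension and diabetes."
```

The caller supplies both the location (via the anchors) and the correction in one call, so no cursor navigation is needed.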

Type: Application

Filed: August 19, 2007

Publication date: April 30, 2009

Applicant: Multimodal Technologies, Inc.

Inventor: Kjell Schubert

Audio signal de-identification

Patent number: 7502741

Abstract: Techniques are disclosed for automatically de-identifying spoken audio signals. In particular, techniques are disclosed for automatically removing personally identifying information from spoken audio signals and replacing such information with non-personally identifying information. De-identification of a spoken audio signal may be performed by automatically generating a report based on the spoken audio signal. The report may include concept content (e.g., text) corresponding to one or more concepts represented by the spoken audio signal. The report may also include timestamps indicating temporal positions of speech in the spoken audio signal that corresponds to the concept content. Concept content that represents personally identifying information is identified. Audio corresponding to the personally identifying concept content is removed from the spoken audio signal. The removed audio may be replaced with non-personally identifying audio.
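
A minimal sketch of the timestamp-driven removal step follows. The data layout is assumed for illustration: audio is a plain list of samples, each concept carries millisecond timestamps and a flag marking it as personally identifying, and the replacement audio is a constant filler value. `ConceptSpan` and `deidentify` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConceptSpan:
    """Concept content from the report plus its position in the audio."""
    text: str
    start_ms: int
    end_ms: int
    is_identifying: bool  # assumed to be set by an earlier classification step


def deidentify(samples: List[int], sample_rate: int,
               spans: List[ConceptSpan], filler: int = 0) -> List[int]:
    """Return a copy of `samples` with identifying spans overwritten by `filler`."""
    out = list(samples)
    for span in spans:
        if not span.is_identifying:
            continue
        # Convert timestamps to sample indices and clamp to the signal length.
        first = max(0, span.start_ms * sample_rate // 1000)
        last = min(len(out), span.end_ms * sample_rate // 1000)
        for i in range(first, last):
            out[i] = filler
    return out


audio = [1] * 10  # 1 second of audio at a toy rate of 10 samples per second
spans = [ConceptSpan("John Doe", 200, 500, True),
         ConceptSpan("fever", 700, 900, False)]
clean = deidentify(audio, 10, spans)
# clean == [1, 1, 0, 0, 0, 1, 1, 1, 1, 1]
```

Overwriting in place keeps the signal length unchanged, so the timestamps of the remaining concepts stay valid after de-identification.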

Type: Grant

Filed: February 23, 2005

Date of Patent: March 10, 2009

Assignee: Multimodal Technologies, Inc.

Inventors: Michael Finke, Detlef Koll

Attribute-based word modeling

Patent number: 6963837

Abstract: An attribute-based speech recognition system is described. A speech pre-processor receives input speech and produces a sequence of acoustic observations representative of the input speech. A database of context-dependent acoustic models characterizes a probability of a given sequence of sounds producing the sequence of acoustic observations. Each acoustic model includes phonetic attributes and suprasegmental non-phonetic attributes. A finite state language model characterizes a probability of a given sequence of words being spoken. A one-pass decoder compares the sequence of acoustic observations to the acoustic models and the language model, and outputs at least one word sequence representative of the input speech.

Type: Grant

Filed: October 6, 2000

Date of Patent: November 8, 2005

Assignee: Multimodal Technologies, Inc.

Inventors: Michael Finke, Juergen Fritsch, Detlef Koll, Alex Waibel