Patents by Inventor Detlef Koll

Detlef Koll has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8335688
    Abstract: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.
    Type: Grant
    Filed: August 20, 2004
    Date of Patent: December 18, 2012
    Assignee: Multimodal Technologies, LLC
    Inventors: Girija Yegnanarayanan, Michael Finke, Juergen Fritsch, Detlef Koll, Monika Woszczyna
  • Publication number: 20120316871
    Abstract: An automatic speech recognition system includes an audio capture component, a speech recognition processing component, and a result processing component which are distributed among two or more logical devices and/or two or more physical devices. In particular, the audio capture component may be located on a different logical device and/or physical device from the result processing component. For example, the audio capture component may be on a computer connected to a microphone into which a user speaks, while the result processing component may be on a terminal server which receives speech recognition results from a speech recognition processing server.
    Type: Application
    Filed: June 8, 2012
    Publication date: December 13, 2012
    Inventors: Detlef Koll, Michael Finke
  • Publication number: 20120303365
    Abstract: Techniques are disclosed for automatically de-identifying spoken audio signals. In particular, techniques are disclosed for automatically removing personally identifying information from spoken audio signals and replacing such information with non-personally identifying information. De-identification of a spoken audio signal may be performed by automatically generating a report based on the spoken audio signal. The report may include concept content (e.g., text) corresponding to one or more concepts represented by the spoken audio signal. The report may also include timestamps indicating temporal positions of speech in the spoken audio signal that corresponds to the concept content. Concept content that represents personally identifying information is identified. Audio corresponding to the personally identifying concept content is removed from the spoken audio signal. The removed audio may be replaced with non-personally identifying audio.
    Type: Application
    Filed: November 23, 2011
    Publication date: November 29, 2012
    Inventors: Michael Finke, Detlef Koll
  • Patent number: 8321199
    Abstract: Facts are extracted from speech and recorded in a document using codings. Each coding represents an extracted fact and includes a code and a datum. The code may represent a type of the extracted fact and the datum may represent a value of the extracted fact. The datum in a coding is rendered based on a specified feature of the coding. For example, the datum may be rendered as boldface text to indicate that the coding has been designated as an “allergy.” In this way, the specified feature of the coding (e.g., “allergy”-ness) is used to modify the manner in which the datum is rendered. A user inspects the rendering and provides, based on the rendering, an indication of whether the coding was accurately designated as having the specified feature. A record of the user's indication may be stored, such as within the coding itself.
    Type: Grant
    Filed: April 30, 2010
    Date of Patent: November 27, 2012
    Assignee: Multimodal Technologies, LLC
    Inventors: Detlef Koll, Michael Finke
  • Publication number: 20120296644
    Abstract: A hybrid speech recognition system uses a client-side speech recognition engine and a server-side speech recognition engine to produce speech recognition results for the same speech. An arbitration engine produces speech recognition output based on one or both of the client-side and server-side speech recognition results.
    Type: Application
    Filed: August 1, 2012
    Publication date: November 22, 2012
    Inventor: Detlef Koll
  • Publication number: 20120296645
    Abstract: A speech recognition client sends a speech stream and control stream in parallel to a server-side speech recognizer over a network. The network may be an unreliable, low-latency network. The server-side speech recognizer recognizes the speech stream continuously. The speech recognition client receives recognition results from the server-side recognizer in response to requests from the client. The client may remotely reconfigure the state of the server-side recognizer during recognition.
    Type: Application
    Filed: August 1, 2012
    Publication date: November 22, 2012
    Inventors: Eric Carraux, Detlef Koll
  • Publication number: 20120296639
    Abstract: Facts are extracted from speech and recorded in a document using codings. Each coding represents an extracted fact and includes a code and a datum. The code may represent a type of the extracted fact and the datum may represent a value of the extracted fact. The datum in a coding is rendered based on a specified feature of the coding. For example, the datum may be rendered as boldface text to indicate that the coding has been designated as an “allergy.” In this way, the specified feature of the coding (e.g., “allergy”-ness) is used to modify the manner in which the datum is rendered. A user inspects the rendering and provides, based on the rendering, an indication of whether the coding was accurately designated as having the specified feature. A record of the user's indication may be stored, such as within the coding itself.
    Type: Application
    Filed: August 1, 2012
    Publication date: November 22, 2012
    Inventors: Detlef Koll, Michael Finke
  • Patent number: 8249877
    Abstract: A hybrid speech recognition system uses a client-side speech recognition engine and a server-side speech recognition engine to produce speech recognition results for the same speech. An arbitration engine produces speech recognition output based on one or both of the client-side and server-side speech recognition results.
    Type: Grant
    Filed: September 24, 2010
    Date of Patent: August 21, 2012
    Assignee: Multimodal Technologies, LLC
    Inventor: Detlef Koll
  • Patent number: 8249878
    Abstract: A speech recognition client sends a speech stream and control stream in parallel to a server-side speech recognizer over a network. The network may be an unreliable, low-latency network. The server-side speech recognizer recognizes a first portion of the speech stream and, if a predetermined criterion is satisfied by the speech recognition result, waits until the speech recognizer has been reconfigured before recognizing a second portion of the speech stream. The speech recognition client receives recognition results from the server-side recognizer in response to requests from the client. The client may remotely reconfigure the state of the server-side recognizer during recognition.
    Type: Grant
    Filed: August 2, 2011
    Date of Patent: August 21, 2012
    Assignee: Multimodal Technologies, LLC
    Inventors: Eric Carraux, Detlef Koll
  • Publication number: 20120089629
    Abstract: A system includes a document corpus containing structured documents, which contain both text and annotations of the text. The system also includes a search engine which is adapted to perform structured searches of the structured documents. As new types of annotations are added to the system, the search engine is updated automatically to become capable of performing structured searches for the new types of annotations. For example, if a new natural language processing (NLP) component, adapted to generate annotations of a new type, is added to the system, then the system automatically updates a query language to include a definition of the new type of annotation. The search engine may then immediately be capable of processing structured queries which refer to the new type of annotation.
    Type: Application
    Filed: October 8, 2011
    Publication date: April 12, 2012
    Inventors: Detlef Koll, Juergen Fritsch
  • Publication number: 20120078763
    Abstract: A system applies rules to a set of documents to generate codes, such as billing codes for use in medical billing. A human operator provides input specifying whether the generated codes are correct. Based on the input from the human operator, the system attempts to identify which of the rule clause(s) relied on to generate a particular code are correct and which are incorrect. The system then assigns praise to components of the system responsible for generating codes in the correct clauses, and assigns blame to components of the system responsible for generating codes in the incorrect clauses. Such blame and praise may then be used to determine whether particular code-generating components are insufficiently reliable. The system may disable, or take other remedial action in response to, insufficiently reliable code-generating components.
    Type: Application
    Filed: September 23, 2011
    Publication date: March 29, 2012
    Inventors: Detlef Koll, Thomas Polzin
  • Publication number: 20120041950
    Abstract: A computer-based system includes a computer-processable definition of a region in a data set. The system identifies a region of the data set based on the definition of the region. The system provides output to a user representing a question and the identified region of the data set. The system may also automatically generate an answer to the question based on the question and the data set, and provide output to the user representing the answer. The system may generate the answer based on a subset of the data set, and provide output to the user representing the subset of the data set. The user may provide feedback on the first answer to the system, which the system may use to improve subsequent answers to the same and other questions, and to disable the system's automatic question-answering function in response to disagreement between the user and the system.
    Type: Application
    Filed: February 10, 2011
    Publication date: February 16, 2012
    Inventors: Detlef Koll, Thomas Polzin
  • Patent number: 8086458
    Abstract: Techniques are disclosed for automatically de-identifying spoken audio signals. In particular, techniques are disclosed for automatically removing personally identifying information from spoken audio signals and replacing such information with non-personally identifying information. De-identification of a spoken audio signal may be performed by automatically generating a report based on the spoken audio signal. The report may include concept content (e.g., text) corresponding to one or more concepts represented by the spoken audio signal. The report may also include timestamps indicating temporal positions of speech in the spoken audio signal that corresponds to the concept content. Concept content that represents personally identifying information is identified. Audio corresponding to the personally identifying concept content is removed from the spoken audio signal. The removed audio may be replaced with non-personally identifying audio.
    Type: Grant
    Filed: October 24, 2008
    Date of Patent: December 27, 2011
    Assignee: Multimodal Technologies, LLC
    Inventors: Michael Finke, Detlef Koll
  • Publication number: 20110288857
    Abstract: A speech recognition client sends a speech stream and control stream in parallel to a server-side speech recognizer over a network. The network may be an unreliable, low-latency network. The server-side speech recognizer recognizes the speech stream continuously. The speech recognition client receives recognition results from the server-side recognizer in response to requests from the client. The client may remotely reconfigure the state of the server-side recognizer during recognition.
    Type: Application
    Filed: August 2, 2011
    Publication date: November 24, 2011
    Inventors: Eric Carraux, Detlef Koll
  • Publication number: 20110289405
    Abstract: A human editor uses a document editing system to edit a draft document. The editor's editing behavior is monitored and logged. Statistics are developed from the log to produce an assessment of the editor's productivity. This assessment, in combination with assessments of other editors, may be used to develop behavioral metrics which indicate correlations between editing behaviors and productivity. The behavioral metrics may be used to identify the relative contribution of different editing behaviors to efficient editing. Such information about individual editing behaviors may be used to evaluate the productivity of individual editors based on their editing behaviors, to identify behaviors which individual editors could adopt to improve their productivities, and to identify changes to the editing system itself for improving editor productivity. An editor's editing behavior may be “played back” and observed by a human in an attempt to identify the causes of the editor's poor productivity.
    Type: Application
    Filed: August 2, 2011
    Publication date: November 24, 2011
    Inventors: Juergen Fritsch, Detlef Koll, Kjell Schubert, Christopher M. Currivan
  • Publication number: 20110282687
    Abstract: An automated system updates electronic medical records (EMRs) based on dictated reports, without requiring manual data entry into on-screen forms. A dictated report is transcribed by an automatic speech recognizer, and facts are extracted from the report and stored in encoded form. Information from a patient's report is also stored in encoded form. The resulting encoded information from the report and EMR are reconciled with each other, and changes to be made to the EMR are identified based on the reconciliation. The identified changes are made to the EMR automatically, without requiring manual data entry into the EMR.
    Type: Application
    Filed: February 28, 2011
    Publication date: November 17, 2011
    Inventor: Detlef Koll
  • Publication number: 20110238415
    Abstract: A hybrid speech recognition system uses a client-side speech recognition engine and a server-side speech recognition engine to produce speech recognition results for the same speech. An arbitration engine produces speech recognition output based on one or both of the client-side and server-side speech recognition results.
    Type: Application
    Filed: September 24, 2010
    Publication date: September 29, 2011
    Inventor: Detlef Koll
  • Patent number: 8019608
    Abstract: A speech recognition client sends a speech stream and control stream in parallel to a server-side speech recognizer over a network. The network may be an unreliable, low-latency network. The server-side speech recognizer recognizes the speech stream continuously. The speech recognition client receives recognition results from the server-side recognizer in response to requests from the client. The client may remotely reconfigure the state of the server-side recognizer during recognition if a first speech recognition result satisfies a predetermined criterion specified by the control stream.
    Type: Grant
    Filed: August 30, 2009
    Date of Patent: September 13, 2011
    Assignee: Multimodal Technologies, Inc.
    Inventors: Eric Carraux, Detlef Koll
  • Patent number: 7933777
    Abstract: A hybrid speech recognition system uses a client-side speech recognition engine and a server-side speech recognition engine to produce speech recognition results for the same speech. An arbitration engine produces speech recognition output based on one or both of the client-side and server-side speech recognition results.
    Type: Grant
    Filed: August 30, 2009
    Date of Patent: April 26, 2011
    Assignee: Multimodal Technologies, Inc.
    Inventor: Detlef Koll
  • Patent number: 7869996
    Abstract: A speech processing system divides a spoken audio stream into partial audio streams, referred to as “snippets.” The system may divide a portion of the audio stream into two snippets at a position at which the speaker performed an editing operation, such as pausing and then resuming recording, or rewinding and then resuming recording. The snippets may be transmitted sequentially to a consumer, such as an automatic speech recognizer or a playback device, as the snippets are generated. The consumer may process (e.g., recognize or play back) the snippets as they are received. The consumer may modify its output in response to editing operations reflected in the snippets. The consumer may process the audio stream while it is being created and transmitted even if the audio stream includes editing operations that invalidate previously-transmitted partial audio streams, thereby enabling shorter turnaround time between dictation and consumption of the complete audio stream.
    Type: Grant
    Filed: November 23, 2007
    Date of Patent: January 11, 2011
    Assignee: Multimodal Technologies, Inc.
    Inventors: Eric Carraux, Detlef Koll
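
The snippet handling described in the abstract for patent 7869996 can be illustrated with a minimal sketch. This is not the patented implementation; it only shows one way a consumer might process snippets as they arrive and revise its output when a later snippet reflects a rewind. The names `Snippet`, `SnippetConsumer`, and `consume`, the use of frame indices as positions, and the use of strings as stand-ins for audio frames are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Snippet:
    """A partial audio stream. All fields are hypothetical, for illustration."""
    start: int         # frame index in the logical recording where the snippet begins
    frames: List[str]  # stand-in for audio frames (one string per frame)


@dataclass
class SnippetConsumer:
    """Consumes snippets as they arrive and keeps the current effective stream."""
    frames: List[str] = field(default_factory=list)

    def receive(self, snippet: Snippet) -> None:
        if snippet.start < 0 or snippet.start > len(self.frames):
            raise ValueError("snippet start lies outside the current stream")
        # A start earlier than the current end means the speaker rewound:
        # everything from that position onward is invalidated and discarded.
        # A start equal to the current end (pause then resume) discards nothing.
        del self.frames[snippet.start:]
        self.frames.extend(snippet.frames)


def consume(snippets: List[Snippet]) -> List[str]:
    """Feed snippets to a consumer in arrival order and return the final stream."""
    consumer = SnippetConsumer()
    for snippet in snippets:
        consumer.receive(snippet)
    return consumer.frames


if __name__ == "__main__":
    dictation = [
        Snippet(0, ["the", "patient", "has"]),
        Snippet(3, ["a", "fever"]),         # pause, then resume at the end
        Snippet(2, ["had", "a", "cough"]),  # rewind to frame 2, then resume
    ]
    print(consume(dictation))
```

In this sketch the consumer never waits for the complete recording: each snippet is applied on arrival, and a rewind simply truncates the previously received frames before the new ones are appended.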