Speech To Text Systems (epo) Patents (Class 704/E15.043)
  • Publication number: 20120035907
    Abstract: A method, performed on a server, of translating between languages includes receiving first audio data for a first language from a mobile device, translating the first audio data to second audio data for a second language, receiving an indication that the mobile device has moved between two locations, and sending the second audio data to the mobile device in response to the indication.
    Type: Application
    Filed: August 5, 2010
    Publication date: February 9, 2012
    Inventors: Michael J. Lebeau, John Nicholas Jitkoff
  • Publication number: 20120035923
    Abstract: The disclosed invention provides a system and apparatus for providing a telematics system user with an improved texting experience. A messaging experience engine database enables voice avatar/personality selection, acronym conversion, shorthand conversion, and custom audio and video mapping. As an interpreter of the messaging content that is passed through the telematics system, the system eliminates the need for a user to manually manipulate a texting device, or to read such a device. The system recognizes functional content and executes actions based on the identified functional content.
    Type: Application
    Filed: August 9, 2010
    Publication date: February 9, 2012
    Applicant: General Motors LLC
    Inventor: Kevin R. Krause
  • Publication number: 20120029918
    Abstract: Systems for recording, searching for, and sharing media files among a plurality of users are disclosed. The systems include a server that is configured to receive, index, and store a plurality of media files, which are received by the server from a plurality of sources, within at least one database in communication with the server. In addition, the server is configured to make one or more of the media files accessible to one or more persons—other than the original sources of such media files. Still further, the server is configured to transcribe the media files into text; receive and publish comments associated with the media files within a graphical user interface of a website; and allow users to query and playback excerpted portions of such media files.
    Type: Application
    Filed: October 11, 2011
    Publication date: February 2, 2012
    Inventor: Walter Bachtiger
  • Publication number: 20120029917
    Abstract: A system that incorporates teachings of the present disclosure may include, for example, a server including a controller to receive audio signals and content identification information from a media processor, generate text representing a voice message based on the audio signals, determine an identity of media content based on the content identification information, generate an enhanced message having text and additional content where the additional content is obtained by the controller based on the identity of the media content, and transmit the enhanced message to the media processor for presentation on the display device, where the enhanced message is accessible by one or more communication devices that are associated with a social network and remote from the media processor. Other embodiments are disclosed.
    Type: Application
    Filed: August 2, 2010
    Publication date: February 2, 2012
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: HISAO CHANG, BERNARD S. RENGER
  • Publication number: 20120022868
    Abstract: The subject matter of this specification can be implemented in, among other things, a computer-implemented method for correcting words in transcribed text including receiving speech audio data from a microphone. The method further includes sending the speech audio data to a transcription system. The method further includes receiving a word lattice transcribed from the speech audio data by the transcription system. The method further includes presenting one or more transcribed words from the word lattice. The method further includes receiving a user selection of at least one of the presented transcribed words. The method further includes presenting one or more alternate words from the word lattice for the selected transcribed word. The method further includes receiving a user selection of at least one of the alternate words. The method further includes replacing the selected transcribed word in the presented transcribed words with the selected alternate word.
    Type: Application
    Filed: September 30, 2011
    Publication date: January 26, 2012
    Inventors: Michael J. LeBeau, William J. Byrne, John Nicholas Jitkoff, Brandon M. Ballinger, Trausti Kristjansson
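The correction flow described in this abstract can be sketched as follows. The lattice representation and function names are illustrative assumptions, not the patent's actual data model.

```python
# Hypothetical word lattice: each position in the transcript maps to a
# ranked list of word hypotheses from the transcription system.
word_lattice = {
    0: ["recognize", "wreck a nice"],
    1: ["speech", "beach"],
}

def best_transcription(lattice):
    """Present the top hypothesis at each position."""
    return [hyps[0] for _, hyps in sorted(lattice.items())]

def alternates(lattice, position):
    """Alternate words the user can pick for a selected word."""
    return lattice[position][1:]

def replace_word(words, position, alternate):
    """Swap the selected transcribed word for the chosen alternate."""
    corrected = list(words)
    corrected[position] = alternate
    return corrected

words = best_transcription(word_lattice)
fixed = replace_word(words, 1, alternates(word_lattice, 1)[0])
```

Keeping the full lattice (rather than only the one-best string) is what makes single-tap correction possible: the alternates are already ranked and aligned to each word position.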
  • Publication number: 20120022865
    Abstract: A system and method for efficiently reducing transcription error using hybrid voice transcription is provided. A voice stream is parsed from a call into utterances. An initial transcribed value and corresponding recognition score are assigned to each utterance. A transcribed message is generated for the call and includes the initial transcribed values. A threshold is applied to the recognition scores to identify those utterances with recognition scores below the threshold as questionable utterances. At least one questionable utterance is compared to other questionable utterances from other calls and a group of similar questionable utterances is formed. One or more of the similar questionable utterances is selected from the group. A common manual transcription value is received for the selected similar questionable utterances. The common manual transcription value is assigned to the remaining similar questionable utterances in the group.
    Type: Application
    Filed: July 20, 2010
    Publication date: January 26, 2012
    Inventor: David Milstein
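The triage step in this abstract can be sketched as below; the threshold value, score scale, and grouping-by-identical-text rule are assumptions for illustration.

```python
# Utterances with tentative transcriptions and recognition scores.
utterances = [
    {"id": 1, "text": "account balance", "score": 0.92},
    {"id": 2, "text": "acount ballance", "score": 0.41},
    {"id": 3, "text": "acount ballance", "score": 0.38},
]

THRESHOLD = 0.6  # assumed cutoff separating confident from questionable

def questionable(utts, threshold=THRESHOLD):
    """Utterances whose recognition score falls below the threshold."""
    return [u for u in utts if u["score"] < threshold]

def group_similar(utts):
    """Group questionable utterances by their tentative transcription."""
    groups = {}
    for u in utts:
        groups.setdefault(u["text"], []).append(u)
    return groups

def apply_manual_value(group, manual_text):
    """One manual transcription is propagated to the whole group."""
    for u in group:
        u["text"] = manual_text
    return group

groups = group_similar(questionable(utterances))
apply_manual_value(groups["acount ballance"], "account balance")
```

The point of grouping is cost reduction: a human transcribes one representative utterance, and the value is reused for every similar questionable utterance across calls.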
  • Publication number: 20120022867
    Abstract: Methods, computer program products and systems are described for speech-to-text conversion. A voice input is received from a user of an electronic device and contextual metadata is received that describes a context of the electronic device at a time when the voice input is received. Multiple base language models are identified, where each base language model corresponds to a distinct textual corpus of content. Using the contextual metadata, an interpolated language model is generated based on contributions from the base language models. The contributions are weighted according to a weighting for each of the base language models. The interpolated language model is used to convert the received voice input to a textual output. The voice input is received at a computer server system that is remote to the electronic device. The textual output is transmitted to the electronic device.
    Type: Application
    Filed: September 29, 2011
    Publication date: January 26, 2012
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen, Michael D. Riley
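The interpolation step can be sketched as a weighted mixture of base-model probabilities. The unigram model form and the concrete weights are illustrative assumptions; real systems interpolate full n-gram models.

```python
# Two base language models, each trained on a distinct corpus of content.
base_models = {
    "email":  {"meeting": 0.30, "pizza": 0.05},
    "search": {"meeting": 0.10, "pizza": 0.25},
}

def interpolate(models, weights):
    """P(w) = sum_i w_i * P_i(w), with the weights summing to 1."""
    vocab = set().union(*(m.keys() for m in models.values()))
    return {
        w: sum(weights[name] * models[name].get(w, 0.0) for name in models)
        for w in vocab
    }

# Contextual metadata (e.g. "the user is composing an email") determines
# how heavily each base model contributes.
weights = {"email": 0.8, "search": 0.2}
lm = interpolate(base_models, weights)
```

With these weights, words typical of email dictation receive higher probability than they would under the search-query model alone.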
  • Publication number: 20120022853
    Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
    Type: Application
    Filed: September 29, 2011
    Publication date: January 26, 2012
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson, Michael J. LeBeau
  • Publication number: 20120022866
    Abstract: Methods, computer program products and systems are described for converting speech to text. Sound information is received at a computer server system from an electronic device, where the sound information is from a user of the electronic device. A context identifier indicates a context within which the user provided the sound information. The context identifier is used to select, from among multiple language models, a language model appropriate for the context. Speech in the sound information is converted to text using the selected language model. The text is provided for use by the electronic device.
    Type: Application
    Filed: September 29, 2011
    Publication date: January 26, 2012
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen
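In contrast to interpolation, this variant picks a single model per context. A minimal sketch, assuming a simple identifier-to-model mapping and a fallback default (both assumptions):

```python
# Hypothetical mapping from context identifiers to language models.
language_models = {
    "voice_search": "lm-search",
    "dictation": "lm-dictation",
}
DEFAULT_MODEL = "lm-general"

def select_model(context_id):
    """Choose the language model appropriate for the reported context."""
    return language_models.get(context_id, DEFAULT_MODEL)
```

The server then decodes the received sound with `select_model(context_id)` rather than a one-size-fits-all model.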
  • Publication number: 20120016671
    Abstract: A system and methods for transcribing text from audio and video files including a set of transcription hosts and an automatic speech recognition system. ASR word-lattices are dynamically selected from either a text box or word-lattice graph wherein the most probable text sequences are presented to the transcriptionist. Secure transcriptions may be accomplished by segmenting a digital audio file into a set of audio slices for transcription by a plurality of transcriptionists. No one transcriptionist is aware of the final transcribed text, only small portions of transcribed text. Secure and high quality transcriptions may be accomplished by segmenting a digital audio file into a set of audio slices, sending them serially to a set of transcriptionists and updating the acoustic and language models at each step to improve the word-lattice accuracy.
    Type: Application
    Filed: July 15, 2010
    Publication date: January 19, 2012
    Inventors: Pawan Jaggi, Abhijeet Sangwan
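The security property here comes from segmentation plus distribution. A minimal sketch, treating audio as a list of samples and using round-robin assignment (the slice length and assignment policy are assumptions):

```python
def slice_audio(samples, slice_len):
    """Segment a digital audio file into fixed-length slices."""
    return [samples[i:i + slice_len] for i in range(0, len(samples), slice_len)]

def assign_slices(slices, transcriptionists):
    """Round-robin assignment: each worker sees only disjoint fragments,
    so no single transcriptionist can reconstruct the full text."""
    assignments = {t: [] for t in transcriptionists}
    for i, s in enumerate(slices):
        assignments[transcriptionists[i % len(transcriptionists)]].append(s)
    return assignments

slices = slice_audio(list(range(10)), 3)
work = assign_slices(slices, ["alice", "bob"])
```

For the serial variant in the abstract, the slices would instead be sent one at a time, with model updates between steps.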
  • Publication number: 20120010876
    Abstract: A voice integration platform and method provide for integration of a voice interface with a data system that includes stored data. The voice integration platform comprises one or more generic software components, the generic software components being configured to enable development of a specific voice user interface that is designed to interact with the data system in order to present the stored data to a user.
    Type: Application
    Filed: September 22, 2011
    Publication date: January 12, 2012
    Applicant: Ben Franklin Patent Holding LLC
    Inventors: Andrew G. Smolenski, Steven Markman, Pericles Haleftiras, Jon Thomas Layton, Lizanne Kaiser, Gregory S. Kluthe, Michael W. Achenbach
  • Publication number: 20120010883
    Abstract: A computer program product, for performing data determination from medical record transcriptions, resides on a computer-readable medium and includes computer-readable instructions for causing a computer to obtain a medical transcription of a dictation, the dictation being from medical personnel and concerning a patient, analyze the transcription for an indicating phrase associated with a type of data desired to be determined from the transcription, the type of desired data being relevant to medical records, determine whether data indicated by text disposed proximately to the indicating phrase is of the desired type, and store an indication of the data if the data is of the desired type.
    Type: Application
    Filed: February 8, 2011
    Publication date: January 12, 2012
    Applicant: eScription, Inc.
    Inventors: Roger S. Zimmerman, Paul Egerman, George Zavaliagkos
  • Publication number: 20120004910
    Abstract: Systems and methods for processing speech from a user are disclosed. In the system of the present invention, the user's speech is received as an input audio stream. The input audio stream is converted to text that corresponds to the input audio stream. The converted text is converted to an echo audio stream. Then, the echo audio stream is sent to the user. This process is performed in real time. Accordingly, the user is able to determine whether or not the speech-to-text conversion was correct, i.e., that his or her speech was correctly converted to text. If the conversion was incorrect, the user is able to correct the conversion process by using editing commands. The corresponding text is then analyzed to determine the operation which it demands. Then, the operation is performed on the corresponding text.
    Type: Application
    Filed: November 24, 2009
    Publication date: January 5, 2012
    Inventors: Romulo De Guzman Quidilig, Kenneth Nakagawa, Michiyo Manning
  • Publication number: 20120004911
    Abstract: A system for identification of video content in a video signal is provided via a sound track audio signal. The audio signal is processed with filtering and non-linear transformations to extract voice signals from the sound track channel. The extracted voice signals are coupled to a speech recognition system to provide, in text form, the words of the video content, which is later compared with a reference library of words or dialog from known video programs or movies. Other attributes of the video signal or transport stream may be combined with closed caption data or closed caption text for identification purposes. Example attributes include DVS/SAP information, time code information, histograms, and/or rendered video or pictures.
    Type: Application
    Filed: June 30, 2010
    Publication date: January 5, 2012
    Inventor: Ronald Quan
  • Publication number: 20120005701
    Abstract: A system for identification of video content in a video signal is provided via a sound track audio signal. The audio signal is processed with filtering, frequency translation, and/or non-linear transformations to extract voice signals from the sound track channel. The extracted voice signals are coupled to a speech recognition system to provide, in text form, the words of the video content, which is later compared with a reference library of words or dialog from known video programs or movies. Other attributes of the video signal or transport stream may be combined with closed caption data or closed caption text for identification purposes. Example attributes include DVS/SAP information, time code information, histograms, and/or rendered video or pictures.
    Type: Application
    Filed: June 30, 2010
    Publication date: January 5, 2012
    Inventor: Ronald Quan
  • Publication number: 20110321008
    Abstract: GUI form code comprising a set of GUI elements can be imported. A user interface description can be generated from the GUI form code that has an element corresponding to each GUI element. For each user interface element converted from a corresponding GUI element, a user interface element type can be determined, as can temporal associations between the user interface elements. Conversation User Interface (CUI) code corresponding to the GUI form code can be created from the user interface description. When creating the CUI code for each of the user interface elements, different rules to convert the user interface element into CUI code can be used depending on the user interface element type of the element being converted. When creating the CUI code, the user interface elements can be temporally ordered based on the pre-determined spatio-temporal associations between the graphical user interface (GUI) elements.
    Type: Application
    Filed: June 28, 2010
    Publication date: December 29, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: ALBEE JHONEY, PRAVEEN K. VAIDYANATHAN
  • Publication number: 20110320199
    Abstract: According to one embodiment, an apparatus for fusing voiced phoneme units in Text-To-Speech includes a reference unit selection module configured to select a reference unit from the plurality of units based on pitch cycle information of each unit and the number of pitch cycles of the target segment. The apparatus includes a template creation module configured to create a template based on the reference unit selected by the reference unit selection module and the number of pitch cycles of the target segment, wherein the number of pitch cycles of the template is the same as that of the target segment. The apparatus includes a pitch cycle alignment module configured to align the pitch cycles of each unit of the plurality of units, except the reference unit, with the pitch cycles of the template by using a dynamic programming algorithm.
    Type: Application
    Filed: July 15, 2011
    Publication date: December 29, 2011
    Inventors: Jian Luan, Jian Li
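The dynamic-programming alignment in this abstract resembles a minimal dynamic time warping. The sketch below aligns one unit's pitch-cycle sequence to the template's; representing cycles as scalar values and using absolute difference as the local cost are illustrative assumptions, since the patent does not state its cost function here.

```python
def align(unit, template):
    """Minimal DTW: cumulative cost of warping `unit`'s pitch cycles
    onto `template`'s pitch cycles."""
    n, m = len(unit), len(template)
    INF = float("inf")
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(unit[i - 1] - template[j - 1])  # local mismatch cost
            # extend the cheapest of: skip in unit, skip in template, match
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]
```

A perfectly matching unit aligns at zero cost; units with extra or missing pitch cycles absorb the mismatch through the insertion/deletion moves.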
  • Publication number: 20110320198
    Abstract: One or more embodiments present a script to a user in an interactive script environment. A digital representation of a manuscript is analyzed. This digital representation includes a set of roles and a set of information associated with each role in the set of roles. An active role in the set of roles that is associated with a given user is identified based on the analyzing. At least a portion of the manuscript is presented to the given user via a user interface. The portion includes at least a subset of information in the set of information. Information within the set of information that is associated with the active role is presented in a visually different manner than information within the set of information that is associated with a non-active role, which is a role that is associated with a user other than the given user.
    Type: Application
    Filed: June 27, 2011
    Publication date: December 29, 2011
    Inventor: Randall Lee THREEWITS
  • Publication number: 20110313764
    Abstract: A system and method is provided for reducing latency for automatic speech recognition. In one embodiment, intermediate results produced by multiple search passes are used to update a display of transcribed text.
    Type: Application
    Filed: August 27, 2011
    Publication date: December 22, 2011
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Michiel Adriaan Unico Bacchiani, Brian Scott Amento
  • Publication number: 20110305326
    Abstract: This invention involves additional details and uses for the invention described in U.S. Pat. No. 7,047,192, Simultaneous Multi-User Real-Time Speech Recognition System, filed by Poirier. The patent granted to Poirier teaches a platform based on audio events on which larger applications can be built to solve problems of capturing and transcribing human conversations. U.S. Pat. No. 7,047,192 also explains and teaches that indexing, cataloging, editing, and searching audio is possible using a browser to find specific content within text which is then directly linked with the related audio event. More specifically, it describes how this patent can be used as a building-block approach to provide functionality for real-time automatic speech recognition systems that are scalable from a single user to hundreds and potentially thousands of users having conversations.
    Type: Application
    Filed: March 20, 2011
    Publication date: December 15, 2011
    Inventors: JAMEY POIRIER, MARK HANEGRAAFF, DARRELL POIRIER
  • Publication number: 20110307254
    Abstract: A system and method of speech recognition involving a mobile device. Speech input is received (202) on a mobile device (102) and converted (204) to a set of phonetic symbols. Data relating to the phonetic symbols is transferred (206) from the mobile device over a communications network (104) to a remote processing device (106) where it is used (208) to identify at least one matching data item from a set of data items (114). Data relating to the at least one matching data item is transferred (210) from the remote processing device to the mobile device and presented (214) thereon.
    Type: Application
    Filed: December 10, 2009
    Publication date: December 15, 2011
    Inventors: Melvyn Hunt, John Bridle
  • Publication number: 20110300833
    Abstract: A visual voicemail system can convert visual voicemail message content to an alternate format based on the location of the recipient device, whether and how the recipient device is in motion, a priority of the message content, user preferences, or other criteria. Alternately, a recipient wireless device may also convert content to an alternate format based on similar criteria. Content may be presented automatically to a user on the recipient device based on such criteria. Content may be converted from audio to text, text to audio, or from any format to any other format. Location, motion data, user preferences, etc. may be obtained from a location-based service system, a recipient wireless device, or any other source.
    Type: Application
    Filed: June 8, 2010
    Publication date: December 8, 2011
    Applicant: AT&T MOBILITY II LLC
    Inventor: Venson M. Shaw
  • Publication number: 20110301952
    Abstract: The present invention provides a speech recognition processing system in which speech recognition processing is executed in parallel by plural speech recognizing units. Before text data as the speech recognition result is output from each of the speech recognizing units, information indicating each speaker is displayed in parallel on a display in the order in which each utterance was spoken. When the text data is output from each of the speech recognizing units, the text data is associated with the information indicating each speaker and the text data is displayed.
    Type: Application
    Filed: March 25, 2010
    Publication date: December 8, 2011
    Applicant: NEC CORPORATION
    Inventors: Takafumi Koshinaka, Masahiko Hamanaka
  • Publication number: 20110295458
    Abstract: An exemplary method includes a vehicle operation safety system detecting an operating parameter of a vehicle, detecting that a mobile access device is located within a predefined vicinity of the vehicle, and disabling one or more features of the mobile access device in response to the detecting of the operating parameter of the vehicle and the detecting that the mobile access device is located within the predefined vicinity of the vehicle. Corresponding methods and systems are also disclosed.
    Type: Application
    Filed: May 28, 2010
    Publication date: December 1, 2011
    Applicant: VERIZON VIRGINIA
    Inventor: Nicole Ma Ellen Halsey-Fenderson
  • Publication number: 20110295623
    Abstract: The present invention relates to a workers compensation related insurance complaint management, processing and reporting system. According to some embodiments, an insurance complaint is received at a conversion system for processing. The received information may be converted to a specified format to populate an interactive web form. The web form may be augmented by various third party data sources and information may be additionally pre-filled through the conversion process. Automated complaint management reminders are generated by the system.
    Type: Application
    Filed: May 26, 2010
    Publication date: December 1, 2011
    Applicant: Hartford Fire Insurance Company
    Inventors: Jill Rich Behringer, Marlene G. Chickerella, JulieeAnn McCollum
  • Publication number: 20110288863
    Abstract: Voice stream augmented note taking may be provided. An audio stream associated with at least one speaker may be recorded and converted into text chunks. A text entry may be received from a user, such as in an electronic document. The text entry may be compared to the text chunks to identify matches, and the matching text chunks may be displayed to the user for selection.
    Type: Application
    Filed: May 21, 2010
    Publication date: November 24, 2011
    Applicant: Microsoft Corporation
    Inventor: David John Rasmussen
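The matching step in this abstract can be sketched as below. The all-words-present matching rule is an assumption; the patent does not specify how a text entry is compared to the chunks.

```python
# Text chunks converted from the recorded audio stream.
text_chunks = [
    "the quarterly budget is due friday",
    "action item send the slides to dana",
    "budget review moved to thursday",
]

def matching_chunks(entry, chunks):
    """Suggest chunks that contain every word the user has typed so far."""
    words = entry.lower().split()
    return [c for c in chunks if all(w in c for w in words)]

suggestions = matching_chunks("budget", text_chunks)
```

As the user keeps typing, the word list grows and the suggestion set narrows, which is what makes the matches useful for accelerating note taking.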
  • Publication number: 20110288862
    Abstract: Methods and systems for performing audio synchronization with a corresponding textual transcription and determining confidence values of the timing synchronization are provided. Audio and a corresponding text (e.g., transcript) may be synchronized in a forward and reverse direction using speech recognition to output time-annotated audio-lyrics synchronized data. Metrics can be computed to quantify and/or qualify a confidence of the synchronization. Based on the metrics, example embodiments describe methods for enhancing an automated synchronization process by adapting Hidden Markov Models (HMMs) to the synchronized audio for use during the speech recognition. Other examples describe methods for selecting an appropriate HMM for use.
    Type: Application
    Filed: May 18, 2010
    Publication date: November 24, 2011
    Inventor: Ognjen Todic
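One way to turn the forward/reverse dual alignment into a confidence metric is to measure how often the two passes agree on a word's timestamp. The agreement-fraction metric and the tolerance value below are assumptions for illustration, not the patent's specific metric.

```python
def sync_confidence(forward_times, reverse_times, tolerance=0.25):
    """Fraction of words whose forward-pass and reverse-pass timestamps
    (in seconds) agree to within `tolerance`."""
    agree = sum(
        1 for f, r in zip(forward_times, reverse_times)
        if abs(f - r) <= tolerance
    )
    return agree / len(forward_times)

# Word start times from the forward and reverse alignment passes.
conf = sync_confidence([0.0, 1.2, 2.5], [0.1, 1.1, 3.4])
```

A low score flags regions where the alignment is unreliable, which is where model adaptation or a different HMM would be tried.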
  • Publication number: 20110282664
    Abstract: Methods and systems for providing services and/or computing resources are provided. A method may include converting voice data into text data and tagging at least one portion of the text data in the text conversion file with at least one tag, the at least one tag indicating that the at least one portion of the text data includes a particular type of data. The method may also include displaying the text data on a display such that the at least one portion of text data is displayed with at least one associated graphical element indicating that the at least one portion of text data is associated with the at least one tag. The at least one portion of text data may be a selectable item on the display allowing a user interfacing with the display to select the at least one portion of text data in order to apply the at least one portion of text data to an application.
    Type: Application
    Filed: May 14, 2010
    Publication date: November 17, 2011
    Applicant: FUJITSU LIMITED
    Inventors: Hideaki Tanioka, Daisuke Ito, Hidenobu Ito
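The tagging step in this abstract can be sketched with pattern matching over the converted text. The data types and regular expressions below are illustrative assumptions; the patent does not specify how portions are recognized.

```python
import re

# Hypothetical patterns for particular types of data in the text.
TAG_PATTERNS = {
    "phone": re.compile(r"\b\d{3}-\d{4}\b"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}

def tag_text(text):
    """Return (matched_text, tag) pairs for recognized data types."""
    tags = []
    for tag, pattern in TAG_PATTERNS.items():
        for m in pattern.finditer(text):
            tags.append((m.group(), tag))
    return tags

tags = tag_text("call 555-1234 before 2012-01-26")
```

Each tagged span would then be rendered as a selectable item, so the user can hand, say, the phone number directly to a dialer application.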
  • Publication number: 20110282687
    Abstract: An automated system updates electronic medical records (EMRs) based on dictated reports, without requiring manual data entry into on-screen forms. A dictated report is transcribed by an automatic speech recognizer, and facts are extracted from the report and stored in encoded form. Information from a patient's report is also stored in encoded form. The resulting encoded information from the report and EMR are reconciled with each other, and changes to be made to the EMR are identified based on the reconciliation. The identified changes are made to the EMR automatically, without requiring manual data entry into the EMR.
    Type: Application
    Filed: February 28, 2011
    Publication date: November 17, 2011
    Inventor: Detlef Koll
  • Publication number: 20110282522
    Abstract: A system for converting audible air traffic control instructions for pilots operating from an air facility to textual format. The system may comprise a processor connected to a jack of the standard pilot headset and a separate portable display screen connected to the processor. The processor may have a language converting functionality which can recognize traffic control nomenclature and display messages accordingly. Displayed text may be limited to information intended for a specific aircraft. The display may show hazardous discrepancies between authorized altitudes and headings and actual altitudes and headings. The display may be capable of correction by the user, and may utilize Global Positioning System (GPS) to obtain appropriate corrections. The system may date and time stamp communications and hold the same in memory. The system may have computer style user functions such as scrollability and operating prompts.
    Type: Application
    Filed: May 13, 2010
    Publication date: November 17, 2011
    Inventors: Robert S. Prus, Konrad Robert Sliwowski
  • Publication number: 20110276327
    Abstract: A method including receiving a vocal input including words spoken by a user; determining vocal characteristics associated with the vocal input; mapping the vocal characteristics to textual characteristics; and generating voice-to-expressive text that includes, in addition to text corresponding to the words spoken by the user, a textual representation of the vocal characteristics based on the mapping.
    Type: Application
    Filed: May 28, 2010
    Publication date: November 10, 2011
    Applicant: SONY ERICSSON MOBILE COMMUNICATIONS AB
    Inventor: Eral Foxenland
  • Publication number: 20110276328
    Abstract: An application server for reducing ambiance noise in an auscultation signal, and for recording comments while auscultating a patient with an electronic stethoscope. This application server (AS) comprises: means (SPH) for receiving samples of a raw auscultation signal representing auscultation sounds mixed with ambiance sounds, this raw auscultation signal being transmitted by a first microphone (M1) embedded in a stethoscope (ES); means (SPH) for receiving samples of an ambiance signal transmitted by a second microphone (M2) in a phone (IPP1); means (ASE) for processing the samples of the auscultation signal and the samples of the ambiance signal to generate an auscultation signal without ambiance sounds; means (LBM) for sending the auscultation signal without ambiance sounds to at least the headset of said stethoscope (ES); and means (VRM) for recognizing vocal sounds in the ambiance signal and converting these vocal sounds into text for storing comments in a database (DB).
    Type: Application
    Filed: July 10, 2009
    Publication date: November 10, 2011
    Inventors: Raymond Gass, Michel Le Creff
  • Publication number: 20110276326
    Abstract: A method and system for operational improvements in a dispatch console in a multi-source environment includes receiving (310) a plurality of audio streams simultaneously from a plurality of mobile devices, transcribing received audio streams by means of speech-to-text conversion, presenting real-time transcriptions to the user, and determining (320) if a first keyword is present in at least one of the plurality of audio and/or text streams. Upon determining the presence of the first keyword, the dispatch console automatically performs (330) at least one predefined dispatch console operation from a list of predefined dispatch console operations. The dispatch console further receives (340) a second keyword based on determining the presence of the first keyword and checks (350) for the presence of the second keyword within the audio and/or text streams, thereby enabling additional automated dispatch console operations.
    Type: Application
    Filed: May 6, 2010
    Publication date: November 10, 2011
    Applicant: Motorola, Inc.
    Inventors: Arthur L. Fumarolo, Mark Shahaf
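The two-stage keyword trigger described above can be sketched as follows. The keywords, the mapped console operations, and the token-based matching are illustrative assumptions.

```python
# Hypothetical keyword → console-operation mappings.
PRIMARY_KEYWORDS = {"emergency": "open_incident_panel"}
SECONDARY_KEYWORDS = {"fire": "dispatch_fire_unit"}

def scan(transcript, keywords):
    """Return console operations for keywords present in the transcript."""
    tokens = set(transcript.lower().split())
    return [action for kw, action in keywords.items() if kw in tokens]

def process_stream(transcript):
    """Second-stage keywords are only checked after a primary-keyword hit."""
    actions = scan(transcript, PRIMARY_KEYWORDS)
    if actions:
        actions += scan(transcript, SECONDARY_KEYWORDS)
    return actions

result = process_stream("Emergency reported possible fire on main street")
```

Gating the second scan on the first keyword keeps routine traffic (which may still contain words like "fire" in benign contexts) from triggering dispatch operations.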
  • Publication number: 20110276325
    Abstract: According to certain embodiments, training a transcription system includes accessing recorded voice data of a user from one or more sources. The recorded voice data comprises voice samples. A transcript of the recorded voice data is accessed. The transcript comprises text representing one or more words of each voice sample. The transcript and the recorded voice data are provided to a transcription system to generate a voice profile for the user. The voice profile comprises information used to convert a voice sample to corresponding text.
    Type: Application
    Filed: May 5, 2010
    Publication date: November 10, 2011
    Applicant: Cisco Technology, Inc.
    Inventors: Todd C. Tatum, Michael A. Ramalho, Paul M. Dunn, Shantanu Sarkar, Tyrone T. Thorsen, Alan D. Gatzke
  • Publication number: 20110267419
    Abstract: Techniques for recording and replay of a live conference while still attending the live conference are described. A conferencing system includes a user interface generator, a live conference processing module, and a replay processing module. The user interface generator is configured to generate a user interface that includes a replay control panel and one or more output panels. The live conference processing module is configured to extract information included in received conferencing data that is associated with one or more conferencing modalities, and to display the information in the one or more output panels in a live manner (e.g., as a live conference). The replay processing module is configured to enable information associated with the one or more conferencing modalities corresponding to a time in the conference session prior to the live point to be presented at a desired rate, possibly different from the real-time rate, if a replay mode is selected in the replay control panel.
    Type: Application
    Filed: April 30, 2010
    Publication date: November 3, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Kori Inkpen Quinn, Rajesh Hegde, Zhengyou Zhang, John Tang, Sasa Junuzovic, Christopher Brooks
  • Publication number: 20110271194
    Abstract: This specification describes technologies relating to content presentation. In general, one aspect of the subject matter described in this specification can be embodied in methods that include the actions of presenting a content item to a user; receiving a user input indicating a voice interaction; receiving a voice input from the user; transmitting the voice input to a content system; receiving a command responsive to the voice input; and executing, using one or more processors, the command including modifying the content item. Other embodiments of this aspect include corresponding systems, apparatus, and computer program products.
    Type: Application
    Filed: April 29, 2010
    Publication date: November 3, 2011
    Applicant: GOOGLE INC.
    Inventors: Jennifer W. Lin, Ping Wu
  • Publication number: 20110270609
    Abstract: Various embodiments of systems, methods, and computer programs are disclosed for providing real-time resources to participants in an audio conference session. One embodiment is a method for providing real-time resources to participants in an audio conference session via a communication network. One such method comprises: a conferencing system establishing an audio conference session between a plurality of computing devices via a communication network, each computing device generating a corresponding audio stream comprising a speech signal; and in real-time during the audio conference session, a server: receiving and processing the audio streams to determine the speech signals; extracting words from the speech signals; analyzing the extracted words to determine a relevant keyword being discussed in the audio conference session; identifying a resource related to the relevant keyword; and providing the resource to one or more of the computing devices.
    Type: Application
    Filed: April 30, 2010
    Publication date: November 3, 2011
    Applicant: American Teleconferencing Services, Ltd.
    Inventors: Boland T. Jones, David Michael Guthrie, Laurence Schaefer, J. Douglas Martin
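The keyword-to-resource step described in the abstract above (extract words, pick a relevant keyword, look up a related resource) can be sketched roughly as follows. This is an illustrative sketch only: the names `STOPWORDS`, `RESOURCES`, `relevant_keyword`, and `resource_for`, and the example URL, are assumptions, not from the patent.

```python
from collections import Counter

# Illustrative data; real systems would use speech-recognition output
# and a resource index instead of these hard-coded values.
STOPWORDS = {"the", "a", "to", "and", "of", "we", "is"}
RESOURCES = {"budget": "https://example.com/budget-report"}

def relevant_keyword(transcribed_words):
    """Pick the most frequent non-stopword as the topic being discussed."""
    counts = Counter(w.lower() for w in transcribed_words
                     if w.lower() not in STOPWORDS)
    keyword, _ = counts.most_common(1)[0]
    return keyword

def resource_for(transcribed_words):
    """Map the relevant keyword to a resource, if one is indexed."""
    return RESOURCES.get(relevant_keyword(transcribed_words))

words = "we need to review the budget and the budget forecast".split()
print(relevant_keyword(words))  # most frequent content word
```

A production system would run this continuously over the recognized audio streams and push the resource to the participants' devices during the session.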
  • Publication number: 20110269429
    Abstract: A communication device may participate in telephone calls. The communication device may allow a user to request transcription of a telephone call by prompting the user when the telephone call is completed. The communication device may display a call history user interface and, in response to a selection of a telephone call from the call history user interface, may request transcription of the selected telephone call. The communication device may include a dedicated transcription button that, when pressed, causes audio content of a telephone call to be sent to a transcription service. The communication device may display a preferences user interface via which a user may elect to have all incoming and outgoing telephone calls transcribed, all incoming and outgoing telephone calls to/from selected contacts transcribed, and/or have the communication device prompt the user about transcription when each telephone call is completed.
    Type: Application
    Filed: November 23, 2009
    Publication date: November 3, 2011
    Applicant: SPEECHINK, INC.
    Inventor: Konstantin Othmer
  • Publication number: 20110264451
    Abstract: A method and apparatus useful to train speech recognition engines is provided. Many of today's speech recognition engines require training to particular individuals to accurately convert speech to text. For certain applications, this training requires significant resources. To alleviate the resource demand, a trainer is provided with the text transcription and the audio file. The trainer updates the text based on the audio file. The changes are provided to the speech recognition engine to train it and update the user profile. In certain aspects, the training is reversible, as it is possible to overtrain the system such that the trained system is actually less proficient.
    Type: Application
    Filed: April 21, 2011
    Publication date: October 27, 2011
    Applicant: nVoq Incorporated
    Inventors: Jeffrey Hoepfinger, David Mondragon
  • Publication number: 20110265004
    Abstract: An interactive media device and method allows a user to select the format by which stored text of a book, stored audio of a book, or combinations thereof are presented to the user. For example, if the user wishes to read the book, the user may select display of the text on a screen for reading. Alternatively, the user may select audio of the text, resulting in the text being converted so that audio sound of the text is provided for listening. Interactive bookmarking, or place marking, allows the user, via a device interface, to mark a place within the book. Upon returning, the user may continue from that place, either by reading displayed text or by listening to audio. Place marking, and presentation of the content either visually or audibly, may continue, at the user's selection, until the book is completed.
    Type: Application
    Filed: April 24, 2010
    Publication date: October 27, 2011
    Inventor: Anthony G. Sitko
  • Publication number: 20110261941
    Abstract: A system and method for monitoring telephone activity and conversation content in a correctional facility comprises providing a communicative connection between a caller and a recipient, alerting at least one of the caller and the recipient that the communications may be recorded, delivering the conversation between the caller and the recipient over the communicative connection and storing the conversation into a call record memory. After the communicative connection has been terminated, speech recognition software is executed to identify a plurality of conversation words within the call record memory. By comparing the conversation words with a database of trigger words, a determination can be made as to whether the conversation is of interest to the correctional facility. Based on that comparison step, a detection response is executed.
    Type: Application
    Filed: April 27, 2010
    Publication date: October 27, 2011
    Inventors: Jay Walters, Randy L. Reeves, William L. Pope
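The comparison step in the abstract above (recognized conversation words checked against a database of trigger words) reduces to a set intersection. A minimal sketch, assuming illustrative names (`TRIGGER_WORDS`, `flag_conversation`) and toy data not drawn from the patent:

```python
# Illustrative trigger-word set; a real deployment would load this
# from the facility's database.
TRIGGER_WORDS = {"escape", "contraband", "weapon"}

def flag_conversation(recognized_words):
    """Return the trigger words found in the call record, so a
    detection response can be executed when the set is non-empty."""
    return {w.lower() for w in recognized_words} & TRIGGER_WORDS

hits = flag_conversation(["plan", "the", "Escape"])
if hits:
    print("conversation of interest:", hits)
```

In the described system this runs after the call ends, over the words that speech recognition identified within the call record memory.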
  • Publication number: 20110257973
    Abstract: A control system for mounting in a vehicle and for providing information to a portable electronic device for processing by the portable electronic device is shown and described. The control system includes a first interface for communicating with the portable electronic device and a memory device. The control system also includes a processing circuit communicably coupled to the first interface and the memory device, the processing circuit configured to extract information from the memory device and to provide the information to the first interface so that the first interface communicates the information to the portable electronic device. The processing circuit is further configured to determine the capabilities of the portable electronic device based on data received from the portable electronic device via the first interface and to determine whether or not to communicate the information to the portable electronic device based on the determined capabilities.
    Type: Application
    Filed: January 14, 2011
    Publication date: October 20, 2011
    Inventors: Richard J. Chutorash, Elisabet Anderson, Rodger W. Eich, Jeffrey Golden, Philip J. Vanderwall, Michael J. Sims
  • Publication number: 20110257971
    Abstract: Methods, systems, and articles are described herein for receiving an audio input and a facial image sequence for a period of time, in which the audio input includes speech input from multiple speakers. The audio input is filtered based on the received facial image sequence to extract the speech input of a particular speaker.
    Type: Application
    Filed: April 14, 2010
    Publication date: October 20, 2011
    Applicant: T-MOBILE USA, INC.
    Inventor: Andrew R. Morrison
  • Publication number: 20110251843
    Abstract: A method, system, and computer program product for compensation of intra-speaker variability in speaker diarization are provided. The method includes: dividing a speech session into segments of duration less than the average duration between speaker changes; parameterizing each segment by a time-dependent probability density function supervector, for example, using a Gaussian Mixture Model; computing a difference between successive segment supervectors; and computing a scatter measure, such as a covariance matrix of the differences, as an estimate of intra-speaker variability. The method further includes compensating the speech session for intra-speaker variability using the estimate of intra-speaker variability.
    Type: Application
    Filed: April 8, 2010
    Publication date: October 13, 2011
    Applicant: International Business Machines Corporation
    Inventor: Hagai Aronowitz
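The scatter-estimation steps in the abstract above (segment supervectors, successive differences, covariance of the differences) can be sketched numerically. A minimal sketch, assuming the supervectors have already been extracted from a GMM; the function name and toy data are illustrative:

```python
import numpy as np

def intra_speaker_variability(supervectors):
    """Estimate intra-speaker variability from segment supervectors.

    supervectors: (n_segments, dim) array, one supervector per short
    segment of the session. Because successive short segments usually
    share a speaker, differences of successive supervectors cancel the
    speaker identity and leave intra-speaker variation.
    """
    diffs = np.diff(supervectors, axis=0)   # successive-segment differences
    return np.cov(diffs, rowvar=False)      # scatter (covariance) estimate

# Toy usage: 6 segments of 3-dimensional supervectors.
S = np.random.default_rng(0).normal(size=(6, 3))
cov = intra_speaker_variability(S)
print(cov.shape)  # (3, 3)
```

The resulting matrix would then be used to compensate (e.g., whiten) the session's supervectors before clustering them by speaker.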
  • Publication number: 20110251971
    Abstract: An embodiment of the invention comprises a real-time collaborative technical support (RTCTS) system that may automatically generate and/or maintain social networks that may be dynamically evolving. The social networks may be based on the output of at least one multi-modal classification algorithm. These outputs may be occurring in real-time.
    Type: Application
    Filed: April 8, 2010
    Publication date: October 13, 2011
    Applicant: International Business Machines Corporation
    Inventors: Timothy J. Bethea, Neil H. Boyette, Isaac K. Cheng, Vikas Krishna, Yolanda A. Rankin, Yongshin Yu
  • Publication number: 20110246194
    Abstract: A client station having access to an application is provided. The application has at least one indicia having a first configuration and a second configuration different from the first configuration. The second configuration indicates that the application is able to accept input.
    Type: Application
    Filed: March 30, 2010
    Publication date: October 6, 2011
    Applicant: nVoq Incorporated
    Inventors: Rebecca Heins, Edward Kizhner
  • Publication number: 20110243311
    Abstract: Methods and systems for automatic phone call tracking and analysis of the content and outcomes of a call are provided. These systems may provide businesses with the ability to track and view analytics of the number and various outcomes of calls, thereby providing up-to-date real-time analysis of the automatically-generated results of client interactions with staff answering the phones. Methods and systems in accordance with the present invention quantitatively and objectively analyze staff performance and marketing return on investment (ROI), and track patient demand across various procedures. This may automatically provide information on the number of calls with various outcomes, e.g., the customer booked an appointment, the customer hung up while on hold, the customer was connected with voicemail, the customer left a message on voicemail, the customer is an existing client, etc. Other automatically-detected aspects of phone call contents are provided.
    Type: Application
    Filed: March 30, 2010
    Publication date: October 6, 2011
    Inventor: Grant L. Aldrich
  • Publication number: 20110246197
    Abstract: A mechanism is provided for authenticating and using a personal voice profile. The voice profile may be issued by a trusted third party, such as a certification authority. The personal voice profile may include information for generating a digest or digital signature for text messages. A speech synthesis system may speak the text message using the voice characteristics, such as prosodic characteristics, only if the voice profile is authenticated and the text message is valid and free of tampering.
    Type: Application
    Filed: June 17, 2011
    Publication date: October 6, 2011
    Applicant: Nuance Communications, Inc.
    Inventors: Rafael Graniello Cabezas, Jason Eric Moore, Elizabeth Salvia
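The validity check in the abstract above (speak the text only if the message's digest checks out against the authenticated profile) can be sketched with a keyed digest. This is a simplified illustration: the patent describes digests or digital signatures tied to a certified voice profile, whereas the sketch below uses a shared HMAC key, and all names (`profile_key`, `speak_if_valid`) are assumptions:

```python
import hashlib
import hmac

def digest(message, profile_key):
    """Keyed digest of the text message under the profile's key."""
    return hmac.new(profile_key, message.encode(), hashlib.sha256).hexdigest()

def speak_if_valid(message, mac, profile_key, tts=print):
    """Synthesize the message only if it is free of tampering."""
    if hmac.compare_digest(digest(message, profile_key), mac):
        tts(message)   # would render with the profile's prosodic traits
        return True
    return False       # tampered or unauthenticated: refuse to speak

key = b"profile-secret"
mac = digest("Meet at noon", key)
speak_if_valid("Meet at noon", mac, key)
```

A full implementation would instead verify a public-key signature chained to the certification authority that issued the voice profile.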
  • Publication number: 20110246195
    Abstract: A dictation system that allows using trainable code phrases is provided. The dictation system operates by receiving audio and recognizing the audio as text. The text/audio may contain code phrases that are identified by a comparator that matches the text/audio and replaces the code phrase with a standard clause associated with the code phrase. The database or memory containing the code phrases is loaded with matched standard clauses and may be organized as a hierarchical system, such that certain code phrases have multiple meanings depending on the user.
    Type: Application
    Filed: March 21, 2011
    Publication date: October 6, 2011
    Applicant: nVoq Incorporated
    Inventors: Charles Corfield, Brian Marquette, David Mondragon, Rebecca Heins
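The substitution scheme in the abstract above (code phrase matched and replaced by its standard clause, with per-user overrides giving one phrase multiple meanings) can be sketched as a layered dictionary lookup. The mapping contents and the names `CLAUSES` and `expand` are illustrative, not from the patent:

```python
# Illustrative code-phrase store: a shared base layer plus per-user
# overrides, so the same phrase can expand differently by user.
CLAUSES = {
    "default": {"standard closing": "Sincerely yours,"},
    "dr_smith": {"standard closing": "Best regards, Dr. Smith"},
}

def expand(text, user="default"):
    """Replace any recognized code phrase with its standard clause,
    preferring the user's own mapping over the shared defaults."""
    mapping = {**CLAUSES["default"], **CLAUSES.get(user, {})}
    for phrase, clause in mapping.items():
        text = text.replace(phrase, clause)
    return text

print(expand("standard closing", user="dr_smith"))
```

In the described system the comparator would run this matching over the recognized dictation text (or audio) before returning the final transcript.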
  • Publication number: 20110230159
    Abstract: A vehicle communication system includes a computer processor in communication with a memory circuit, a transceiver in communication with the processor and operable to communicate with one or more wireless devices, and one or more storage locations storing one or more pieces of emergency contact information. In this illustrative system, the processor is operable to establish communication with a first wireless device through the transceiver. Upon detection of an emergency event by at least one vehicle-based sensor system, the vehicle communication system is operable to contact an emergency operator. The vehicle communication system is further operable to display the one or more pieces of emergency contact information in a selectable manner. Upon selection of one of the pieces of emergency contact information, the vehicle computing system places a call to a phone number associated with the selected emergency contact.
    Type: Application
    Filed: March 19, 2010
    Publication date: September 22, 2011
    Applicant: FORD GLOBAL TECHNOLOGIES, LLC
    Inventor: David Anthony Hatton