Speech To Text Systems (epo) Patents (Class 704/E15.043)
  • Publication number: 20130132079
    Abstract: A first plurality of audio features associated with a first utterance may be obtained. A first text result associated with a first speech-to-text translation of the first utterance may be obtained based on an audio signal analysis associated with the audio features, the first text result including at least one first word. A first set of audio features correlated with at least a first portion of the first speech-to-text translation associated with the at least one first word may be obtained. A display of at least a portion of the first text result that includes the at least one first word may be initiated. A selection indication may be received, indicating an error in the first speech-to-text translation, the error associated with the at least one first word.
    Type: Application
    Filed: November 17, 2011
    Publication date: May 23, 2013
    Applicant: Microsoft Corporation
    Inventors: Muhammad Shoaib B. Sehgal, Mirza Muhammad Raza
  • Publication number: 20130132080
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for crowd-sourced data labeling. The system requests a respective response from each of a set of entities. The set of entities includes crowd workers. Next, the system incrementally receives a number of responses from the set of entities until an accuracy threshold is reached or m responses are received, wherein the accuracy threshold is based on characteristics of the number of responses. Finally, the system generates an output response based on the number of responses.
    Type: Application
    Filed: November 18, 2011
    Publication date: May 23, 2013
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Jason Williams, Tirso Alonso, Barbara B. Hollister, Ilya Dan Melamed
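The incremental stopping rule in the abstract above can be sketched as a short loop: accept crowd responses one at a time and stop as soon as either condition fires. The majority-vote aggregation, the agreement-fraction definition of "accuracy", and all names below are illustrative assumptions, not taken from the patent:

```python
from collections import Counter

def collect_labels(responses, m, accuracy_threshold):
    """Accept responses one at a time until either the leading label's
    share of responses reaches accuracy_threshold (with at least two
    responses seen) or m responses have been received.

    Assumes at least one response is available. Returns the winning
    label and how many responses were consumed."""
    received = []
    for label in responses:
        received.append(label)
        top_label, top_count = Counter(received).most_common(1)[0]
        if len(received) >= 2 and top_count / len(received) >= accuracy_threshold:
            return top_label, len(received)
        if len(received) >= m:
            return top_label, len(received)
    # Stream ended before either condition fired: best label seen so far.
    top_label, _ = Counter(received).most_common(1)[0]
    return top_label, len(received)
```

With a 0.6 agreement threshold, two matching responses suffice; otherwise collection runs until the m-response cap.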
  • Publication number: 20130124202
    Abstract: Provided in some embodiments is a method including receiving ordered script words indicative of dialogue words to be spoken, receiving audio data corresponding to at least a portion of the dialogue words to be spoken and including timecodes associated with dialogue words, generating a matrix of the ordered script words versus the dialogue words, aligning the matrix to determine hard alignment points that match consecutive sequences of ordered script words with corresponding sequences of dialogue words, partitioning the matrix of ordered script words into sub-matrices bounded by adjacent hard-alignment points and including the corresponding subsets of the script and dialogue words between the hard-alignment points, and aligning each of the sub-matrices.
    Type: Application
    Filed: May 28, 2010
    Publication date: May 16, 2013
    Inventor: Walter W. Chang
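The anchor-and-partition idea above can be illustrated in a few lines: treat any exact match of k consecutive words as a hard alignment point, then split both word sequences at adjacent anchors so each piece can be aligned independently. The choice of k, the exhaustive search, and all names are assumptions made for this sketch, not details from the patent:

```python
def hard_alignment_points(script, dialogue, k=3):
    """Anchor points: positions (i, j) where k consecutive script words
    exactly match k consecutive dialogue words."""
    return [(i, j)
            for i in range(len(script) - k + 1)
            for j in range(len(dialogue) - k + 1)
            if script[i:i + k] == dialogue[j:j + k]]

def partition(script, dialogue, points):
    """Split both word lists into sub-ranges bounded by adjacent anchors;
    each sub-range is then a much smaller alignment problem."""
    bounds = sorted(set([(0, 0)] + points + [(len(script), len(dialogue))]))
    return [(script[i0:i1], dialogue[j0:j1])
            for (i0, j0), (i1, j1) in zip(bounds, bounds[1:])]
```

Any residual disagreement (a word the actor changed, say) is confined to one small sub-matrix between two anchors.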
  • Publication number: 20130124204
    Abstract: Example methods and systems for displaying one or more indications that indicate (i) the direction of a source of sound and (ii) the intensity level of the sound are disclosed. A method may involve receiving audio data corresponding to sound detected by a wearable computing system. Further, the method may involve analyzing the audio data to determine both (i) a direction from the wearable computing system of a source of the sound and (ii) an intensity level of the sound. Still further, the method may involve causing the wearable computing system to display one or more indications that indicate (i) the direction of the source of the sound and (ii) the intensity level of the sound.
    Type: Application
    Filed: April 17, 2012
    Publication date: May 16, 2013
    Applicant: GOOGLE INC.
    Inventors: Adrian Wong, Xiaoyu Miao
  • Publication number: 20130124189
    Abstract: A system and methodology that provides a network-based, e.g., cloud-based, background expert for predicting and/or accomplishing a user's goals is disclosed herein. Moreover, the system monitors, in the background, user generated data and/or publicly available data to determine and/or infer a user's goal, with or without an active indication/request from the user. Typically, the user-generated data can include user conversations, such as, but not limited to, speech data in a voice call, text messages, chat dialogues, etc. Further, the system identifies an action or task that facilitates accomplishment of the user goal in real-time. Moreover, the system can automatically perform the action/task and/or request user authorization prior to performing the action/task.
    Type: Application
    Filed: November 10, 2011
    Publication date: May 16, 2013
    Applicant: AT&T INTELLECTUAL PROPERTY I, LP
    Inventor: CHRISTOPHER BALDWIN
  • Publication number: 20130124203
    Abstract: Provided in some embodiments is a computer implemented method that includes providing script data including script words indicative of dialogue words to be spoken, providing recorded dialogue audio data corresponding to at least a portion of the dialogue words to be spoken, wherein the recorded dialogue audio data includes timecodes associated with recorded audio dialogue words, matching at least some of the script words to corresponding recorded audio dialogue words to determine alignment points, determining that a set of unmatched script words are accurate based on the matching of at least some of the script words matched to corresponding recorded audio dialogue words, generating time-aligned script data including the script words and their corresponding timecodes and the set of unmatched script words determined to be accurate based on the matching of at least some of the script words matched to corresponding recorded audio dialogue words.
    Type: Application
    Filed: May 28, 2010
    Publication date: May 16, 2013
    Inventors: Jerry R. Scoggins, II, Walter W. Chang, David A. Kuspa
  • Publication number: 20130117019
    Abstract: A remote laboratory gateway enables a plurality of students to access and control a laboratory experiment remotely. Access is provided by an experimentation gateway, which is configured to provide secure access to the experiment via a network-centric, web-enabled graphical user interface. Experimental hardware is directly controlled by an experiment controller, which is communicatively coupled to the experimentation gateway and which may be a software application, a standalone computing device, or a virtual machine hosted on the experimentation gateway. The remote laboratory of the present specification may be configured for a software-as-a-service business model.
    Type: Application
    Filed: November 7, 2011
    Publication date: May 9, 2013
    Inventors: David Akopian, Arsen Melkonyan, Murillo Pontual, Grant Huang, Andreas Robert Gampe
  • Publication number: 20130117021
    Abstract: A method and system uses an integration application to extract an information feature from a message and to provide the information feature to a vehicle interface device which acts on the information feature to provide a service. The extracted information feature may be automatically acted upon, or may be outputted for review, editing, and/or selection before being acted on. The vehicle interface device may include a navigation system, infotainment system, telephone, and/or a head unit. The message may be received by the vehicle interface device or from a portable or remote device in linked communication with the vehicle interface device. The message may be a voice-based or text-based message. The service may include placing a call, sending a message, or providing navigation instructions using the information feature. An off-board or back-end service provider in communication with the integration application may extract and/or transcribe the information feature and/or provide a service.
    Type: Application
    Filed: October 31, 2012
    Publication date: May 9, 2013
    Applicant: GM Global Technology Operations LLC
    Inventor: GM Global Technology Operations LLC
  • Publication number: 20130117018
    Abstract: A method, computer program product, and system for voice content transcription during collaboration sessions is described. A method may comprise receiving an indication to provide one or more real-time voice content-to-text content transcriptions to a first collaboration session participant. The one or more real-time voice content-to-text content transcriptions may correspond to voice content of a second collaboration session participant in one or more collaboration sessions including the first collaboration session participant and the second collaboration session participant.
    Type: Application
    Filed: November 3, 2011
    Publication date: May 9, 2013
    Applicant: International Business Machines Corporation
    Inventors: Patrick Joseph O'Sullivan, Edith Helen Stern, Barry E. Willner, Hong Bing Zhang
  • Publication number: 20130110510
    Abstract: A natural language call router forwards an incoming call from a caller to an appropriate destination. The call router has a speech recognition mechanism responsive to words spoken by a caller for producing recognized text corresponding to the spoken words. A robust parsing mechanism is responsive to the recognized text for detecting a class of words in the recognized text. The class is defined as a group of words having a common attribute. An interpreting mechanism is responsive to the detected class for determining the appropriate destination for routing the call.
    Type: Application
    Filed: October 28, 2011
    Publication date: May 2, 2013
    Applicant: Cellco Partnership d/b/a Verizon Wireless
    Inventors: Veronica Klein, Deborah Washington Brown
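The class-based routing described in the abstract above reduces to a small lookup: group words by a shared attribute, then map any detected class to a destination. The word classes and destination names below are invented for illustration and are not from the patent:

```python
# Illustrative word classes (groups of words sharing a common attribute)
# and the destination each class maps to -- both are assumptions.
WORD_CLASSES = {
    "billing": {"bill", "invoice", "charge", "payment"},
    "support": {"broken", "error", "outage", "repair"},
}
DESTINATIONS = {"billing": "billing_department", "support": "tech_support"}

def route_call(recognized_text, default="operator"):
    """Detect a word class in the recognized text and pick a destination;
    fall back to a default destination when no class is detected."""
    words = set(recognized_text.lower().split())
    for cls, members in WORD_CLASSES.items():
        if words & members:
            return DESTINATIONS[cls]
    return default
```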
  • Publication number: 20130110509
    Abstract: A particular method includes receiving, at a representational state transfer endpoint device, a first user input related to a first speech to text conversion performed by a speech to text transcription service. The method also includes receiving, at the representational state transfer endpoint device, a second user input related to a second speech to text conversion performed by the speech to text transcription service. The method includes processing of the first user input and the second user input at the representational state transfer endpoint device to generate speech to text adjustment information.
    Type: Application
    Filed: October 28, 2011
    Publication date: May 2, 2013
    Applicant: Microsoft Corporation
    Inventors: Jeremy Edward Cath, Timothy Edwin Harris, Marc Mercuri, James Oliver Tisdale, III
  • Publication number: 20130110508
    Abstract: An electronic device and a control method are provided. The electronic device includes a voice receiver which receives a voice of a user; a signal processor which performs signal processing on the received voice; a communicator which communicates with a first external device; and a controller which determines a text corresponding to the received voice of the user, and controls the communicator to transmit the signal processed voice and the determined text to the first external device.
    Type: Application
    Filed: September 5, 2012
    Publication date: May 2, 2013
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Eun-sang BAK, Ju-rack CHAE, Jae-hwan KIM, Yu LIU
  • Publication number: 20130103397
    Abstract: Exemplary embodiments provide systems, devices and methods that allow creation and management of lists of items in an integrated manner on an interactive graphical user interface. A user may speak a plurality of list items in a natural unbroken manner to provide an audio input stream into an audio input device. Exemplary embodiments may automatically process the audio input stream to convert the stream into a text output, and may process the text output into one or more n-grams that may be used as list items to populate a list on a user interface.
    Type: Application
    Filed: October 21, 2011
    Publication date: April 25, 2013
    Applicant: WAL-MART STORES, INC.
    Inventors: Dion Almaer, Bernard Paul Cousineau, Ben Galbraith
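Turning an unbroken speech-to-text stream into list items, as the abstract above describes, can be sketched as a greedy longest-n-gram match against a known lexicon. The lexicon, the maximum n-gram length, and the fallback to single-word items are all assumptions for this sketch:

```python
# Hypothetical lexicon of known multi-word items -- not from the patent.
PRODUCT_LEXICON = {"whole wheat bread", "milk", "eggs", "peanut butter"}

def to_list_items(text, max_n=3):
    """Greedily take the longest known n-gram at each position of the
    transcribed text; words not in the lexicon become single-word items."""
    words = text.lower().split()
    items, i = [], 0
    while i < len(words):
        for n in range(max_n, 0, -1):
            gram = " ".join(words[i:i + n])
            if gram in PRODUCT_LEXICON or n == 1:
                items.append(gram)
                i += n
                break
    return items
```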
  • Publication number: 20130103399
    Abstract: Aspects relate to machine recognition of human voices in live or recorded audio content, and delivering text derived from such live or recorded content as real time text, with contextual information derived from characteristics of the audio. For example, volume information can be encoded as larger and smaller font sizes. Speaker changes can be detected and indicated through text additions, or color changes to the font. A variety of other context information can be detected and encoded in graphical rendition commands available through RTT, or by extending the information provided with RTT packets, and processing that extended information accordingly for modifying the display of the RTT text content.
    Type: Application
    Filed: October 21, 2011
    Publication date: April 25, 2013
    Applicant: RESEARCH IN MOTION LIMITED
    Inventor: Scott Peter GAMMON
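The volume-to-font-size encoding mentioned in the abstract above is essentially a clamped linear map from a measured level to a size range. The dBFS range and size limits below are illustrative defaults, not values from the patent:

```python
def font_size_for_volume(volume_db, min_size=12, max_size=28,
                         quiet_db=-40.0, loud_db=0.0):
    """Map a measured volume (in dBFS, 0 = full scale) linearly onto a
    font-size range, clamping outside [quiet_db, loud_db]."""
    frac = (volume_db - quiet_db) / (loud_db - quiet_db)
    frac = max(0.0, min(1.0, frac))
    return round(min_size + frac * (max_size - min_size))
```

Other contextual cues (speaker changes, for example) would map similarly onto color or text markers.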
  • Publication number: 20130096916
    Abstract: A multichannel security system is disclosed, which system is for granting and denying access to a host computer in response to a demand from an access-seeking individual and computer. The access-seeker has a peripheral device operative within an authentication channel to communicate with the security system. The access-seeker initially presents identification and password data over an access channel which is intercepted and transmitted to the security computer. The security computer then communicates with the access-seeker. A biometric analyzer—a voice or fingerprint recognition device—operates upon instructions from the authentication program to analyze the monitored parameter of the individual. In the security computer, a comparator matches the biometric sample with stored data, and, upon obtaining a match, provides authentication. The security computer instructs the host computer to grant access and communicates the same to the access-seeker, whereupon access is initiated over the access channel.
    Type: Application
    Filed: December 1, 2010
    Publication date: April 18, 2013
    Applicant: NETLABS.COM, INC.
    Inventor: Ram Pemmaraju
  • Publication number: 20130085754
    Abstract: A method for providing suggestions includes capturing audio that includes speech and receiving textual content from a speech recognition engine. The speech recognition engine performs speech recognition on the audio signal to obtain the textual content, which includes one or more passages. The method also includes receiving a selection of a portion of a first word in a passage in the textual content, wherein the passage includes multiple words, and retrieving a set of suggestions that can potentially replace the first word. At least one suggestion from the set of suggestions provides a multi-word suggestion for potentially replacing the first word. The method further includes displaying, on a display device, the set of suggestions, and highlighting a portion of the textual content, as displayed on the display device, for potentially changing to one of the suggestions from the set of suggestions.
    Type: Application
    Filed: September 14, 2012
    Publication date: April 4, 2013
    Applicant: Google Inc.
    Inventors: Richard Z. Cohen, Marcus A. Foster, Luca Zanolin
  • Publication number: 20130085755
    Abstract: The present application describes systems, articles of manufacture, and methods for continuous speech recognition for mobile computing devices. One embodiment includes determining whether a mobile computing device is receiving operating power from an external power source or a battery power source, and activating a trigger word detection subroutine in response to determining that the mobile computing device is receiving power from the external power source. In some embodiments, the trigger word detection subroutine operates continually while the mobile computing device is receiving power from the external power source. The trigger word detection subroutine includes determining whether a plurality of spoken words received via a microphone includes one or more trigger words, and in response to determining that the plurality of spoken words includes at least one trigger word, launching an application corresponding to the at least one trigger word included in the plurality of spoken words.
    Type: Application
    Filed: September 15, 2012
    Publication date: April 4, 2013
    Applicant: GOOGLE INC.
    Inventors: Bjorn Erik Bringert, Pawel Pietryka, Peter John Hodgson, Simon Tickner, Henrique Penha, Richard Zarek Cohen, Luca Zanolin, Dave Burke
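The power-gated trigger-word subroutine above can be sketched as a simple check-then-scan: only run detection on external power, then launch whichever application the first trigger word maps to. The power-source encoding and the trigger-to-application mapping are assumptions for this sketch:

```python
def on_external_power(power_source):
    """Assumed encoding: 'external' vs. 'battery'."""
    return power_source == "external"

def detect_trigger(spoken_words, trigger_actions, power_source):
    """Run trigger-word detection only while on external power. Returns
    the application mapped to the first trigger word found, else None."""
    if not on_external_power(power_source):
        return None  # subroutine inactive on battery to save power
    for word in spoken_words:
        app = trigger_actions.get(word.lower())
        if app is not None:
            return app
    return None
```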
  • Publication number: 20130080163
    Abstract: According to an embodiment, an information processing apparatus includes a storage unit, a detector, an acquisition unit, and a search unit. The storage unit is configured to store therein voice indices, each of which associates a character string included in voice text data obtained from a voice recognition process with voice positional information, the voice positional information indicating a temporal position in the voice data and corresponding to the character string. The acquisition unit acquires reading information, which is at least a part of a character string representing a reading of a phrase to be transcribed from the played-back voice data. The search unit specifies, as search targets, character strings whose associated voice positional information is included in the played-back section information among the character strings included in the voice indices, and retrieves a character string including the reading represented by the reading information from among the specified character strings.
    Type: Application
    Filed: June 26, 2012
    Publication date: March 28, 2013
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Nobuhiro Shimogori, Tomoo Ikeda, Kouji Ueno, Osamu Nishiyama, Hirokazu Suzuki, Manabu Nagao
  • Publication number: 20130080162
    Abstract: Query history expansion may be provided. Upon receiving a spoken query from a user, an adapted language model may be applied to convert the spoken query to text. The adapted language model may comprise a plurality of queries interpolated from the user's previous queries and queries associated with other users. The spoken query may be executed and the results of the spoken query may be provided to the user.
    Type: Application
    Filed: September 23, 2011
    Publication date: March 28, 2013
    Applicant: Microsoft Corporation
    Inventors: Shuangyu Chang, Michael Levit, Bruce Melvin Buntschuh
  • Publication number: 20130066630
    Abstract: A system for correcting errors in automatically generated audio transcriptions includes an audio recorder, a computerized transcription generator, a voice recording, a collection of link data, transcription text, an audio player, a system of cross linking, and a text editor including a text display with a cursor. The system permits a user to correct transcription errors using techniques of jump to position; show position; and track playback.
    Type: Application
    Filed: September 9, 2012
    Publication date: March 14, 2013
    Inventor: Kenneth D. Roe
  • Publication number: 20130058471
    Abstract: Presented are systems and methods for creating a transcription of a conference call. The system joins an audio conference call with a device associated with a participant of a plurality of participants joined to the conference through one or more associated devices. The system then creates a speech audio file corresponding to a portion of the participant's speech during the conference and contemporaneously converts, at the device, the speech audio file to a local partial transcript. The system then acquires a plurality of partial transcripts from at least one of the associated devices, so that the device can provide a complete transcript.
    Type: Application
    Filed: September 1, 2011
    Publication date: March 7, 2013
    Inventor: Juan Martin Garcia
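The assembly step in the abstract above, where one device pools per-device partial transcripts into a complete transcript, can be sketched as a sort-by-time merge. The tuple layout (start time, speaker, text) is an assumption for this sketch:

```python
def merge_transcripts(device_partials):
    """device_partials: one list per device of (start_seconds, speaker,
    text) tuples. Pool all partials and order them by start time to
    produce the complete conference transcript."""
    pooled = sorted(seg for partials in device_partials for seg in partials)
    return [f"{speaker}: {text}" for _, speaker, text in pooled]
```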
  • Publication number: 20130060568
    Abstract: Using structured communications within an organization or retail environment, the users establish a fabric of communications that allows external users of devices or applications to integrate in a way that is non-disruptive, measured and structured. An observation platform may be used for performing structured communications. A signal is received from a first communication device at a second communication device associated with a computer system, wherein the computer system is associated with an organization, wherein a first characteristic of the signal corresponds to an audible source and a second characteristic of the signal corresponds to information indicative of a geographic position of the first communication device.
    Type: Application
    Filed: October 31, 2012
    Publication date: March 7, 2013
    Inventors: Steven Paul Russell, Guy R. VanBuskirk, Andrew W. Kittler
  • Publication number: 20130054237
    Abstract: A communications system includes a first communications device cooperating with a second communications device. The first communications device multiplexes a digital speech message and a corresponding text message into a multiplexed signal, and wirelessly transmits the multiplexed signal. The second communications device wirelessly receives the multiplexed signal, de-multiplexes the multiplexed signal into the digital speech message and the corresponding text message, decodes the speech message for an audio output transducer, and operates a text processor on the corresponding text message for display. The corresponding text message is displayed in synchronization with the speech message output by the audio output transducer. A memory is coupled to the text processor for storing the text message, and the text processor is configured to display the stored text message.
    Type: Application
    Filed: August 25, 2011
    Publication date: February 28, 2013
    Applicant: Harris Corporation of the State of Delaware
    Inventors: William N. Furman, John W. Nieto, Marcelo De Risio
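The multiplex/de-multiplex round trip above can be illustrated with a minimal length-prefixed framing scheme. The 4-byte big-endian length prefixes and the byte layout are assumptions for this sketch, not the patent's signal format:

```python
import struct

def multiplex(speech, text):
    """Frame a digital speech payload (bytes) and its UTF-8 text with
    4-byte big-endian length prefixes so a receiver can split them."""
    t = text.encode("utf-8")
    return (struct.pack(">I", len(speech)) + speech +
            struct.pack(">I", len(t)) + t)

def demultiplex(frame):
    """Recover (speech, text) from a multiplexed frame."""
    n = struct.unpack_from(">I", frame, 0)[0]
    speech = frame[4:4 + n]
    m = struct.unpack_from(">I", frame, 4 + n)[0]
    text = frame[8 + n:8 + n + m].decode("utf-8")
    return speech, text
```

Because both messages arrive in one frame, the receiver can trivially display the text in synchronization with the decoded speech.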
  • Publication number: 20130054239
    Abstract: A system, method and computer program product for authoring and presenting discrete data elements and datasets on any computing device are described. Said datasets can comprise typed, entered, or speech-converted text, numbers, images, and sounds. Said system and method feature a user-controlled timer that can be set in intervals of one or more milliseconds and can be used to display said data elements in said dataset in succession. Another feature described is a randomizer which can present said data elements in said dataset in an unpredictable and random order.
    Type: Application
    Filed: August 20, 2012
    Publication date: February 28, 2013
    Inventor: Benjamin Z. Levy
  • Publication number: 20130054240
    Abstract: An apparatus and a method for recognizing a voice by using a lip image are provided. The apparatus includes: a voice recognizer which recognizes a voice of a user and outputs text information based on the recognized voice; a lip shape detector which detects a lip shape of the user; and a voice recognition result verifier which determines whether the text information output by the voice recognizer is correct, by using a result of the detection by the lip shape detector.
    Type: Application
    Filed: August 27, 2012
    Publication date: February 28, 2013
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jong-hyuk JANG, Hee-seob RYU, Kyung-mi PARK, Seung-kwon PARK, Jae-hyun BAE
  • Publication number: 20130054241
    Abstract: A method and system for producing and working with transcripts according to the invention eliminates time inefficiencies. By dispersing a source recording to a transcription team in small segments, so that team members transcribe segments in parallel, a rapid transcription process delivers a fully edited transcript within minutes. Clients can view accurate, grammatically correct, proofread and fact-checked documents that shadow live proceedings by mere minutes. The rapid transcript includes time coding, speaker identification and summary. A viewer application allows a client to view a video recording side-by-side with a transcript. Clicking on a word in the transcript locates the corresponding recorded content; advancing a recording to a particular point locates and displays the corresponding spot in the transcript. The recording is viewed using common video features, and may be downloaded. The client can edit the transcript and insert comments. Any number of colleagues can view and edit simultaneously.
    Type: Application
    Filed: October 30, 2012
    Publication date: February 28, 2013
    Inventor: Adam Michael GOLDBERG
  • Publication number: 20130046537
    Abstract: Some embodiments disclosed herein store a target application and a dictation application. The target application may be configured to receive input from a user. The dictation application interface may include a full overlay mode option, where in response to selection of the full overlay mode option, the dictation application interface is automatically sized and positioned over the target application interface to fully cover a text area of the target application interface to appear as if the dictation application interface is part of the target application interface. The dictation application may be further configured to receive an audio dictation from the user, convert the audio dictation into text, provide the text in the dictation application interface and, in response to receiving a first user command to complete the dictation, automatically copy the text from the dictation application interface and insert the text into the target application interface.
    Type: Application
    Filed: August 19, 2011
    Publication date: February 21, 2013
    Applicant: DOLBEY & COMPANY, INC.
    Inventors: Curtis A. Weeks, Aaron G. Weeks, Stephen E. Barton
  • Publication number: 20130041662
    Abstract: A device and method to control applications using voice data. In one embodiment, a method includes detecting voice data from a user, converting the voice data to text data, matching the text data to an identifier, the identifier associated with a list of identifiers for controlling operation of the application, and controlling the application based on the identifier matched with the text data. In another embodiment, voice data may be received from a control device.
    Type: Application
    Filed: August 8, 2011
    Publication date: February 14, 2013
    Inventor: Sriram Sampathkumaran
  • Publication number: 20130041661
    Abstract: A device may include a communication interface configured to receive audio signals associated with audible communications from a user; an output device; and logic. The logic may be configured to determine one or more audio qualities associated with the audio signals, map the one or more audio qualities to at least one value, generate audio-related information based on the mapping, and provide, via the output device during the audible communications, the audio-related information to the user.
    Type: Application
    Filed: August 8, 2011
    Publication date: February 14, 2013
    Applicants: CELLCO PARTNERSHIP, VERIZON NEW JERSEY INC.
    Inventors: Woo Beum Lee, Arvind Basra
  • Publication number: 20130041663
    Abstract: A communication application configured to support a conversation among participants over a communication network. The communication application is configured to (i) support one or more media types within the context of the conversation, (ii) interleave the one or more media types in a time-indexed order within the context of the conversation, (iii) enable the participants to render the conversation including the interleaved one or more media types in either a real-time rendering mode or time-shifted rendering mode, and (iv) seamlessly transition the conversation between the two modes so that the conversation may take place substantially live when in the real-time rendering mode or asynchronously when in the time-shifted rendering mode.
    Type: Application
    Filed: October 12, 2012
    Publication date: February 14, 2013
    Applicant: VOXER IP LLC
    Inventor: VOXER IP LLC
  • Publication number: 20130041646
    Abstract: In accordance with the embodiments of the present invention, a system and method for enabling preview, editing, and transmission of emergency notification messages is provided. The system includes a controller, a microphone, and a speech-to-text engine for receiving an audio message input to the microphone and for converting the audio message to a text message. The resulting text message is displayed on a local display, where a user can edit the message via a text editor. Text and/or audio notification devices are provided for displaying the edited text data as a text message. Other embodiments are disclosed and claimed.
    Type: Application
    Filed: August 10, 2011
    Publication date: February 14, 2013
    Applicant: SIMPLEXGRINNELL LP
    Inventors: Daniel G. Farley, Matthew Farley, John R. Haynes
  • Publication number: 20130035937
    Abstract: A system and method for efficiently transcribing verbal messages to text is provided. Verbal messages are received and at least one of the verbal messages is divided into segments. Automatically recognized text is determined for each of the segments by performing speech recognition and a confidence rating is assigned to the automatically recognized text for each segment. A threshold is applied to the confidence ratings and those segments with confidence ratings that fall below the threshold are identified. The segments that fall below the threshold are assigned to one or more human agents starting with those segments that have the lowest confidence ratings. Transcription from the human agents is received for the segments assigned to that agent. The transcription is assembled with the automatically recognized text of the segments not assigned to the human agents as a text message for the at least one verbal message.
    Type: Application
    Filed: August 6, 2012
    Publication date: February 7, 2013
    Inventors: Mike O. Webb, Bruce J. Peterson, Janet S. Kaseda
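The hybrid workflow above, thresholding per-segment ASR confidence and assigning the weakest segments to human agents first, can be sketched as a split followed by a merge. The (text, confidence) segment representation and all names are assumptions for this sketch:

```python
def split_by_confidence(segments, threshold):
    """segments: list of (auto_text, confidence). Returns the segments
    kept as-is plus a human-work queue of segment indices ordered
    lowest-confidence first."""
    auto = [(i, t) for i, (t, c) in enumerate(segments) if c >= threshold]
    human_queue = sorted(
        (i for i, (t, c) in enumerate(segments) if c < threshold),
        key=lambda i: segments[i][1])
    return auto, human_queue

def assemble(segments, human_results):
    """Merge human transcriptions (index -> text) over the ASR text to
    form the final text message."""
    return " ".join(human_results.get(i, t)
                    for i, (t, c) in enumerate(segments))
```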
  • Publication number: 20130035936
    Abstract: A transcription system is applicable to transcription for a language in which there is limited pronunciation and/or acoustic data. A transcription station is configured using pronunciation data and acoustic data for use with the language. The pronunciation data and/or the acoustic data is initially from another dialect of a language, another language from a language group, or is universal (e.g., not specific to any particular language). A partial transcription of the audio recording is accepted via the transcription station (e.g., from a transcriptionist). One or more repetitions of one or more portions of the partial transcription are identified in the audio recording, and can be accepted during transcription. The pronunciation data and/or the acoustic data is updated in a bootstrapping manner during transcription, thereby improving the efficiency of the transcription process.
    Type: Application
    Filed: August 1, 2012
    Publication date: February 7, 2013
    Applicant: Nexidia Inc.
    Inventors: Jacob B. Garland, Marsal Gavalda
  • Publication number: 20130030805
    Abstract: According to one embodiment, a transcription support system supports transcription work to convert voice data to text. The system includes a first storage unit configured to store therein the voice data; a playback unit configured to play back the voice data; a second storage unit configured to store therein voice indices, each of which associates a character string obtained from a voice recognition process with voice positional information, the voice positional information being indicative of a temporal position in the voice data and corresponding to the character string; a text creating unit that creates the text in response to an operation input of a user; and an estimation unit configured to estimate, based on the voice indices, already-transcribed voice positional information indicative of the position in the voice data at which the creation of the text is completed.
    Type: Application
    Filed: March 15, 2012
    Publication date: January 31, 2013
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Hirokazu Suzuki, Nobuhiro Shimogori, Tomoo Ikeda, Kouji Ueno, Osamu Nishiyama, Manabu Nagao
  • Publication number: 20130030804
    Abstract: A method is described for improving the accuracy of a transcription generated by an automatic speech recognition (ASR) engine. A personal vocabulary is maintained that includes replacement words. The replacement words in the personal vocabulary are obtained from personal data associated with a user. A transcription is received of an audio recording. The transcription is generated by an ASR engine using an ASR vocabulary and includes a transcribed word that represents a spoken word in the audio recording. Data is received that is associated with the transcribed word. A replacement word from the personal vocabulary is identified, which is used to re-score the transcription and replace the transcribed word.
    Type: Application
    Filed: July 26, 2011
    Publication date: January 31, 2013
    Inventors: George Zavaliagkos, William F. Ganong, III, Uwe H. Jost, Shreedhar Madhavapeddi, Gary B. Clayton
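    The replacement step this abstract describes (swapping an ASR-transcribed word for a close match from a user's personal vocabulary) can be sketched as follows. This is a hypothetical illustration, not the patented implementation; the function names and the string-similarity heuristic are assumptions.

    ```python
    # Sketch: re-score a transcription against a personal vocabulary,
    # replacing transcribed words with sufficiently similar personal words
    # (e.g. contact names a generic ASR vocabulary misheard).
    from difflib import SequenceMatcher

    def rescore(transcript, personal_vocabulary, threshold=0.8):
        out = []
        for word in transcript.split():
            best, best_score = word, threshold
            for candidate in personal_vocabulary:
                score = SequenceMatcher(None, word.lower(),
                                        candidate.lower()).ratio()
                if score > best_score:
                    best, best_score = candidate, score
            out.append(best)
        return " ".join(out)

    print(rescore("call jon", ["John", "Johanna"]))  # → "call John"
    ```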
  • Publication number: 20130030807
    Abstract: The wireless voice recognition system for data retrieval comprises a server, a database and an input/output device, operably connected to the server. When the user speaks, the voice transmission is converted into a data stream using a specialized user interface. The input/output device and the server exchange the data stream. The server uses a programming interface having an engine to match and compare the stream of audible data to a data element of selected searchable information. A data element of recognized information is generated and transferred to the input/output device for user verification.
    Type: Application
    Filed: September 28, 2012
    Publication date: January 31, 2013
    Inventors: Stephen S. Burns, Mickey W. Kowitz, Michael F. Bell
  • Publication number: 20130030806
    Abstract: In an embodiment, a transcription support system includes: a first storage, a playback unit, a second storage, a text creating unit, an estimating unit, and a setting unit. The first storage stores the voice data therein; the playback unit plays back the voice data; and the second storage stores voice indices, each of which associates a character string obtained from a voice recognition process with voice positional information, for which the voice positional information is indicative of a temporal position in the voice data and corresponds to the character string. The text creating unit creates text; the estimating unit estimates already-transcribed voice positional information based on the voice indices; and the setting unit sets a playback starting position that indicates a position at which playback is started in the voice data based on the already-transcribed voice positional information.
    Type: Application
    Filed: March 15, 2012
    Publication date: January 31, 2013
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kouji Ueno, Nobuhiro Shimogori, Tomoo Ikeda, Osamu Nishiyama, Hirokazu Suzuki, Manabu Nagao
  • Publication number: 20130024195
    Abstract: A method for facilitating the updating of a language model includes receiving, at a client device, via a microphone, an audio message corresponding to speech of a user; communicating the audio message to a first remote server; receiving, at the client device, a result, transcribed at the first remote server using an automatic speech recognition system (“ASR”), from the audio message; receiving, at the client device from the user, an affirmation of the result; storing, at the client device, the result in association with an identifier corresponding to the audio message; and communicating, to a second remote server, the stored result together with the identifier.
    Type: Application
    Filed: September 15, 2012
    Publication date: January 24, 2013
    Inventors: Marc White, Igor Roditis Jablokov, Victor Roditis Jablokov
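    The client-side flow in this abstract (transcribe remotely, let the user affirm, store the affirmed result with an identifier, forward it to a second server) can be sketched as below. All names are assumptions and the two servers are stubbed; this is an illustration of the protocol, not the patented system.

    ```python
    # Sketch: handle one utterance end-to-end on the client.
    import uuid

    def handle_utterance(audio, asr_server, lm_server, confirm):
        message_id = str(uuid.uuid4())         # identifier for the audio message
        result = asr_server.transcribe(audio)  # first remote server (ASR)
        if confirm(result):                    # user affirms the transcription
            stored = {"id": message_id, "text": result}
            lm_server.submit(stored)           # second server updates the LM
            return stored
        return None                            # unaffirmed results are dropped
    ```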
  • Publication number: 20130018655
    Abstract: A method of providing speech transcription performance indication includes receiving, at a user device, data representing text transcribed from an audio stream by an ASR system, and data representing a metric associated with the audio stream; displaying, via the user device, said text; and via the user device, providing, in user-perceptible form, an indicator of said metric. Another method includes displaying, by a user device, text transcribed from an audio stream by an ASR system; and via the user device, providing, in user-perceptible form, an indicator of a level of background noise of the audio stream. Another method includes receiving data representing an audio stream; converting said data representing an audio stream to text via an ASR system; determining a metric associated with the audio stream; transmitting data representing said text to a user device; and transmitting data representing said metric to the user device.
    Type: Application
    Filed: September 15, 2012
    Publication date: January 17, 2013
    Inventors: James Richard Terrell, II, Marc White, Igor Roditis Jablokov
  • Publication number: 20130018656
    Abstract: A method for facilitating mobile phone messaging, such as text messaging and instant messaging, includes receiving audio data communicated from the mobile communication device, the audio data representing an utterance that is intended to be at least a portion of the text of the message that is to be sent from the mobile phone to a recipient; transcribing the utterance to text based on the received audio data to generate a transcription; and applying a filter to the transcribed text to generate a filtered transcription, the text of which is intended to mimic language patterns of mobile device messaging that is performed manually by users. The method may also be applied to the audio data of a voicemail, with the filtered, transcribed text being communicated to a mobile phone as, for example, an SMS text message.
    Type: Application
    Filed: September 15, 2012
    Publication date: January 17, 2013
    Inventors: Marc White, Cliff Strohofer
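    The filtering step in this abstract (making transcribed text mimic manual mobile-messaging language) can be sketched with a simple substitution table. The table and function names here are assumptions for illustration, not the patent's actual filter.

    ```python
    # Sketch: map formal transcribed text onto common texting shorthand.
    import re

    SMS_FILTER = {
        "talk to you later": "ttyl",
        "be right back": "brb",
        "see you": "cu",
        "you": "u",
        "are": "r",
    }

    def to_sms_style(text):
        # Apply longer phrases first so multi-word shorthand wins.
        for phrase in sorted(SMS_FILTER, key=len, reverse=True):
            text = re.sub(r"\b" + re.escape(phrase) + r"\b",
                          SMS_FILTER[phrase], text, flags=re.IGNORECASE)
        return text

    print(to_sms_style("talk to you later, are you home"))  # → "ttyl, r u home"
    ```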
  • Publication number: 20130013307
    Abstract: Differential dynamic content delivery including providing a session document for a presentation, wherein the session document includes a session grammar and a session structured document; selecting from the session structured document a classified structural element in dependence upon user classifications of a user participant in the presentation; presenting the selected structural element to the user; streaming presentation speech to the user including individual speech from at least one user participating in the presentation; converting the presentation speech to text; detecting whether the presentation speech contains simultaneous individual speech from two or more users; and displaying the text if the presentation speech contains simultaneous individual speech from two or more users.
    Type: Application
    Filed: September 14, 2012
    Publication date: January 10, 2013
    Applicant: Nuance Communications, Inc.
    Inventors: William Kress Bodin, Michael John Burkhart, Daniel G. Eisenhauer, Daniel Mark Schumacher, Thomas J. Watson
  • Publication number: 20130013305
    Abstract: Various embodiments of the present invention include concept-service components of content-search-service systems which employ ontologies and vocabularies prepared for particular categories of content at particular times in order to score transcripts prepared from content items to enable a search-service component of a content-search-service system to assign estimates of the relatedness of portions of a content item to search criteria in order to render search results to clients of the content-search-service system. The concept-service component processes a search request to generate lists of related terms, and then employs the lists of related terms to process transcripts in order to score transcripts based on information contained in the ontologies.
    Type: Application
    Filed: June 15, 2012
    Publication date: January 10, 2013
    Applicant: Limelight Networks, Inc.
    Inventors: Jonathan Thompson, Vijay Chemburkar, David Bargeron, Soam Acharya
  • Publication number: 20130006626
    Abstract: A voice-based telecommunications login system which includes a login process controller; a speech recognition module; a speaker verification module; a speech synthesis module; and a user database. Responsive to a user-provided first verbal answer to a first verbal question, the first verbal answer is converted to text and compared with data previously stored in the user database. The speech synthesis module provides a second question to the user, and responsive to a user-provided second verbal answer to the second question, the speaker verification module compares the second verbal answer with a voice print of the user previously stored in the user database and validates that the second verbal answer matches a voice print of the user previously stored in the user database. Also disclosed is a method of logging in to the telecommunications system and a computer program product for logging in to the telecommunications system.
    Type: Application
    Filed: June 29, 2011
    Publication date: January 3, 2013
    Applicant: International Business Machines Corporation
    Inventors: Chandrasekara Aiyer, Brent W. Bennet, Elizabeth J. Carey, Chuanfeng Li, Faisal Mansoor, Duncan E. Russell, Aditi Sharma
  • Publication number: 20130006625
    Abstract: A system, method, and computer program product for automatically analyzing multimedia data audio content are disclosed. Embodiments receive multimedia data, detect portions having specified audio features, and output a corresponding subset of the multimedia data and generated metadata. Audio content features including voices, non-voice sounds, and closed captioning, from downloaded or streaming movies or video clips are identified as a human probably would do, but in essentially real time. Particular speakers and the most meaningful content sounds and words and corresponding time-stamps are recognized via database comparison, and may be presented in order of match probability. Embodiments responsively pre-fetch related data, recognize locations, and provide related advertisements. The content features may be also sent to search engines so that further related content may be identified. User feedback and verification may improve the embodiments over time.
    Type: Application
    Filed: June 28, 2011
    Publication date: January 3, 2013
    Applicant: Sony Corporation
    Inventors: Priyan Gunatilake, Djung Nguyen, Abhishek Patil, Dipendu Saha
  • Publication number: 20130006628
    Abstract: A transcript of a group interaction is generated from audio source data representing the group interaction. The transcript includes a sequence of lines of text, each line corresponding to an audible utterance in the audio source data. A conversation path is generated from the transcript by labeling each transcript line with an identifier identifying the speaker of the corresponding utterance in the audio source data. A representation of the group interaction is generated by associating the conversation path with a set of voice profiles, each voice profile corresponding to an identified speaker in the conversation path.
    Type: Application
    Filed: September 13, 2012
    Publication date: January 3, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Anand Krishnaswamy, Rajeev Palanki
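    The representation this abstract describes (a transcript whose lines are labeled with speaker identifiers, associated with a set of voice profiles) can be sketched as a small data structure. The class and field names are assumptions for illustration only.

    ```python
    # Sketch: a group interaction as a speaker-labeled conversation path
    # plus a mapping from speaker identifiers to voice profiles.
    from dataclasses import dataclass, field

    @dataclass
    class Utterance:
        speaker_id: str   # label assigned from the conversation path
        text: str         # one transcript line (one audible utterance)

    @dataclass
    class GroupInteraction:
        conversation_path: list           # ordered Utterance objects
        voice_profiles: dict = field(default_factory=dict)  # id -> profile

        def lines_by(self, speaker_id):
            return [u.text for u in self.conversation_path
                    if u.speaker_id == speaker_id]

    meeting = GroupInteraction(
        conversation_path=[
            Utterance("spk1", "Let's review the agenda."),
            Utterance("spk2", "I have one addition."),
            Utterance("spk1", "Go ahead."),
        ],
        voice_profiles={"spk1": {"name": "Chair"}, "spk2": {"name": "Guest"}},
    )
    print(meeting.lines_by("spk1"))
    ```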
  • Publication number: 20130006627
    Abstract: A method of communicating between a sender and a recipient via a personalized message is disclosed comprising: (a) identifying text, via the user interface of a communication device, of a desired lyric phrase from within a pre-existing audio recording; (b) extracting audio substantially associated with the desired lyric phrase from the pre-existing recording into a desired audio clip; (c) inputting personalized text via the user interface; (d) creating the personalized message with the sender identification, the personalized text and access to the desired audio clip; (e) sending an electronic message to the electronic address of the recipient, wherein the electronic message may be an SMS/EMS/MMS message, instant message or email message including a link to the personalized message or an EMS/MMS or email message including the personalized message. An associated method of earning money from the communication along with associated systems are also disclosed.
    Type: Application
    Filed: January 23, 2012
    Publication date: January 3, 2013
    Applicant: Rednote LLC
    Inventors: Scott Guthery, Richard van den Bosch
  • Publication number: 20120330660
    Abstract: A method and system for determining and communicating biometrics of a recorded speaker in a voice transcription process. An interactive voice response system receives a request from a user for a transcription of a voice file. A profile associated with the requesting user is obtained, wherein the profile comprises biometric parameters and preferences defined by the user. The requested voice file is analyzed for biometric elements according to the parameters specified in the user's profile. Responsive to detecting biometric elements in the voice file that conform to the parameters specified in the user's profile, a transcription output of the voice file is modified according to the preferences specified in the user's profile for the detected biometric elements to form a modified transcription output file. The modified transcription output file may then be provided to the requesting user.
    Type: Application
    Filed: September 5, 2012
    Publication date: December 27, 2012
    Applicant: International Business Machines Corporation
    Inventor: Peeyush Jaiswal
  • Publication number: 20120330661
    Abstract: An electronic device may capture a voice command from a user. The electronic device may store contextual information about the state of the electronic device when the voice command is received. The electronic device may transmit the voice command and the contextual information to computing equipment such as a desktop computer or a remote server. The computing equipment may perform a speech recognition operation on the voice command and may process the contextual information. The computing equipment may respond to the voice command. The computing equipment may also transmit information to the electronic device that allows the electronic device to respond to the voice command.
    Type: Application
    Filed: September 5, 2012
    Publication date: December 27, 2012
    Inventor: Aram M. Lindahl
  • Publication number: 20120330658
    Abstract: Systems and methods to process and/or present information relating to voice messages for a user that are received from other persons. In one embodiment, a method implemented in a data processing system includes: receiving first data associated with prior communications or activities for a first user on a mobile device; receiving a voice message for the first user; transcribing the voice message using the first data to provide a transcribed message; and sending the transcribed message to the mobile device for display to the user.
    Type: Application
    Filed: June 20, 2012
    Publication date: December 27, 2012
    Applicant: XOBNI, INC.
    Inventor: Jeffrey Bonforte
  • Publication number: 20120323572
    Abstract: An automatic speech recognizer is used to produce a structured document representing the contents of human speech. A best practice is applied to the structured document to produce a conclusion, such as a conclusion that required information is missing from the structured document. Content is inserted into the structured document based on the conclusion, thereby producing a modified document. The inserted content may be obtained by prompting a human user for the content and receiving input representing the content from the human user.
    Type: Application
    Filed: June 19, 2012
    Publication date: December 20, 2012
    Inventors: Detlef Koll, Juergen Fritsch, Michael Finke
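    The flow in this last abstract (apply a best practice to an ASR-produced structured document, conclude that required information is missing, prompt the user, and insert the answer) can be sketched as follows. The required-section rule and all names are assumptions; the abstract's example domain is unspecified, so a clinical-note-style rule is used purely for illustration.

    ```python
    # Sketch: check a structured document against a "best practice" rule
    # and fill gaps with content obtained by prompting a human user.

    REQUIRED_SECTIONS = ["allergies", "medications", "assessment"]

    def find_missing(document):
        """Return the required sections absent or empty in the document."""
        return [s for s in REQUIRED_SECTIONS if not document.get(s)]

    def complete_document(document, prompt=input):
        # Prompt the user for each missing section; insert the response.
        for section in find_missing(document):
            document[section] = prompt(f"Please dictate the {section} section: ")
        return document

    doc = {"assessment": "stable", "medications": ""}
    print(find_missing(doc))  # → ['allergies', 'medications']
    ```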