Speech To Text Systems (epo) Patents (Class 704/E15.043)
  • Publication number: 20120209606
    Abstract: Obtaining information from audio interactions associated with an organization. The information may comprise entities, relations or events. The method comprises: receiving a corpus comprising audio interactions; performing audio analysis on audio interactions of the corpus to obtain text documents; performing linguistic analysis of the text documents; matching the text documents with one or more rules to obtain one or more matches; and unifying or filtering the matches.
    Type: Application
    Filed: February 14, 2011
    Publication date: August 16, 2012
    Applicant: Nice Systems Ltd.
    Inventors: Maya Gorodetsky, Ezra Daya, Oren Pereg
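A minimal Python sketch of the rule-matching and unification steps this abstract describes; the rule labels and regex patterns below are invented stand-ins for the patent's rules:

```python
import re

# Hypothetical rule set: each rule pairs an event label with a regex pattern.
RULES = [
    ("cancellation_event", re.compile(r"\bcancel(?:led|ling)? my (\w+)", re.I)),
    ("complaint_event", re.compile(r"\bnot happy with (\w+)", re.I)),
]

def match_rules(documents):
    """Match each transcribed document against the rule set."""
    matches = []
    for doc_id, text in enumerate(documents):
        for label, pattern in RULES:
            for m in pattern.finditer(text):
                matches.append({"doc": doc_id, "label": label, "span": m.group(0)})
    return matches

def unify(matches):
    """Drop duplicate matches (same document, label, and matched span)."""
    seen, unified = set(), []
    for m in matches:
        key = (m["doc"], m["label"], m["span"].lower())
        if key not in seen:
            seen.add(key)
            unified.append(m)
    return unified
```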
  • Publication number: 20120209607
    Abstract: A method and communication device are disclosed that include displaying a video on a display, converting voice audio data to textual data by applying voice-to-text conversion, and displaying the textual data as scrolling text along with the video on the display, either above, below, or across the video. The method may further include receiving a voice call indication from a network, providing the voice call indication to a user interface, where the voice call indication corresponds to an incoming voice call; and receiving a user input for receiving the voice call and displaying the voice call as scrolling text. In another embodiment, a method includes displaying application-related data on a display; converting voice audio data to textual data by applying voice-to-text conversion; converting the textual data to a video format; and displaying the textual data as scrolling text over the application-related data on the display.
    Type: Application
    Filed: April 13, 2012
    Publication date: August 16, 2012
    Applicant: QUALCOMM Incorporated
    Inventors: Dinesh Kumar Garg, Manish Poddar
  • Publication number: 20120203551
    Abstract: Embodiments of the present invention provide a method, system and computer program product for automated follow-up for e-meetings. In an embodiment of the invention, a method for automated follow-up for e-meetings is provided. The method includes monitoring content provided to an e-meeting managed by an e-meeting server executing in memory of a host computer. The method also includes applying a rule in a rules base to the monitored content. Finally, the method includes triggering generation of a follow up item in response to applying the rule to the monitored content.
    Type: Application
    Filed: February 4, 2011
    Publication date: August 9, 2012
    Applicant: International Business Machines Corporation
    Inventors: Geetika T. Lakshmanan, Martin Oberhofer
  • Publication number: 20120201362
    Abstract: Methods, systems, and computer program products are provided for generating and posting messages to social networks based on voice input. One example method includes receiving an audio signal that corresponds to spoken content, generating one or more representations of the spoken content, and causing the one or more representations of the spoken content to be posted to a social network.
    Type: Application
    Filed: February 3, 2012
    Publication date: August 9, 2012
    Applicant: GOOGLE INC.
    Inventors: Steve Crossan, Ujjwal Singh
  • Publication number: 20120203552
    Abstract: A device may receive over a network a digitized speech signal from a remote control that accepts speech. In addition, the device may convert the digitized speech signal into text, use the text to obtain command information applicable to a set-top box, and send the command information to the set-top box to control presentation of multimedia content on a television in accordance with the command information.
    Type: Application
    Filed: April 16, 2012
    Publication date: August 9, 2012
    Applicant: VERIZON DATA SERVICES INDIA PVT. LTD.
    Inventors: Ashutosh K. Sureka, Sathish K. Subramanian, Sidhartha Basu, Indivar Verma
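The text-to-command step can be sketched as a lookup from normalized recognized speech to set-top box command information; the command names and table below are invented for illustration:

```python
# Hypothetical mapping from recognized phrases to set-top box commands.
COMMAND_TABLE = {
    "volume up": {"action": "VOLUME", "delta": 1},
    "volume down": {"action": "VOLUME", "delta": -1},
    "next channel": {"action": "CHANNEL", "delta": 1},
}

def text_to_command(recognized_text):
    """Normalize the recognized text and look up the command information
    to send to the set-top box."""
    key = " ".join(recognized_text.lower().split())
    command = COMMAND_TABLE.get(key)
    if command is None:
        raise ValueError(f"no command for: {recognized_text!r}")
    return command
```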
  • Publication number: 20120197523
    Abstract: A mobile device communicates with an in-vehicle system to provide a network-based calendar and related features for viewing and/or editing within a vehicle. The mobile device executes a specialized application that retrieves calendar data from one or more calendar sources in a native calendar format, and converts the calendar data to a customized vehicle format designed specifically for convenient transfer and viewing within the vehicle. The user may record spoken voice notes that can be processed to automatically create new calendar entries. An alert feature schedules visual and/or audio alerts to notify the user in advance of scheduled calendar events. When a scheduled calendar event time is reached, the in-vehicle system may automatically place a call to an event invitee or generate a route to an event destination.
    Type: Application
    Filed: January 27, 2011
    Publication date: August 2, 2012
    Applicant: HONDA MOTOR CO., LTD.
    Inventor: David M. Kirsch
  • Publication number: 20120197640
    Abstract: A system and method are provided for combining active and unsupervised learning for automatic speech recognition. This process enables a reduction in the amount of human supervision required for training acoustic and language models and an increase in performance given the transcribed and un-transcribed data.
    Type: Application
    Filed: April 9, 2012
    Publication date: August 2, 2012
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Zeynep Hakkani-Tür, Giuseppe Riccardi
  • Publication number: 20120191451
    Abstract: In one embodiment, the invention provides a method, comprising providing a first communications channel to transmit digital content to a notes-access application for storage against a particular user, the first communications channel being selected from the group consisting of an SMS channel, an MMS channel, a fax channel, an e-mail channel, and an IM channel; responsive to receiving digital content from said user via the first communications channel, storing said digital content in the database associated with said notes-access application; and providing a second communications channel to the notes-access application whereby the digital content stored by the notes-access application against said user is provided to said user, the second communications channel being selected from the group consisting of an SMS channel, an MMS channel, a fax channel, an e-mail channel, and an IM channel.
    Type: Application
    Filed: March 29, 2012
    Publication date: July 26, 2012
    Inventor: Yue Fang
  • Publication number: 20120191452
    Abstract: Disclosed is a system for generating a representation of a group interaction, the system comprising: a transcription module adapted to generate a transcript of the group interaction from audio source data representing the group interaction, the transcript comprising a sequence of lines of text, each line corresponding to an audible utterance in the audio source data; and a labeling module adapted to generate a conversation path from the transcript by labeling each transcript line with an identifier identifying the speaker of the corresponding utterance in the audio source data; and generate the representation of the group interaction by associating the conversation path with a plurality of voice profiles, each voice profile corresponding to an identified speaker in the conversation path.
    Type: Application
    Filed: April 4, 2012
    Publication date: July 26, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Anand Krishnaswamy, Rajeev Palanki
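A small sketch of the labeling step described above, assuming speaker identification has already produced one identifier per transcript line; the profile values are placeholders:

```python
def build_conversation_path(lines, speaker_ids):
    """Pair each transcript line with the identifier of its speaker."""
    if len(lines) != len(speaker_ids):
        raise ValueError("one speaker identifier is required per line")
    return [{"speaker": s, "text": t} for s, t in zip(speaker_ids, lines)]

def group_representation(path, profiles):
    """Associate the conversation path with the voice profiles of the
    speakers who actually appear in it."""
    present = {entry["speaker"] for entry in path}
    return {"path": path,
            "profiles": {s: profiles[s] for s in present if s in profiles}}
```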
  • Publication number: 20120185240
    Abstract: An embodiment provides a system and method for generating and sending a simplified message using speech recognition. The system provides speech recognition software that may be utilized for receiving audio, converting audio to text derived from audio, comparing text derived from audio to match fields to find matches, replacing matched text with contents of replacement fields associated with the match fields, generating an output message incorporating the replacement text into the text derived from audio, transmitting the output message to a messaging system, and redistributing the output message to recipients.
    Type: Application
    Filed: January 13, 2012
    Publication date: July 19, 2012
    Inventors: Michael D. Goller, Stuart E. Goller
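The match-field/replacement-field substitution above can be sketched roughly as follows; the field contents are invented examples:

```python
# Hypothetical match fields mapped to their replacement-field contents.
FIELDS = {
    "eta": "estimated time of arrival",
    "hq": "headquarters",
}

def expand_message(text_from_audio):
    """Replace any word that matches a match field with the contents of
    its replacement field, then return the output message."""
    words = text_from_audio.split()
    replaced = [FIELDS.get(w.lower(), w) for w in words]
    return " ".join(replaced)
```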
  • Publication number: 20120185250
    Abstract: A distributed dictation/transcription system is provided. The system provides a client station, dictation manager, and dictation server networked such that the dictation manager selects a dictation server to transcribe audio from the client station. The dictation manager selects one of a plurality of dictation servers based on conventional load balancing and on a determination of whether the user profile is already uploaded to a dictation server. While a dictation server is being selected or a profile uploaded, the client may begin dictating; the audio is stored in a buffer of the dictation manager until a dictation server is selected or available. The user may receive, in real time or near real time, a display of the textual data, which may be corrected by the user to update the user profile.
    Type: Application
    Filed: March 16, 2012
    Publication date: July 19, 2012
    Applicant: NVOQ INCORPORATED
    Inventors: Richard Beach, Christopher Butler, Jon Ford, Brian Marquette, Christopher Omland
  • Publication number: 20120185249
    Abstract: A method and a system of history tracking corrections in a speech based document. The speech based document comprises one or more sections of text recognized or transcribed from sections of speech, wherein the sections of speech are dictated by a user and processed by a speech recognizer in a speech recognition system into corresponding sections of text of the speech based document. The method comprises associating at least one speech attribute to each section of text in the speech based document, said speech attribute comprising information related to said section of text, respectively; presenting said speech based document on a presenting unit; detecting an action being performed within any of said sections of text; and updating information of said speech attributes related to the kind of action detected on one of said sections of text for updating said speech based document.
    Type: Application
    Filed: February 3, 2012
    Publication date: July 19, 2012
    Applicant: Nuance Communications Austria GMBH
    Inventors: Gerhard Grobauer, Miklos Papai
  • Publication number: 20120179465
    Abstract: Audio content is converted to text using speech recognition software. The text is then associated with a distinct voice or a generic placeholder label if no distinction can be made. From the text and voice information, a word cloud is generated based on key words and key speakers. A visualization of the cloud displays as it is being created. Words grow in size in relation to their dominance. When it is determined that the predominant words or speakers have changed, the word cloud is complete. That word cloud continues to be displayed statically and a new word cloud display begins based upon a new set of predominant words or a new predominant speaker or set of speakers. This process may continue until the meeting is concluded. At the end of the meeting, the completed visualization may be saved to a storage device, sent to selected individuals, removed, or any combination of the preceding.
    Type: Application
    Filed: January 10, 2011
    Publication date: July 12, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Susan Marie Cox, Janani Janakiraman, Fang Lu, Loulwa Salem
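A rough sketch of the dominance computation behind such a word cloud, assuming display size grows linearly with occurrence count; the stop-word list and sizing constants are invented:

```python
from collections import Counter

STOP_WORDS = {"the", "a", "and", "to", "of", "is"}

def word_cloud(transcript, base_size=10, step=4):
    """Count key words in a transcript and assign each a display size
    that grows with its dominance (occurrence count)."""
    words = [w.lower().strip(".,") for w in transcript.split()]
    counts = Counter(w for w in words if w and w not in STOP_WORDS)
    return {word: base_size + step * (count - 1)
            for word, count in counts.items()}
```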
  • Publication number: 20120179466
    Abstract: A speech to text converting device includes a display, a voice receiving module, a voice recognition module, an identity recognition module, and a control module. The voice receiving module receives a voice signal. The voice recognition module converts the voice signal to voice data and produces text data corresponding to the voice data. The identity recognition module receives the voice signal and establishes identity data corresponding to the voice signal. The control module displays the text data and the identity data together on the display.
    Type: Application
    Filed: August 8, 2011
    Publication date: July 12, 2012
    Applicant: HON HAI PRECISION INDUSTRY CO., LTD.
    Inventors: YUAN-FU HUANG, TIEN-PING LIU, CHIEN-HUANG CHANG
  • Publication number: 20120172012
    Abstract: A method for controlling a mobile communications device while located in a mobile vehicle involves pairing the mobile communications device with a telematics unit via short range wireless communication. The method further involves receiving an incoming text message at the mobile device while the mobile device is paired with the telematics unit. Upon receiving the text message, a text messaging management strategy is implemented via the telematics unit and/or the mobile device, where the text messaging management strategy is executable via an application that is resident on the mobile device.
    Type: Application
    Filed: January 4, 2011
    Publication date: July 5, 2012
    Applicant: GENERAL MOTORS LLC
    Inventors: Anthony J. Sumcad, Shawn F. Granda, Lawrence D. Cepuran, Steven Swanson
  • Publication number: 20120173236
    Abstract: A speech to text converting device includes a display, a voice receiving module, a voice recognition module, an input module, and a control module. The voice receiving module receives speech within a certain period of time. The voice recognition module converts the speech to voice data. The control module establishes text data corresponding to the voice data and displays the text data, any inputted words, and the relevant time period.
    Type: Application
    Filed: August 8, 2011
    Publication date: July 5, 2012
    Applicant: HON HAI PRECISION INDUSTRY CO., LTD.
    Inventors: YUAN-FU HUANG, TIEN-PING LIU, CHIEN-HUANG CHANG
  • Publication number: 20120173235
    Abstract: One embodiment described herein may take the form of a system or method for generating subtitles (also known as “closed captioning”) of an audio component of a multimedia presentation automatically for one or more stored presentations. In general, the system or method may access one or more multimedia programs stored on a storage medium, either as an entire program or in portions. Upon retrieval, the system or method may perform an analysis of the audio component of the program and generate a subtitle text file that corresponds to the audio component. In one embodiment, the system or method may perform a speech recognition analysis on the audio component to generate the subtitle text file.
    Type: Application
    Filed: December 31, 2010
    Publication date: July 5, 2012
    Applicant: Eldon Technology Limited
    Inventor: Dale Llewelyn Mountain
  • Publication number: 20120166192
    Abstract: Systems, methods, and computer readable media providing a speech input interface. The interface can receive speech input and non-speech input from a user through a user interface. The speech input can be converted to text data and the text data can be combined with the non-speech input for presentation to a user.
    Type: Application
    Filed: November 18, 2011
    Publication date: June 28, 2012
    Applicant: APPLE INC.
    Inventor: Kazuhisa Yanagihara
  • Publication number: 20120166191
    Abstract: A method and system for providing text-to-audio conversion of an electronic book displayed on a viewer. A user selects a portion of displayed text and converts it into audio. The text-to-audio conversion may be performed via a header file and pre-recorded audio for each electronic book, via text-to-speech conversion, or other available means. The user may select manual or automatic text-to-audio conversion. The automatic text-to-audio conversion may be performed by automatically turning the pages of the electronic book or by the user manually turning the pages. The user may also select to convert the entire electronic book, or portions of it, into audio. The user may also select an option to receive an audio definition of a particular word in the electronic book. The present invention allows a user to control the system by selecting options from a screen or by entering voice commands.
    Type: Application
    Filed: November 17, 2011
    Publication date: June 28, 2012
    Applicant: ADREA LLC
    Inventors: John S. Hendricks, Michael L. Asmussen
  • Publication number: 20120166193
    Abstract: A visual toolkit for prioritizing speech transcription is provided. The toolkit can include a logger (102) for capturing information from a speech recognition system, a processor (104) for determining an accuracy rating of the information, and a visual display (106) for categorizing the information and prioritizing a transcription of the information based on the accuracy rating. The prioritizing identifies spoken utterances having a transcription priority in view of the recognized result. The visual display can include a transcription category (156) having a modifiable textbox entry with a text entry initially corresponding to a text of the recognized result, and an accept button (157) for validating a transcription of the recognized result. The categories can be automatically ranked by the accuracy rating in an ordered priority for increasing an efficiency of transcription.
    Type: Application
    Filed: January 19, 2012
    Publication date: June 28, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: Jeffrey S. Kobal, Girish Dhanakshirur
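The accuracy-based prioritization might look like this minimal sketch, with an assumed acceptance threshold; each recognized utterance is a (text, accuracy) pair:

```python
def prioritize_utterances(results, threshold=0.85):
    """Rank recognized utterances so the lowest-accuracy ones are
    transcribed first; results at or above the threshold are accepted
    as-is. Each result is a (text, accuracy) pair."""
    needs_review = [r for r in results if r[1] < threshold]
    accepted = [r for r in results if r[1] >= threshold]
    # Lowest accuracy first: those transcriptions have highest priority.
    needs_review.sort(key=lambda r: r[1])
    return needs_review, accepted
```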
  • Publication number: 20120158405
    Abstract: A speech recognition device (1) processes speech data (SD) of a dictation and establishes recognized text information (ETI) and link information (LI) of the dictation. In a synchronous playback mode of the speech recognition device (1), during acoustic playback of the dictation a correction device (10) synchronously marks the word of the recognized text information (ETI) that relates, via the link information (LI), to the speech data (SD) just played back; the currently marked word indicates the position of an audio cursor (AC). When a user of the speech recognition device (1) recognizes an incorrect word, he positions a text cursor (TC) at the incorrect word and corrects it. Cursor synchronization means (15) make it possible to synchronize the text cursor (TC) with the audio cursor (AC), or the audio cursor (AC) with the text cursor (TC), so that positioning of the respective cursor (AC, TC) is simplified considerably.
    Type: Application
    Filed: February 13, 2012
    Publication date: June 21, 2012
    Applicant: Nuance Communications Austria GmbH
    Inventor: Wolfgang Gschwendtner
  • Publication number: 20120150537
    Abstract: Confidential information included in image and voice data is filtered in an apparatus that includes an extraction unit for extracting a character string from an image frame, and a conversion unit for converting audio data to a character string. The apparatus also includes a determination unit for determining, in response to contents of a database, whether at least one of the image frame and the audio data includes confidential information. The apparatus also includes a masking unit for concealing contents of the image frame by masking the image frame in response to determining that the image frame includes confidential information, and for making the audio data inaudible by masking the audio data in response to determining that the audio data includes confidential information. The apparatus further includes a playback unit for playing back the image frame and the audio data.
    Type: Application
    Filed: December 5, 2011
    Publication date: June 14, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Seiji Abe, Mitsuru Shioya, Shigeki Takeuchi, Daisuke Tomoda
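A toy sketch of the confidential-content determination and masking for the audio-derived text; the term list and account-number pattern below are invented stand-ins for the database:

```python
import re

# Illustrative confidential-content database: literal terms plus a
# pattern for digit runs that look like account identifiers.
CONFIDENTIAL_TERMS = {"project titan"}
ACCOUNT_PATTERN = re.compile(r"\b\d{6,}\b")

def is_confidential(text):
    """Decide, against the database, whether the text is confidential."""
    lowered = text.lower()
    return any(t in lowered for t in CONFIDENTIAL_TERMS) or \
        bool(ACCOUNT_PATTERN.search(text))

def mask(text):
    """Return a masked form if the text contains confidential
    information, or the text unchanged otherwise."""
    return "[REDACTED]" if is_confidential(text) else text
```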
  • Publication number: 20120143605
    Abstract: In one implementation, a collaboration server is a conference bridge or other network device configured to host an audio and/or video conference among a plurality of conference participants. The collaboration server sends conference data and a media stream including speech to a speech recognition engine. The conference data may include the conference roster or text extracted from documents or other files shared in the conference. The speech recognition engine updates a default language model according to the conference data and transcribes the speech in the media stream based on the updated language model. In one example, the performance of the default language model, the updated language model, or both may be tested using a confidence interval or submitted for approval by a conference participant.
    Type: Application
    Filed: December 1, 2010
    Publication date: June 7, 2012
    Applicant: Cisco Technology, Inc.
    Inventors: Tyrone Terry Thorsen, Alan Darryl Gatzke
  • Publication number: 20120143607
    Abstract: The present invention provides a speech recognition system combined with one or more alternate input modalities to ensure efficient and accurate text input. The speech recognition system achieves less than perfect accuracy due to limited processing power, environmental noise, and/or natural variations in speaking style. The alternate input modalities use disambiguation or recognition engines to compensate for reduced keyboards, sloppy input, and/or natural variations in writing style. The ambiguity remaining in the speech recognition process is mostly orthogonal to the ambiguity inherent in the alternate input modality, such that the combination of the two modalities resolves the recognition errors efficiently and accurately. The invention is especially well suited for mobile devices with limited space for keyboards or touch-screen input.
    Type: Application
    Filed: December 6, 2011
    Publication date: June 7, 2012
    Inventors: Michael LONGÉ, Richard Eyraud, Keith C. Hullfish
  • Publication number: 20120130714
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media relating to speaker verification. In one aspect, a system receives a first user identity from a second user, and, based on the identity, accesses voice characteristics. The system randomly generates a challenge sentence according to a rule and/or grammar, based on the voice characteristics, and prompts the second user to speak the challenge sentence. The system verifies that the second user is the first user if the spoken challenge sentence matches the voice characteristics. In an enrollment aspect, the system constructs an enrollment phrase that covers a minimum threshold of unique speech sounds based on speaker-distinctive phonemes, phoneme clusters, and prosody. The user then utters the enrollment phrase, and the system extracts voice characteristics for the user from the uttered phrase.
    Type: Application
    Filed: November 24, 2010
    Publication date: May 24, 2012
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Ilija Zeljkovic, Taniya Mishra, Amanda Stent, Ann K. Syrdal, Jay Wilpon
  • Publication number: 20120123778
    Abstract: A method and apparatus for providing security control of short messaging service (SMS) messages and multimedia messaging service (MMS) messages in a unified messaging (UM) system are disclosed. An SMS or MMS message directed to a recipient mailbox in a UM system is received. It is determined that the recipient mailbox is a secondary mailbox associated with a primary mailbox in the UM system. The message is audited according to an audit policy associated with the recipient mailbox.
    Type: Application
    Filed: November 11, 2010
    Publication date: May 17, 2012
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Mehrad Yasrebi, James Jackson, Cheryl Lockett
  • Publication number: 20120120446
    Abstract: A document generation method and system using speech data, and an image forming apparatus including the document generation system. The method includes setting document editing information including at least one of document form information and sentence pattern information for editing a document when the speech data is generated as the document; converting the speech data into text; and generating the text as the document based on the document editing information.
    Type: Application
    Filed: November 14, 2011
    Publication date: May 17, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Hyun-sub KIL, Mok-hwa Lim
  • Publication number: 20120123779
    Abstract: Devices, methods, and computer program products are provided for facilitating enhanced social interactions using a mobile device. A method for facilitating an enhanced social interaction using a mobile device includes receiving an audio input at the mobile device, determining a salient portion of the audio input, receiving relevant information associated with the salient portion, and presenting the relevant information via the mobile device.
    Type: Application
    Filed: November 15, 2010
    Publication date: May 17, 2012
    Inventors: James Pratt, Steven Belz, Marc Sullivan
  • Publication number: 20120114233
    Abstract: A system, method, and computer program product for automatically analyzing multimedia data are disclosed. Embodiments receive multimedia data, detect portions having specified features, and output a corresponding subset of the multimedia data. Content features from downloaded or streaming movies or video clips are identified as a human probably would do, but in essentially real time. Embodiments then generate an index or menu based on individual consumer preferences. Consumers can peruse the index, or produce customized trailers, or edit and tag content with metadata as desired. The tool can categorize and cluster content by feature, to assemble a library of scenes or scene clusters according to user-selected criteria.
    Type: Application
    Filed: June 28, 2011
    Publication date: May 10, 2012
    Applicant: Sony Corporation
    Inventor: Priyan Gunatilake
  • Publication number: 20120109648
    Abstract: A communication system is described. The communication system includes an automatic speech recognizer configured to receive a speech signal and to convert the speech signal into a text sequence. The communication system also includes a speech analyzer configured to receive the speech signal. The speech analyzer is configured to extract paralinguistic characteristics from the speech signal. In addition, the communication system includes a speech output device coupled with the automatic speech recognizer and the speech analyzer. The speech output device is configured to convert the text sequence into an output speech signal based on the extracted paralinguistic characteristics.
    Type: Application
    Filed: October 30, 2011
    Publication date: May 3, 2012
    Inventor: Fathy Yassa
  • Publication number: 20120109628
    Abstract: A communication system is described. The communication system includes an automatic speech recognizer configured to receive a speech signal and to convert the speech signal into a text sequence. The communication system also includes a speech analyzer configured to receive the speech signal. The speech analyzer is configured to extract paralinguistic characteristics from the speech signal. In addition, the communication system includes a voice analyzer configured to receive the speech signal. The voice analyzer is configured to generate one or more phonemes based on the speech signal. The communication system includes a speech output device coupled with the automatic speech recognizer, the speech analyzer, and the voice analyzer. The speech output device is configured to convert the text sequence into an output speech signal based on the extracted paralinguistic characteristics and said one or more phonemes.
    Type: Application
    Filed: October 30, 2011
    Publication date: May 3, 2012
    Inventor: Fathy Yassa
  • Publication number: 20120109627
    Abstract: A networked communication system is described. The communication system includes an automatic speech recognizer configured to receive a speech signal from a client over a network and to convert the speech signal into a text sequence. The communication system also includes a speech analyzer configured to receive the speech signal. The speech analyzer is configured to extract paralinguistic characteristics from the speech signal. In addition, the communication system includes a speech output device coupled with the automatic speech recognizer and the speech analyzer. The speech output device is configured to convert the text sequence into an output speech signal based on the extracted paralinguistic characteristics.
    Type: Application
    Filed: October 30, 2011
    Publication date: May 3, 2012
    Inventor: Fathy Yassa
  • Patent number: 8165879
    Abstract: A voice output device, includes: a compound word voice data storage unit that stores voice data in association with each of a plurality of compound words, each of which is formed of a plurality of words; a text display unit that displays text containing a plurality of words; a word designation unit that designates any of the words in the text displayed by the text display unit as a designated word based on a user's operation; a compound word detection unit that detects a compound word whose voice data is stored in the compound word voice data storage unit, from among the plurality of words in the text containing the designated word; and a voice output unit that outputs voice data corresponding to the compound word detected by the compound word detection unit as a voice.
    Type: Grant
    Filed: January 3, 2008
    Date of Patent: April 24, 2012
    Assignee: Casio Computer Co., Ltd.
    Inventors: Takatoshi Abe, Takuro Abe, Takashi Kojo
  • Publication number: 20120089395
    Abstract: A method of operating a communication system includes generating a transcript of at least a portion of a conversation between a plurality of users. The transcript includes a plurality of subsets of characters. The method further includes displaying the transcript on a plurality of communication devices, identifying an occurrence of at least one selected subset of characters from the plurality of subsets of characters, and querying a definition source for at least one definition for the selected subset of characters. The definition for the selected subset of characters is displayed on the plurality of communication devices.
    Type: Application
    Filed: October 7, 2010
    Publication date: April 12, 2012
    Applicant: Avaya, Inc.
    Inventors: David L. Chavez, Larry J. Hardouin
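A minimal sketch of spotting selected character subsets in the transcript and querying a definition source; the small in-memory glossary below stands in for that source:

```python
# Hypothetical definition source for selected terms.
GLOSSARY = {
    "latency": "the delay before a transfer of data begins",
    "codec": "a program that encodes or decodes a data stream",
}

def annotate_transcript(lines, selected_terms):
    """For each transcript line, look up any selected term it contains
    and return (line, definitions-found) pairs for display."""
    annotated = []
    for line in lines:
        words = {w.lower().strip(".,?") for w in line.split()}
        hits = {t: GLOSSARY[t] for t in selected_terms
                if t in words and t in GLOSSARY}
        annotated.append((line, hits))
    return annotated
```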
  • Publication number: 20120089394
    Abstract: Techniques involving visual display of information related to matching user utterances against graph patterns are described. In one or more implementations, an utterance of a user is obtained that has been indicated as corresponding to a graph pattern through linguistic analysis. The utterance is displayed in a user interface as a representation of the graph pattern.
    Type: Application
    Filed: October 6, 2010
    Publication date: April 12, 2012
    Applicant: VirtuOz SA
    Inventors: Dan Teodosiu, Elizabeth Ireland Powers, Pierre Serge Vincent LeRoy, Sebastien Jean-Marie Christian Saunier
  • Publication number: 20120084086
    Abstract: Disclosed herein are systems, methods and non-transitory computer-readable media for performing speech recognition across different applications or environments without model customization or prior knowledge of the domain of the received speech. The disclosure includes recognizing received speech with a collection of domain-specific speech recognizers, determining a speech recognition confidence for each of the speech recognition outputs, selecting speech recognition candidates based on a respective speech recognition confidence for each speech recognition output, and combining selected speech recognition candidates to generate text based on the combination.
    Type: Application
    Filed: September 30, 2010
    Publication date: April 5, 2012
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Mazin GILBERT, Srinivas Bangalore, Patrick Haffner, Robert Bell
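The combination step in the abstract above can be sketched roughly as follows. This is not the patented implementation; the recognizer outputs, confidence values, and the simple confidence-weighted word vote are all invented for illustration:

```python
def combine_recognizer_outputs(outputs, threshold=0.5):
    """outputs: list of (text, confidence) pairs, one per domain-specific
    recognizer. Keep candidates above a confidence threshold, then combine
    them with a confidence-weighted vote per word position."""
    candidates = [(text, conf) for text, conf in outputs if conf >= threshold]
    if not candidates:
        return ""
    max_len = max(len(text.split()) for text, _ in candidates)
    words = []
    for i in range(max_len):
        votes = {}
        for text, conf in candidates:
            tokens = text.split()
            if i < len(tokens):
                votes[tokens[i]] = votes.get(tokens[i], 0.0) + conf
        words.append(max(votes, key=votes.get))
    return " ".join(words)

outputs = [
    ("call mom tonight", 0.9),  # telephony-domain recognizer
    ("call mom tonight", 0.7),  # general-domain recognizer
    ("tall mom tonight", 0.4),  # mismatched domain, low confidence
]
print(combine_recognizer_outputs(outputs))
```

The low-confidence output is filtered out before the vote, so the combined text follows the two recognizers that agree.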
  • Publication number: 20120078626
    Abstract: Methods and systems for converting speech to text are disclosed. One method includes analyzing multimedia content to determine the presence of closed captioning data. The method includes, upon detecting closed captioning data, indexing the closed captioning data as associated with the multimedia content. The method also includes, upon failure to detect closed captioning data in the multimedia content, extracting audio data from multimedia content, the audio data including speech data, performing a plurality of speech to text conversions on the speech data to create a plurality of transcripts of the speech data, selecting text from one or more of the plurality of transcripts to form an amalgamated transcript, and indexing the amalgamated transcript as associated with the multimedia content.
    Type: Application
    Filed: September 27, 2010
    Publication date: March 29, 2012
    Inventors: Johney Tsai, Matthew Miller, David Strong
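The control flow described above (prefer closed captions, otherwise amalgamate several speech-to-text transcripts) can be sketched as below. The per-position majority vote is an invented stand-in for the patent's selection step, and the sample data is illustrative:

```python
from collections import Counter

def amalgamate(transcripts):
    """Pick, at each word position, the word most transcripts agree on."""
    split = [t.split() for t in transcripts]
    length = max(len(s) for s in split)
    words = []
    for i in range(length):
        counts = Counter(s[i] for s in split if i < len(s))
        words.append(counts.most_common(1)[0][0])
    return " ".join(words)

def index_for_search(media):
    """Index closed captions when present; otherwise fall back to an
    amalgamated transcript built from multiple STT conversions."""
    if media.get("captions"):
        return media["captions"]
    return amalgamate(media["transcripts"])

media = {"captions": None,
         "transcripts": ["the quick brown fox",
                         "the quick brawn fox",
                         "the quick brown box"]}
print(index_for_search(media))
```

Each transcript misrecognizes a different word, so the position-wise vote recovers the common text.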
  • Publication number: 20120078629
    Abstract: According to one embodiment, a meeting support apparatus includes a storage unit, a determination unit, and a generation unit. The storage unit is configured to store storage information for each of a set of words, the storage information indicating a word, pronunciation information on the word, and a pronunciation recognition frequency. The determination unit is configured to generate emphasis determination information including an emphasis level that represents whether a first word should be highlighted and represents a degree of highlighting determined in accordance with the pronunciation recognition frequency of a second word when the first word is highlighted, based on whether the storage information includes a second set corresponding to a first set and, when the second set is included, based on the pronunciation recognition frequency of the second word. The generation unit is configured to generate an emphasis character string based on the emphasis determination information when the first word is highlighted.
    Type: Application
    Filed: March 25, 2011
    Publication date: March 29, 2012
    Inventors: Tomoo Ikeda, Nobuhiro Shimogori, Kouji Ueno
  • Publication number: 20120078628
    Abstract: The head-mounted text display system for the hearing impaired is a speech-to-text system, in which spoken words are converted into a visual textual display and displayed to the user in passages containing a selected number of words. The system includes a head-mounted visual display, such as eyeglass-type dual liquid crystal displays or the like, and a controller. The controller includes an audio receiver, such as a microphone or the like, for receiving spoken language and converting the spoken language into electrical signals. The controller further includes a speech-to-text module for converting the electrical signals representative of the spoken language to a textual data signal representative of individual words. A transmitter associated with the controller transmits the textual data signal to a receiver associated with the head-mounted display. The textual data is then displayed to the user in passages containing a selected number of individual words.
    Type: Application
    Filed: September 28, 2010
    Publication date: March 29, 2012
    Inventor: MAHMOUD M. GHULMAN
  • Publication number: 20120065969
    Abstract: An embodiment of the invention includes methods and systems for contextual social network communications during a phone conversation. A telephone conversation between a first user and at least one second user is monitored. More specifically, a monitor identifies terms spoken by the first user and the second user during the telephone conversation. The terms spoken are translated into textual keywords by a translating module. One or more of the second user's web applications are searched by a processor for portion(s) of the second user's web applications that include at least one of the keywords. The processor also searches one or more of the first user's web applications for portion(s) of the first user's web applications that include at least one of the keywords. The portion(s) of the second user's web applications and the portion(s) of the first user's web applications are displayed to the first user during the telephone conversation.
    Type: Application
    Filed: September 13, 2010
    Publication date: March 15, 2012
    Applicant: International Business Machines Corporation
    Inventors: Lisa Seacat DeLuca, Pamela A. Nesbitt
  • Publication number: 20120065970
    Abstract: A system and method for providing a discussion, including receiving by a processor text related to a discussion; converting by the processor the text to voice; storing by the processor in a memory the converted voice; receiving by the processor voice related to the discussion; storing by the processor in the memory the received voice; receiving by the processor a request to play voice related to at least part of the discussion; and transmitting by the processor audio containing the voice identified by the request related to the at least part of the discussion.
    Type: Application
    Filed: September 15, 2010
    Publication date: March 15, 2012
    Applicant: Sequent, Inc.
    Inventors: Charanjit SINGH, Mukesh Sehgal
  • Publication number: 20120059652
    Abstract: A method for transcribing a spoken communication includes acts of receiving a spoken first communication from a first sender to a first recipient, obtaining information relating to a second communication, which is different from the first communication, from a second sender to a second recipient, using the obtained information to obtain a language model, and using the language model to transcribe the spoken first communication.
    Type: Application
    Filed: August 30, 2011
    Publication date: March 8, 2012
    Inventors: Jeffrey P. Adams, Kenneth Basye, Ryan Thomas
  • Publication number: 20120053936
    Abstract: A method for generating a transcription of a videoconference includes matching human speech of a videoconference to writable symbols. The human speech is encoded in audio data of the videoconference. The writable symbols are parsed into a plurality of statements. For each statement of the plurality of statements, user profile data stored in computer-readable memory is used to determine which participant of a plurality of participants of the videoconference is most likely the source of the statement. A transcription of the videoconference is generated that identifies for each statement the determination of which participant of the plurality of participants of the videoconference is most likely the source of the statement.
    Type: Application
    Filed: August 31, 2010
    Publication date: March 1, 2012
    Applicant: Fujitsu Limited
    Inventor: David L. Marvit
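The speaker-attribution step described above can be approximated very simply. The word-overlap score below is a naive, invented stand-in for the patent's use of stored user profile data, and the names and profiles are fabricated examples:

```python
def most_likely_speaker(statement, profiles):
    """profiles: {participant: set of characteristic words drawn from that
    participant's profile data}. Score each participant by word overlap
    with the parsed statement and return the best match."""
    words = set(statement.lower().split())
    return max(profiles, key=lambda name: len(words & profiles[name]))

profiles = {"alice": {"budget", "finance", "quarter"},
            "bob": {"deploy", "server", "release"}}
print(most_likely_speaker("we should deploy the release tonight", profiles))
```

A transcript generator would run this per statement and prefix each line with the attributed participant.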
  • Publication number: 20120053937
    Abstract: A text content summary is created from speech content. A "focus more" signal is issued by a user while receiving the speech content. The focus more signal is associated with a time window, and the time window is associated with a part of the speech content. Whether to use the part of the speech content associated with the time window in the text content summary is determined based on the number of focus more signals associated with that window. The user may thereby express the relative significance of different portions of the speech content, so as to generate a personal text content summary.
    Type: Application
    Filed: August 22, 2011
    Publication date: March 1, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: BAO HUA CAO, LE HE, XING JIN, QING BO WANG, XIN ZHOU
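The windowing logic described above can be sketched as follows. The fixed window size, signal threshold, and timestamps are assumptions for illustration, not values from the patent:

```python
def summary_windows(signal_times, window_size=10.0, min_signals=2):
    """Group 'focus more' signal timestamps (seconds) into fixed-size time
    windows and keep the windows that received at least min_signals signals;
    the speech content in those windows would feed the summary."""
    counts = {}
    for t in signal_times:
        w = int(t // window_size)
        counts[w] = counts.get(w, 0) + 1
    return sorted(w for w, c in counts.items() if c >= min_signals)

# Signals at 3s, 5s, 14s, 31s, 33s, 35s: windows 0 and 3 clear the threshold.
print(summary_windows([3, 5, 14, 31, 33, 35]))
```

Windows with a single stray signal (here, the one at 14 s) are dropped, so only passages the listener repeatedly flagged reach the summary.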
  • Publication number: 20120053938
    Abstract: In one embodiment, a communication request from a remote requester is intercepted at the computing device. Based on the intercepted communication request, one or more voicemail features are enabled at the computing device, independent of carrier voicemail support. The remote requester may be, for example, a caller or a voicemail server, and the intercepted communication request may be a phone call or a voicemail notification, respectively. In another embodiment, a system at a computing device coupled to a network includes a communication request handler and a voicemail manager. The communication request handler intercepts a communication request from a remote requester at the computing device. The intercepted communication request may be a voicemail notification from a network server or a phone call from a caller. The voicemail manager enables one or more voicemail features at the computing device, independent of carrier voicemail support, based on the intercepted communication request.
    Type: Application
    Filed: September 30, 2011
    Publication date: March 1, 2012
    Applicant: Google Inc.
    Inventor: Jean-Michel Trivi
  • Publication number: 20120041758
    Abstract: A method and system for synchronizing words in an input text of a speech with a continuous recording of the speech. A received input text includes previously recorded content of the speech to be reproduced. A synthetic speech corresponding to the received input text is generated. Ratio data, comprising the ratio between the respective pronunciation times, in the generated synthetic speech, of the words included in the received text, is computed. The ratio data is used to determine an association between erroneously recognized words of the received text and a time at which to reproduce each erroneously recognized word. The association is output to a recording medium and/or displayed on a display device.
    Type: Application
    Filed: October 24, 2011
    Publication date: February 16, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: Noriko Imoto, Tetsuya Uda, Takatoshi Watanabe
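The ratio computation above amounts to distributing the recording's duration over the words in proportion to their durations in the synthetic speech. A sketch under that reading, with invented example durations:

```python
def allocate_times(words, synth_durations, total_recording_time):
    """Assign each word a (start, end) span in the recording, proportional
    to its pronunciation time in the synthetic speech (illustrative values)."""
    total = sum(synth_durations)
    start, spans = 0.0, []
    for word, d in zip(words, synth_durations):
        length = total_recording_time * d / total
        spans.append((word, round(start, 3), round(start + length, 3)))
        start += length
    return spans

# Synthetic-speech durations of 0.2s / 0.3s / 0.5s mapped onto a 2.0 s recording.
for span in allocate_times(["good", "morning", "everyone"], [0.2, 0.3, 0.5], 2.0):
    print(span)
```

Because only the ratios matter, the same allocation works whatever the absolute speed of the original recording.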
  • Publication number: 20120035925
    Abstract: Automatic capture and population of task and list items in an electronic task or list surface via voice or audio input through an audio recording-capable mobile computing device is provided. A voice or audio task or list item may be captured for entry into a task application interface or into a list authoring surface interface for subsequent use as task items, reminders, “to do” items, list items, agenda items, work organization outlines, and the like. Captured voice or audio content may be transcribed locally or remotely, and transcribed content may be populated into a task or list authoring surface user interface that may be displayed on the capturing device (e.g., mobile telephone), or that may be stored remotely and subsequently displayed in association with a number of applications on a number of different computing devices.
    Type: Application
    Filed: October 12, 2011
    Publication date: February 9, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Ned B. Friend, Kanav Arora, Marta Rey-Babarro, David De La Brena Valderrama, Erez Kikin-Gil, Matthew J. Kotler, Charles W. Parker, Maya Rodrig, Igor Zaika
  • Publication number: 20120035924
    Abstract: In one implementation, a computer-implemented method includes receiving, at a mobile computing device, ambiguous user input that indicates more than one of a plurality of commands; and determining a current context associated with the mobile computing device that indicates where the mobile computing device is currently located. The method can further include disambiguating the ambiguous user input by selecting a command from the plurality of commands based on the current context associated with the mobile computing device; and causing output associated with performance of the selected command to be provided by the mobile computing device.
    Type: Application
    Filed: July 20, 2011
    Publication date: February 9, 2012
    Applicant: GOOGLE INC.
    Inventors: John Nicholas JITKOFF, Michael J. LEBEAU
  • Publication number: 20120035926
    Abstract: A method, apparatus, and system are directed towards employing machine representations of phonemes to generate and manage domain names and/or messaging addresses. A user of a computing device may provide an audio input signal, such as one obtained from human language sounds. The audio input signal is received at a phoneme encoder that converts the sounds into machine representations using a phoneme representation viewable as a sequence of alpha-numeric values. The sequence of alpha-numeric values may then be combined with a host name, or the like, to generate a URI, a message address, or the like. The generated URI or message address may then be used to communicate over a network.
    Type: Application
    Filed: October 18, 2011
    Publication date: February 9, 2012
    Applicant: Demand Media, Inc.
    Inventor: Christopher J. Ambler
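The encoding step above (phoneme sequence to alpha-numeric values, combined with a host name into a URI) can be sketched as follows. The phoneme-to-code table is invented; a real encoder would cover a full phoneme inventory such as ARPAbet:

```python
# Hypothetical phoneme-to-code table; codes are illustrative only.
PHONEME_CODES = {"HH": "h1", "EH": "e2", "L": "l1", "OW": "o3"}

def phonemes_to_uri(phonemes, host="example.com"):
    """Encode a phoneme sequence as a sequence of alpha-numeric values
    and combine it with a host name to form a URI."""
    encoded = "".join(PHONEME_CODES[p] for p in phonemes)
    return "http://{}/{}".format(host, encoded)

print(phonemes_to_uri(["HH", "EH", "L", "OW"]))
```

The same alpha-numeric sequence could instead be combined with a domain to form a message address, as the abstract notes.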
  • Publication number: 20120033794
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for selectively transcribing messages. Five general approaches are disclosed herein. The first approach is directed to checking for a transcription capable client, which transcribes messages when a client device is capable of receiving transcriptions. The second and third approaches are platform-controlled and user-controlled predefined selective transcription. One aspect of this approach is driven by transcription rules. The fourth approach is user-controlled on-demand selective transcription before the message is stored or deposited for transcription. An example of this is a user transferring an incoming caller to voicemail and indicating that the voicemail be transcribed. The fifth approach is user-controlled on-demand selective transcription after the message is stored. In one embodiment of this approach, a user must specifically request that a stored message be transcribed.
    Type: Application
    Filed: August 6, 2010
    Publication date: February 9, 2012
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: James JACKSON, Philip Cunetto, Mehrad Yasrebi