Speech To Text Systems (epo) Patents (Class 704/E15.043)
  • Publication number: 20120316874
    Abstract: A system and method of radiology verification is provided. The verification may be implemented as a standalone software utility, as part of a radiology imaging graphical user interface, or within a more complex computing system configured for generating radiology reports.
    Type: Application
    Filed: April 11, 2012
    Publication date: December 13, 2012
    Inventor: Brian T. Lipman
  • Publication number: 20120316875
    Abstract: Embodiments of the invention provide systems and methods for speech signal handling. Speech handling according to one embodiment of the present invention can be performed via a hosted architecture. An electrical signal representing human speech can be analyzed with an Automatic Speech Recognizer (ASR) hosted on a different server from a media server or other server hosting a service utilizing speech input. Neither server need be located at the same location as the user. The spoken sounds can be accepted as input to and handled with a media server which identifies parts of the electrical signal that contain a representation of speech. This architecture can serve any user who has a web browser and Internet access, whether on a PC, PDA, cell phone, tablet, or any other computing device.
    Type: Application
    Filed: June 8, 2012
    Publication date: December 13, 2012
    Applicant: Red Shift Company, LLC
    Inventors: Joel Nyquist, Matthew Robinson
  • Publication number: 20120310642
    Abstract: Techniques are provided for creating a mapping that maps locations in audio data (e.g., an audio book) to corresponding locations in text data (e.g., an e-book). Techniques are provided for using a mapping between audio data and text data, whether the mapping is created automatically or manually. A mapping may be used for bookmark switching, where a bookmark established in one version of a digital work is used to identify a corresponding location in another version of the digital work. Alternatively, the mapping may be used to play audio that corresponds to text selected by a user. Alternatively, the mapping may be used to automatically highlight text in response to audio that corresponds to the text being played. Alternatively, the mapping may be used to determine where an annotation created in one media context (e.g., audio) will be consumed in another media context (e.g., text).
    Type: Application
    Filed: October 6, 2011
    Publication date: December 6, 2012
    Applicant: APPLE INC.
    Inventors: Xiang Cao, Alan C. Cannistraro, Gregory S. Robbin, Casey M. Dougherty
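One way to picture the mapping described above is a sorted table of anchor pairs linking audio timestamps to text offsets; bookmark switching is then a binary search over the anchors. A minimal sketch, where the anchor values and function name are invented for illustration:

```python
import bisect

# Hypothetical anchor table: sorted (audio_seconds, text_offset) pairs
# mapping locations in an audio book to locations in the e-book text.
MAPPING = [(0.0, 0), (12.5, 180), (30.2, 450), (55.0, 900)]

def audio_to_text_offset(seconds, mapping=MAPPING):
    """Bookmark switching: return the text offset for the anchor at or
    before the given audio playback position."""
    times = [t for t, _ in mapping]
    i = max(bisect.bisect_right(times, seconds) - 1, 0)
    return mapping[i][1]
```

The reverse lookup (text offset to audio position), used for playing audio that corresponds to selected text, is symmetric.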
  • Publication number: 20120310622
    Abstract: A single device allows two or more users to converse in different languages. The translation device receives inputs from the users which are translated and displayed to the other user in the other user's selected language. In one embodiment, there are two display areas for right side up display of the conversation to each user. In a second embodiment, one display is changed from one language to another as it is passed from one user to the other.
    Type: Application
    Filed: April 26, 2012
    Publication date: December 6, 2012
    Applicant: ORTSBO, INC.
    Inventors: Aleksandar Zivkovic, Mark Charles Hale, Justin Earl Marek
  • Publication number: 20120310643
    Abstract: Techniques for presenting data input as a plurality of data chunks including a first data chunk and a second data chunk. The techniques include converting the plurality of data chunks to a textual representation comprising a plurality of text chunks including a first text chunk corresponding to the first data chunk and a second text chunk corresponding to the second data chunk, respectively, and providing a presentation of at least part of the textual representation such that the first text chunk is presented differently than the second text chunk to, when presented, assist a user in proofing the textual representation.
    Type: Application
    Filed: May 23, 2012
    Publication date: December 6, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: Martin Labsky, Jan Kleindienst, Tomas Macek, David Nahamoo, Jan Curin, Lars Koenig, Holger Quast
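As a rough illustration of presenting adjacent chunks differently to aid proofing, the sketch below alternates upper and lower case; this styling choice stands in for whatever visual distinction a real interface would use and is not taken from the patent:

```python
def render_chunks(chunks):
    """Alternate the presentation of adjacent text chunks so a user
    proofing the transcript can see where one chunk ends and the next
    begins (case alternation stands in for real styling)."""
    return " ".join(c.upper() if i % 2 == 0 else c.lower()
                    for i, c in enumerate(chunks))
```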
  • Publication number: 20120310644
    Abstract: A computer program product, for automatically editing a medical record transcription, resides on a computer-readable medium and includes computer-readable instructions for causing a computer to obtain a first medical transcription of a dictation, the dictation being from medical personnel and concerning a patient, analyze the first medical transcription for presence of a first trigger phrase associated with a first standard text block, determine that the first trigger phrase is present in the first medical transcription if an actual phrase in the first medical transcription corresponds with the first trigger phrase, and insert the first standard text block into the first medical transcription.
    Type: Application
    Filed: August 13, 2012
    Publication date: December 6, 2012
    Applicant: eScription Inc.
    Inventors: Roger S. Zimmerman, Paul Egerman, Robert G. Titemore, George Zavaliagkos
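The trigger-phrase mechanism can be sketched as a dictionary lookup over the transcription text. The trigger phrase and standard block below are invented examples, and appending the block at the end is a simplification of the insertion step:

```python
STANDARD_BLOCKS = {
    # trigger phrase -> standard text block (illustrative only)
    "normal chest exam": ("Lungs are clear to auscultation bilaterally. "
                          "No wheezes, rales, or rhonchi."),
}

def expand_triggers(transcription, blocks=STANDARD_BLOCKS):
    """Insert the standard text block when its trigger phrase is found
    in the transcription (here: appended at the end)."""
    out = transcription
    for trigger, block in blocks.items():
        if trigger in transcription.lower():
            out += " " + block
    return out
```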
  • Publication number: 20120310645
    Abstract: A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.
    Type: Application
    Filed: August 14, 2012
    Publication date: December 6, 2012
    Applicant: GOOGLE INC.
    Inventors: Alexander Gruenstein, William J. Byrne
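The two query results could be combined on the client along these lines; the merge policy (local first, deduplicated remote appended) is an assumption, since the abstract only requires displaying both result sets:

```python
def merge_results(local_results, remote_results):
    """Show low-latency local matches first, then append any remote
    matches not already found locally (the remote recognizer typically
    has broader coverage)."""
    seen = set(local_results)
    return list(local_results) + [r for r in remote_results if r not in seen]
```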
  • Publication number: 20120303368
    Abstract: The present invention discloses a number-assistant voice input system, a number-assistant voice input method for a voice input system and a number-assistant voice correcting method for a voice input system, which apply software to drive a voice input system of an electronic device to provide a voice input logic circuit module. The voice input logic circuit module defines the pronunciation of numbers 1 to 26 as the paths to respectively input letters A to Z in the voice input system and allows users to selectively input or correct a letter by reading a number from 1 to 26 instead of a letter from A to Z.
    Type: Application
    Filed: May 27, 2011
    Publication date: November 29, 2012
    Inventor: Ting Ma
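The core number-to-letter mapping is simple to state in code; this sketch covers only the letter lookup, not the surrounding voice input logic:

```python
def number_to_letter(n):
    """Map a spoken number 1-26 to the letter A-Z it stands for."""
    if not 1 <= n <= 26:
        raise ValueError("expected a number from 1 to 26")
    return chr(ord("A") + n - 1)

def spell_from_numbers(numbers):
    """Spell a word from a sequence of spoken numbers."""
    return "".join(number_to_letter(n) for n in numbers)
```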
  • Publication number: 20120296646
    Abstract: Concepts and technologies are described herein for multi-mode text input. In accordance with the concepts and technologies disclosed herein, content is received. The content can include one or more input indicators. The input indicators can indicate that user input can be used in conjunction with consumption or use of the content. The application is configured to analyze the content to determine context associated with the content and/or the client device executing the application. The application also is configured to determine, based upon the content and/or the contextual information, which input device to use to obtain input associated with use or consumption of the content. Input captured with the input device can be converted to text and used during use or consumption of the content.
    Type: Application
    Filed: May 17, 2011
    Publication date: November 22, 2012
    Applicant: Microsoft Corporation
    Inventors: Mohan Varthakavi, Jayaram Nanduri, Nikhil Kothari
  • Publication number: 20120296647
    Abstract: In an embodiment, an information processing apparatus includes: a converting unit; a selecting unit; a dividing unit; a generating unit; and a display processing unit. The converting unit recognizes a voice input from a user and converts it into a character string. The selecting unit selects characters from the character string according to designation of the user. The dividing unit converts the selected characters into phonetic characters and divides the phonetic characters into phonetic characters of sound units. The generating unit extracts similar character candidates corresponding to each of the divided phonetic characters of the sound units, from a similar character dictionary storing a plurality of phonetic characters of sound units similar in sound as the similar character candidates in association with each other, and generates correction character candidates for the selected characters. The display processing unit makes a display unit display the generated correction character candidates selectable by the user.
    Type: Application
    Filed: May 23, 2012
    Publication date: November 22, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Yuka Kobayashi, Tetsuro Chino, Kazuo Sumita, Hisayoshi Nagae, Satoshi Kamatani
  • Publication number: 20120290299
    Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.
    Type: Application
    Filed: May 13, 2011
    Publication date: November 15, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sara H. Basson, Rick Hamilton, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael Picheny, Bhuvana Ramabhadran, Tara N. Sainath
  • Publication number: 20120290290
    Abstract: Sentence simplification may be provided. A spoken phrase may be received and converted to a text phrase. An intent associated with the text phrase may be identified. The text phrase may then be reformatted according to the identified intent and a task may be performed according to the reformatted text phrase.
    Type: Application
    Filed: May 12, 2011
    Publication date: November 15, 2012
    Applicant: Microsoft Corporation
    Inventors: Gokhan Tur, Dilek Hakkani-Tur, Larry Paul Heck, Sarangarajan Parthasarathy
  • Publication number: 20120290298
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
    Type: Application
    Filed: May 9, 2011
    Publication date: November 15, 2012
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra
  • Publication number: 20120290300
    Abstract: The apparatus for foreign language study includes: a voice recognition device configured to recognize a speech entered by a user and convert the speech into a speech text; a speech intent recognition device configured to extract a user speech intent for the speech text using skill level information of the user and dialogue context information; and a feedback processing device configured to extract a different expression depending on the user speech intent and a speech situation of the user. According to the present invention, the intent of a learner's speech may be determined even though the learner's skill is low, and customized expressions for various situations may be provided to the learner.
    Type: Application
    Filed: October 15, 2010
    Publication date: November 15, 2012
    Applicant: POSTECH ACADEMY-INDUSTRY FOUNDATION
    Inventors: Sung Jin Lee, Cheong Jae Lee, Gary Geunbae Lee
  • Publication number: 20120290301
    Abstract: A system includes at least one wireless client device, a service manager, and a plurality of voice transcription servers. The service manager includes a resource management service and a profile management service. The client device communicates the presence of a voice transcription task to the resource management service. The resource management service surveys the plurality of voice transcription servers and selects one voice transcription server based on a set of predefined criteria. The resource management service then communicates an address of the selected server to the profile management service, which then transmits a trained voice profile or default profile to the selected server. The address of the selected server is then sent to the client device, which then transmits an audio stream to the server. Finally, the selected server transcribes the audio stream to a textual format.
    Type: Application
    Filed: July 30, 2012
    Publication date: November 15, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: Amarjit S. Bahl, Dalia Massoud, Dikran S. Meliksetian, Chen Shu, Michael Van Der Meulen, Nianjun Zhou
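The server-selection step could look like the sketch below; the selection criteria (language support, then lowest load) and the record fields are illustrative assumptions, since the abstract leaves the "predefined criteria" unspecified:

```python
def select_server(servers, language):
    """Pick a transcription server by predefined criteria: here, the
    least-loaded server supporting the task's language."""
    candidates = [s for s in servers if language in s["languages"]]
    if not candidates:
        raise RuntimeError("no suitable transcription server")
    return min(candidates, key=lambda s: s["load"])
```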
  • Publication number: 20120284024
    Abstract: A computerized communication device has a display screen, a mechanism for a user to select words or phrases displayed on the display screen, and software executing from a non-transitory physical medium, the software providing a function for providing audio signal output in a connected voice-telephone call from the text words or phrases selected by a user.
    Type: Application
    Filed: May 3, 2011
    Publication date: November 8, 2012
    Inventor: Padmanabhan Mahalingam
  • Publication number: 20120278066
    Abstract: A communication interface apparatus for a system and a plurality of users is provided. The communication interface apparatus for the system and the plurality of users includes a first process unit configured to receive voice information and face information from at least one user, and determine whether the received voice information is voice information of at least one registered user based on user models corresponding to the respective received voice information and face information; a second process unit configured to receive the face information, and determine whether the at least one user's attention is on the system based on the received face information; and a third process unit configured to receive the voice information, analyze the received voice information, and determine whether the received voice information is substantially meaningful to the system based on a dialog model that represents conversation flow on a situation basis.
    Type: Application
    Filed: November 9, 2010
    Publication date: November 1, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Nam-Hoon Kim, Chi-Youn Park, Jeong-Mi Cho, Jeong-su Kim
  • Publication number: 20120278071
    Abstract: A transcription system automates the control of the playback of the audio to accommodate the user's ability to transcribe the words spoken. In some examples, a delay between playback and typed input is estimated by processing the typed words using a wordspotting approach. The estimated delay is used as an input to an automated speed control, for example, to maintain a target or maximum delay between playback and typed input.
    Type: Application
    Filed: April 29, 2011
    Publication date: November 1, 2012
    Applicant: Nexidia Inc.
    Inventors: Jacob B. Garland, Marsal Gavalda
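The speed control described above could be as simple as a proportional controller on the estimated delay. A sketch, where the target delay, gain, and rate limits are illustrative constants rather than values from the patent:

```python
def playback_rate(estimated_delay, target_delay=2.0, gain=0.15,
                  min_rate=0.5, max_rate=1.5):
    """Proportional speed control: slow the audio down when the typist's
    estimated delay exceeds the target, speed it up when they are ahead.
    All constants are illustrative assumptions."""
    rate = 1.0 - gain * (estimated_delay - target_delay)
    return max(min_rate, min(max_rate, rate))
```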
  • Publication number: 20120278074
    Abstract: A computer-implemented method of multisensory speech detection is disclosed. The method comprises determining an orientation of a mobile device and determining an operating mode of the mobile device based on the orientation of the mobile device. The method further includes identifying speech detection parameters that specify when speech detection begins or ends based on the determined operating mode and detecting speech from a user of the mobile device based on the speech detection parameters.
    Type: Application
    Filed: July 10, 2012
    Publication date: November 1, 2012
    Applicant: Google Inc.
    Inventors: Dave Burke, Michael J. Lebeau, Konrad Gianno, Trausti Kristjansson, John Nicholas Jitkoff, Andrew W. Senior
  • Publication number: 20120278073
    Abstract: A mobile system is provided that includes speech-based and non-speech-based interfaces for telematics applications. The mobile system identifies and uses context, prior information, domain knowledge, and user specific profile data to achieve a natural environment for users that submit requests and/or commands in multiple domains. The invention creates, stores and uses extensive personal profile information for each user, thereby improving the reliability of determining the context and presenting the expected results for a particular question or command. The invention may organize domain specific behavior and information into agents that are distributable or updateable over a wide area network.
    Type: Application
    Filed: June 4, 2012
    Publication date: November 1, 2012
    Applicant: VoiceBox Technologies, Inc.
    Inventors: Chris Weider, Richard Kennewick, Mike Kennewick, Philippe Di Cristo, Robert A. Kennewick, Samuel Menaker, Lynn Elise Armstrong
  • Publication number: 20120278072
    Abstract: A remote healthcare system includes a healthcare staff terminal which includes an input part configured to input text to be transmitted to a patient by a healthcare staff member, and a first transmitter-receiver part configured to transmit the text and a qualifier of the healthcare staff member; a server which includes a second transmitter-receiver part configured to receive the text and the qualifier of the healthcare staff member transmitted from the healthcare staff terminal, an acoustic source database having an acoustic source of the healthcare staff member stored therein, and a converter configured to change the text into voice using the stored acoustic source of the healthcare staff member; and a patient terminal which includes a third transmitter-receiver part configured to receive the voice converted from the text and the text transmitted by the second transmitter-receiver part of the server, and an output part configured to output the voice to the patient who is managed by the healthcare staff member.
    Type: Application
    Filed: March 27, 2012
    Publication date: November 1, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jeong Je Park, Kwang Hyeon Lee
  • Publication number: 20120262533
    Abstract: A method is provided in one example and includes identifying a particular word recited by an active speaker in a conference involving a plurality of endpoints in a network environment; evaluating a profile associated with the active speaker in order to identify contextual information associated with the particular word; and providing augmented data associated with the particular word to at least some of the plurality of endpoints. In more specific examples, the active speaker is identified using a facial detection protocol, or a speech recognition protocol. Data from the active speaker can be converted from speech to text.
    Type: Application
    Filed: April 18, 2011
    Publication date: October 18, 2012
    Inventors: Satish K. Gannu, Leon A. Frazier, Didier R. Moretti
  • Publication number: 20120265527
    Abstract: An interactive voice recognition electronic device converts a received voice signal to a text, and searches a voice database to find a matched voice text of the converted text. The matched voice text is taken as a recognized voice text of the voice signal if the matched voice text exists in the voice database. The electronic device obtains a predetermined number of similar voice texts if no matched voice text exists in the voice database. The electronic device converts the predetermined number of similar voice texts to voice signals, outputs the converted voice signals in turn, and selects one of the similar voice texts as the recognized voice text according to the selection of the user. The electronic device obtains the associated answer text of the recognized voice text in the voice database and converts the answer text to voice signals.
    Type: Application
    Filed: August 9, 2011
    Publication date: October 18, 2012
    Applicants: HON HAI PRECISION INDUSTRY CO., LTD., FU TAI HUA INDUSTRY (SHENZHEN) CO., LTD.
    Inventors: Yu-Kai Xiong, Xin Lu, Shih-Fang Wong, Dong-Sheng Lv, Xin-Hua Li, Yu-Yong Zhang, Jian-Jian Zhu
  • Publication number: 20120265529
    Abstract: A method and system for transcription of spoken language into continuous text for a user comprising the steps of inputting spoken language of at least one user or of a communication partner of the at least one user into a mobile device of the respective user, wherein the input spoken language of the user is transported within a corresponding stream of voice over IP data packets to a transcription server; transforming the spoken language transported within the respective stream of voice over IP data packets into continuous text by means of a speech recognition algorithm run by said transcription server, wherein said speech recognition algorithm is selected depending on a natural language or dialect spoken in the area of the current position of said mobile device; and outputting said transformed continuous text forwarded by said transcription server to said mobile device of the respective user or to a user terminal of the respective user in real time.
    Type: Application
    Filed: October 27, 2010
    Publication date: October 18, 2012
    Inventors: Michaela Nachtrab, Robin Ribback
  • Publication number: 20120259633
    Abstract: A completely hands free exchange of messages, especially in portable devices, is provided through a combination of speech recognition, text-to-speech (TTS), and detection algorithms. An incoming message may be read aloud to a user and the user enabled to respond to the sender with a reply message through audio input upon determining whether the audio interaction mode is proper. Users may also be provided with options for responding in a different communication mode (e.g., a call) or perform other actions. Users may further be enabled to initiate a message exchange using natural language.
    Type: Application
    Filed: April 7, 2011
    Publication date: October 11, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Liane Aihara, Shane Landry, Lisa Stifelman, Madhusudan Chinthakunta, Anne Sullivan, Kathleen Lee
  • Publication number: 20120259634
    Abstract: There is provided a music playback device comprising a playback unit configured to playback music, an analysis unit configured to analyze lyrics of the music and extract a word or a phrase included in the lyrics, an acquisition unit configured to acquire an image using the word or the phrase extracted by the analysis unit, and a display control unit configured to, during playback of the music, cause a display device to display the image acquired by the acquisition unit.
    Type: Application
    Filed: February 16, 2012
    Publication date: October 11, 2012
    Applicant: Sony Corporation
    Inventor: Motoki Tsunokawa
  • Publication number: 20120259635
    Abstract: A system for the storing of client information in an independent repository is disclosed. Client data may be uploaded by the client or by those authorized by the client, or collected and stored by the repository. Data about the client file, such as the time of upload and modifications, are stored in a metadata file associated with the client file.
    Type: Application
    Filed: April 5, 2012
    Publication date: October 11, 2012
    Inventors: Gregory J. Ekchian, Jack A. Ekchian
  • Publication number: 20120257786
    Abstract: There is provided a method that includes (a) receiving image data, (b) processing the image data to yield first data, (c) obtaining second data from a repository, based on the first data, and (d) storing the first data and the second data as a record in a database. There is also provided a method that includes (a) receiving image data, (b) processing the image data to yield first data and second data, (c) matching the first data to a record that is stored in a database, and (d) updating the record to include the second data. There is also provided a system that performs the methods, and a storage medium that contains instructions that control a processor to perform the methods.
    Type: Application
    Filed: April 5, 2012
    Publication date: October 11, 2012
    Applicant: THE DUN & BRADSTREET CORPORATION
    Inventor: Daniel Scott Camper
  • Publication number: 20120259636
    Abstract: Some embodiments relate to a method of performing a search for content on the Internet, in which a user may speak a search query and speech recognition may be performed on the spoken query to generate a text search query to be provided to a plurality of search engines. This enables a user to speak the search query rather than having to type it, and also allows the user to provide the search query only once, rather than having to provide it separately to multiple different search engines.
    Type: Application
    Filed: June 19, 2012
    Publication date: October 11, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: Vladimir Sejnoha, William F. Ganong, III, Paul J. Vozila, Nathan M. Bodenstab, Yik-Cheung Tam
  • Publication number: 20120253795
    Abstract: An audio commenting and publishing system including a storage database, media content and a computing device all coupled together via a network. The computing device comprises a processor and an application executed by the processor configured to input audio data that a user wishes to associate with the media content from an audio recording mechanism or a memory device. The application is then able to store the audio data on the storage database and use the network address of the audio data along with the network address of the media content to publish the audio data and the media content such that a viewer is able to hear and access them concurrently at a network-accessible location.
    Type: Application
    Filed: March 30, 2012
    Publication date: October 4, 2012
    Inventor: Christopher C. Andrews
  • Publication number: 20120253804
    Abstract: According to one embodiment, a voice processor includes: a storage module; a converter; a character string converter; a similarity calculator; and an output module. The storage module stores therein first character string information and a first phoneme symbol corresponding thereto in association with each other. The converter converts an input voice into a second phoneme symbol. The character string converter converts the second phoneme symbol into second character string information in which content of the voice is described in a natural language. The similarity calculator calculates similarity between the input voice and a portion of the first character string information stored in the storage module using at least one of the second phoneme symbol converted by the converter and the second character string information converted by the character string converter. The output module outputs the first character string information based on the similarity calculated by the similarity calculator.
    Type: Application
    Filed: December 16, 2011
    Publication date: October 4, 2012
    Applicant: Kabushiki Kaisha Toshiba
    Inventors: Chikashi Sugiura, Hiroshi Fujimura, Akinori Kawamura, Takashi Sudo
  • Publication number: 20120253803
    Abstract: According to embodiments, a voice inputting unit converts voice into a digital signal. The state detecting unit includes an acceleration sensor, and detects movement and/or a state of an equipment main body. The holding unit stores movement or state pattern models of predetermined movement or a state of the equipment main body and predetermined voice recognition process patterns corresponding to the models. The pattern detecting unit detects whether or not movement and/or a state of the equipment main body from the state detecting unit matches the movement or state pattern models stored in the holding unit, and detects a voice recognition process pattern corresponding to the matched model. The voice recognition process executing unit executes the voice recognition process on the digital signal output from the voice inputting unit according to the detected voice recognition process pattern.
    Type: Application
    Filed: November 2, 2011
    Publication date: October 4, 2012
    Inventors: Motonobu Sugiura, Hiroshi Fujimura
  • Publication number: 20120253801
    Abstract: A system, computer-readable medium, and method for automatically determining a topic of a conversation and responding to the topic determination are provided. In the method, an active topic is defined as a first topic in response to execution of an application. The first topic includes first text defining a plurality of phrases, a probability of occurrence associated with each of the plurality of phrases, and a response associated with each of the plurality of phrases. Speech text recognized from a recorded audio signal is received. Recognition of the speech text is based at least partially on the probability of occurrence associated with each of the plurality of phrases of the first topic. A phrase of the plurality of phrases associated with the received speech text is identified. The response associated with the identified phrase is performed by the computing device.
    Type: Application
    Filed: March 28, 2011
    Publication date: October 4, 2012
    Inventors: Chris Santos-Lang, Sumit Rana, Binit Mohanty, Rajeev Gangwar
  • Publication number: 20120245935
    Abstract: An electronic device includes a voice processing unit, a wireless communication unit, and a combining unit. The voice processing unit receives speech signals. The wireless communication unit sends the speech signals to a server. The server converts the speech signals into a text message. The wireless communication unit receives the text message from the server. The combining unit combines the text message and the speech signals into a combined message. The wireless communication unit further sends the combined message to a recipient. A related server is also provided.
    Type: Application
    Filed: June 30, 2011
    Publication date: September 27, 2012
    Applicants: HON HAI PRECISION INDUSTRY CO., LTD., FU TAI HUA INDUSTRY (SHENZHEN) CO., LTD.
    Inventors: Shih-Fang Wong, Tsung-Jen Chuang, Bo Zhang
  • Publication number: 20120245937
    Abstract: Tags, such as XML tags, are inserted into email to separate email content from signature blocks, privacy notices and confidentiality notices, and to separate original email messages from replies and replies from further replies. The tags are detected by a system that renders email as speech, such as a voice command platform or a network-based virtual assistant or message center. The system can render an original email message in one voice mode and the reply in a different voice mode. The tags can be inserted to identify a voice memo in which a user responds to a particular portion of an email message.
    Type: Application
    Filed: April 3, 2012
    Publication date: September 27, 2012
    Applicant: Sprint Spectrum L.P.
    Inventors: Balaji S. Thenthiruperai, Elizabeth Roche, Brian Landers, Jesse Kates
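On the rendering side, the tag structure might be consumed like this; the tag names and sample email are invented, since the abstract specifies XML-style tags but not a schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical tagged email separating body, signature, and reply.
EMAIL = """<email>
  <body>Lunch is moved to noon tomorrow.</body>
  <signature>Jane Doe, Example Corp.</signature>
  <reply>Sounds good, see you then.</reply>
</email>"""

def sections_to_speak(xml_text):
    """Return (tag, text) pairs so a text-to-speech renderer can pick a
    different voice mode per section, or skip boilerplate entirely."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text.strip()) for child in root]
```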
  • Publication number: 20120245934
    Abstract: A method of automatic speech recognition. An utterance is received from a user in reply to a text message, via a microphone that converts the reply utterance into a speech signal. The speech signal is processed using at least one processor to extract acoustic data from the speech signal. An acoustic model is identified from a plurality of acoustic models, using a conversational context associated with the text message, to decode the acoustic data. The acoustic data is decoded using the identified acoustic model to produce a plurality of hypotheses for the reply utterance.
    Type: Application
    Filed: March 25, 2011
    Publication date: September 27, 2012
    Applicant: GENERAL MOTORS LLC
    Inventors: Gaurav Talwar, Xufang Zhao
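The context-driven model selection above reduces to a lookup from conversational context to acoustic model, with a fallback when no context-specific model exists. A minimal sketch, with hypothetical context keys and model names:

```python
# Sketch of selecting an acoustic model based on the conversational
# context of the text message being replied to.
def select_acoustic_model(models: dict, context: str, default: str = "general"):
    """Pick the model registered for this context, else fall back."""
    return models.get(context, models[default])

models = {
    "general": "generic_acoustic_model",
    "yes_no_question": "confirmation_model",   # tuned for short yes/no replies
    "address_request": "street_address_model",
}
print(select_acoustic_model(models, "yes_no_question"))  # confirmation_model
```

The patent's contexts would come from analyzing the original text message; here the mapping is hard-coded for illustration.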
  • Publication number: 20120245936
    Abstract: A system, device, and method for capturing and temporally synchronizing different aspects of a conversation are presented. The method includes receiving an audible statement, receiving a note temporally corresponding to an utterance in the audible statement, creating a first temporal marker comprising temporal information related to the note, transcribing the utterance into a transcribed text, creating a second temporal marker comprising temporal information related to the transcribed text, and temporally synchronizing the audible statement, the note, and the transcribed text. Temporally synchronizing comprises associating a time point in the audible statement with the note using the first temporal marker, associating the time point in the audible statement with the transcribed text using the second temporal marker, and associating the note with the transcribed text using the first temporal marker and second temporal marker.
    Type: Application
    Filed: March 26, 2012
    Publication date: September 27, 2012
    Inventor: Bryan Treglia
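The temporal-marker scheme above can be sketched as a pair of timestamped records joined on their time offsets into the audio. The `Marker` structure, field names, and tolerance are all illustrative assumptions, not the patent's format.

```python
# Sketch: temporal markers tie a note and a transcript segment back to
# the same time point in the recorded audible statement.
from dataclasses import dataclass

@dataclass
class Marker:
    time_s: float   # offset into the audible statement, in seconds
    payload: str    # the note text or the transcribed text

def synchronize(note: Marker, transcript: Marker, tolerance_s: float = 1.0):
    """Associate a note with a transcript segment when their markers
    point at (approximately) the same moment in the audio."""
    if abs(note.time_s - transcript.time_s) <= tolerance_s:
        return {"time_s": note.time_s, "note": note.payload,
                "text": transcript.payload}
    return None

note = Marker(12.4, "ask about pricing")
text = Marker(12.1, "so what does the premium tier cost")
print(synchronize(note, text))
```

The joined record gives all three associations the abstract describes: audio-to-note, audio-to-text, and note-to-text.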
  • Publication number: 20120239395
    Abstract: A method for entering text in a text input field using a non-keyboard type accessory includes selecting a character for entry into the text field presented by a portable computing device. The portable computing device determines whether text suggestions are available based on the character. If text suggestions are available, the portable computing device can determine the text suggestions and send them to the accessory, which in turn can display the suggestions on a display. A user operating the accessory can select one of the text suggestions, expressly reject the text suggestions, or ignore the text suggestions. If a text suggestion is selected, the accessory can send the selected text to the portable computing device for populating the text field.
    Type: Application
    Filed: March 14, 2011
    Publication date: September 20, 2012
    Applicant: APPLE INC.
    Inventor: Edwin W. Foo
  • Publication number: 20120239397
    Abstract: A method of searching a digital ink database is disclosed. The digital ink database is associated with a specific author. The method starts by receiving a computer text query from an input device. The computer text query is then mapped to a set of feature vectors using a handwriting model of that specific author. As a result, the set of feature vectors approximates features that would have been extracted had that specific author written the computer text query by hand. Finally, the set of feature vectors is used to search the digital ink database.
    Type: Application
    Filed: May 29, 2012
    Publication date: September 20, 2012
    Inventors: Jonathon Leigh Napper, Paul Lapstun
  • Publication number: 20120239396
    Abstract: A method and system for operating a remotely controlled device may use multimodal remote control commands that include a gesture command and a speech command. The gesture command may be interpreted from a gesture performed by a user, while the speech command may be interpreted from speech utterances made by the user. The gesture and speech utterances may be simultaneously received by the remotely controlled device in response to displaying a user interface configured to receive multimodal commands.
    Type: Application
    Filed: March 15, 2011
    Publication date: September 20, 2012
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Michael James Johnston, Marcelo Worsley
  • Publication number: 20120232897
    Abstract: A user can locate products by dialing a number from any phone and accessing an automatic voice recognition system. Reply is made to the user with information locating the product using a store's product location data converted to automatic voice responses. Smart phone and mobile web access to a product database is enabled using voice-to-text and text search. A taxonomy enables product search requests by product descriptions and/or product brand names, and enables synonyms and phonetic enhancements to the system. Search results are related to products and product categories with concise organization. Relevant advertisements, promotional offers and coupons are delivered based upon search and taxonomy elements. Search requests generate dynamic interior maps of a product's location inside the shopper's location, assisting a shopper to efficiently shop the location for listed items. Business intelligence of product categories enables rapid scaling across retail segments.
    Type: Application
    Filed: May 1, 2012
    Publication date: September 13, 2012
    Inventors: Nathan Pettyjohn, Matthew Kulig, Niarcas Jeffrey, Edward Saunders
  • Publication number: 20120232898
    Abstract: The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.
    Type: Application
    Filed: May 21, 2012
    Publication date: September 13, 2012
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Giuseppe Di Fabbrizio, Dilek Z. Hakkani-Tur, Mazin G. Rahim, Bernard S. Renger, Gokhan Tur
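The two-threshold decision in the abstract above (accept for training, re-prompt, or transfer to a human) can be sketched directly. The threshold values here are illustrative, not from the patent.

```python
# Sketch of the accept / re-prompt / transfer decision driven by the
# spoken language understanding classifier's confidence.
def route(classifier_confidence: float,
          accept_threshold: float = 0.8,
          reject_threshold: float = 0.3):
    if classifier_confidence >= accept_threshold:
        return "accept"           # use the utterance for training
    if classifier_confidence >= reject_threshold:
        return "reprompt"         # not understood well enough: ask again
    return "transfer_to_human"    # likely a task-specific utterance

print(route(0.9))   # accept
print(route(0.5))   # reprompt
print(route(0.1))   # transfer_to_human
```

Utterances routed to `"accept"` would feed the training set for the spoken dialog system, as the abstract describes.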
  • Publication number: 20120219271
    Abstract: A method and system for producing video-segments of a live-action event involving monitoring a live-action event for detection of event-segments, detecting one or more event-triggers with detectors, determining if an event-segment occurred based on the detected event-triggers, and editing one or more video feeds into a video-segment to encompass the event-segment.
    Type: Application
    Filed: May 19, 2011
    Publication date: August 30, 2012
    Applicant: ON DEMAND REAL TIME LLC
    Inventors: Douglas W. VUNIC, Eric HOFFERT, David GESSEL
  • Publication number: 20120221331
    Abstract: A system and method provides a natural language interface to world-wide web content. Either in advance or dynamically, webpage content is parsed using a parsing algorithm. A person using a telephone interface can provide speech information, which is converted to text and used to automatically fill in input fields on a webpage form. The form is then submitted to a database search and a response is generated. Information contained on the responsive webpage is extracted and converted to speech via a text-to-speech engine and communicated to the person.
    Type: Application
    Filed: May 7, 2012
    Publication date: August 30, 2012
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Srinivas BANGALORE, Mazin G. Rahim, Junlan Feng
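The form-filling step above (speech converted to text, then slotted into a webpage form's input fields) can be sketched with a naive keyword matcher. The field names and the "field value" phrasing convention are hypothetical; the patent's parsing algorithm is not specified in the abstract.

```python
# Sketch: map recognized speech text onto a webpage form's input fields
# before submitting the form as a database query.
import re

def fill_form(fields, speech_text):
    """Fill each form field whose name is mentioned as 'field value'
    in the recognized text, e.g. 'city Boston'."""
    filled = {}
    for field in fields:
        m = re.search(rf"{field}\s+(\w+)", speech_text, re.I)
        if m:
            filled[field] = m.group(1)
    return filled

print(fill_form(["city", "date"], "find flights city Boston date Friday"))
```

The resulting dictionary would be submitted as the form payload; the responsive page's text would then go back through a text-to-speech engine.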
  • Publication number: 20120221330
    Abstract: A voice activity detection (VAD) module analyzes a media file, such as an audio file or a video file, to determine whether one or more frames of the media file include speech. A speech recognizer generates feedback relating to an accuracy of the VAD determination. The VAD module leverages the feedback to improve subsequent VAD determinations. The VAD module also utilizes a look-ahead window associated with the media file to adjust estimated probabilities or VAD decisions for previously processed frames.
    Type: Application
    Filed: February 25, 2011
    Publication date: August 30, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Albert Joseph Kishan Thambiratnam, Weiwu Zhu, Frank Torsten Bernd Seide
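The look-ahead adjustment above can be sketched as smoothing each frame's speech probability with the frames that follow it before thresholding into speech/non-speech decisions. The window size, threshold, and averaging rule are illustrative assumptions.

```python
# Sketch: a look-ahead window lets the VAD revise a frame's decision
# using estimated probabilities from frames that come after it.
def smooth_with_lookahead(probs, window=2):
    """Average each frame's speech probability with up to `window`
    future frames, then threshold into speech/non-speech decisions."""
    smoothed = []
    for i in range(len(probs)):
        future = probs[i:i + window + 1]
        smoothed.append(sum(future) / len(future))
    return [p >= 0.5 for p in smoothed]

# An isolated low-probability frame surrounded by speech gets rescued.
frame_probs = [0.9, 0.2, 0.8, 0.9, 0.1, 0.1]
print(smooth_with_lookahead(frame_probs))
```

The recognizer feedback loop in the abstract would additionally tune such parameters over time; that part is omitted here.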
  • Publication number: 20120221332
    Abstract: Systems, methods, and non-transitory computer-readable media for referring to entities. The method includes receiving domain-specific training data of sentences describing a target entity in a context, extracting a speaker history and a visual context from the training data, selecting attributes of the target entity based on at least one of the speaker history, the visual context, and speaker preferences, generating a text expression referring to the target entity based on at least one of the selected attributes, the speaker history, and the context, and outputting the generated text expression. A weighted finite-state automaton can represent partial orderings of word pairs in the domain-specific training data. The weighted finite-state automaton can be speaker specific or speaker independent, and can include a set of weighted partial orderings of the training data for each possible realization.
    Type: Application
    Filed: May 7, 2012
    Publication date: August 30, 2012
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Giuseppe Di Fabbrizio, Srinivas Bangalore, Amanda Stent
  • Publication number: 20120215532
    Abstract: Broadly speaking, the embodiments disclosed herein describe an apparatus, system, and method that allows a user of a hearing assistance system to perceive consistent human speech. The consistent human speech can be based upon user specific preferences.
    Type: Application
    Filed: February 22, 2011
    Publication date: August 23, 2012
    Applicant: APPLE INC.
    Inventors: Edwin W. Foo, Gregory F. Hughes
  • Publication number: 20120215534
    Abstract: A vehicle communication system includes a computer processor in communication with a memory circuit, a transceiver in communication with the processor and operable to communicate with one or more wireless devices, and one or more storage locations storing one or more pieces of emergency contact information. In this illustrative system, the processor is operable to establish communication with a first wireless device through the transceiver. Upon detection of an emergency event by at least one vehicle based sensor system, the vehicle communication system is operable to contact an emergency operator. The vehicle communication system is further operable to display one or more of the one or more pieces of emergency contact information in a selectable manner. Upon selection of one of the one or more pieces of emergency contact information, the vehicle computing system places a call to a phone number associated with the selected emergency contact.
    Type: Application
    Filed: April 30, 2012
    Publication date: August 23, 2012
    Applicant: FORD GLOBAL TECHNOLOGIES, LLC
    Inventor: David Anthony Hatton
  • Publication number: 20120215533
    Abstract: A method of and system for error correction in multiple input modality search engines is presented. A method of processing input information based on an information type of the input information includes receiving input information for performing a search for identifying at least one item desired by a user and determining an information type associated with the input information. The method also includes forming a query input for identifying the at least one item desired by the user based on the input information and on the information type. The method further includes submitting the query input to at least one search engine system.
    Type: Application
    Filed: January 25, 2012
    Publication date: August 23, 2012
    Applicant: Veveo, Inc.
    Inventors: Murali Aravamudan, Pankaj Garg, Rakesh Barve, Ajit Rajasekharan
  • Publication number: 20120209605
    Abstract: Retrieving data from audio interactions associated with an organization. Retrieving the data comprises: receiving a corpus containing interactions; performing natural language processing on a text document representing an interaction from the corpus; extracting at least one keyphrase from the text document; assigning a rank to the at least one keyphrase; modeling relations between at least two keyphrases using the rank; and identifying topics relevant for the organization from the relations.
    Type: Application
    Filed: February 14, 2011
    Publication date: August 16, 2012
    Applicant: Nice Systems Ltd.
    Inventors: Eyal Hurvitz, Maya Gorodetsky, Ezra Daya, Oren Pereg
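The pipeline above (extract keyphrases, rank them, model relations between them) can be sketched with frequency-based ranking and co-occurrence relations. The scoring is illustrative; the patent's actual ranking and relation-modeling methods are not specified in the abstract.

```python
# Sketch: rank extracted keyphrases across a corpus of interactions and
# relate phrases that co-occur within the same interaction.
from collections import Counter
from itertools import combinations

def rank_keyphrases(interactions):
    """interactions: list of keyphrase lists, one list per interaction."""
    ranks = Counter(p for doc in interactions for p in set(doc))
    relations = Counter()
    for doc in interactions:
        for a, b in combinations(sorted(set(doc)), 2):
            relations[(a, b)] += 1
    return ranks, relations

calls = [["billing", "refund"], ["billing", "cancel"], ["billing", "refund"]]
ranks, relations = rank_keyphrases(calls)
print(ranks.most_common(1))              # [('billing', 3)]
print(relations[("billing", "refund")])  # 2
```

Strongly related high-rank phrases (here, "billing" with "refund") would surface as candidate topics relevant for the organization.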