Speech To Text Systems (epo) Patents (Class 704/E15.043)
  • Publication number: 20120316874
    Abstract: A system and method of radiology verification is provided. The verification may be implemented as a standalone software utility, as part of a radiology imaging graphical user interface, or within a more complex computing system configured for generating radiology reports.
    Type: Application
    Filed: April 11, 2012
    Publication date: December 13, 2012
    Inventor: Brian T. Lipman
  • Publication number: 20120316875
    Abstract: Embodiments of the invention provide systems and methods for speech signal handling. Speech handling according to one embodiment of the present invention can be performed via a hosted architecture. An electrical signal representing human speech can be analyzed with an Automatic Speech Recognizer (ASR) hosted on a different server from a media server or other server hosting a service utilizing speech input. Neither server need be located at the same location as the user. The spoken sounds can be accepted as input to and handled with a media server which identifies parts of the electrical signal that contain a representation of speech. This architecture can serve any user who has a web browser and Internet access, whether on a PC, PDA, cell phone, tablet, or any other computing device.
    Type: Application
    Filed: June 8, 2012
    Publication date: December 13, 2012
    Applicant: Red Shift Company, LLC
    Inventors: Joel Nyquist, Matthew Robinson
  • Publication number: 20120310642
    Abstract: Techniques are provided for creating a mapping that maps locations in audio data (e.g., an audio book) to corresponding locations in text data (e.g., an e-book). Techniques are provided for using a mapping between audio data and text data, whether the mapping is created automatically or manually. A mapping may be used for bookmark switching, where a bookmark established in one version of a digital work is used to identify a corresponding location in another version of the digital work. Alternatively, the mapping may be used to play audio that corresponds to text selected by a user. Alternatively, the mapping may be used to automatically highlight text in response to audio that corresponds to the text being played. Alternatively, the mapping may be used to determine where an annotation created in one media context (e.g., audio) will be consumed in another media context (e.g., text).
    Type: Application
    Filed: October 6, 2011
    Publication date: December 6, 2012
    Applicant: APPLE INC.
    Inventors: Xiang Cao, Alan C. Cannistraro, Gregory S. Robbin, Casey M. Dougherty
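One way to picture the mapping described above is a sorted table of anchor pairs linking audio timestamps to text offsets; bookmark switching is then a binary search over the anchors. A minimal sketch, where the anchor values and function name are invented for illustration:

```python
import bisect

# Hypothetical anchor table: sorted (audio_seconds, text_offset) pairs
# mapping locations in an audio book to locations in the e-book text.
MAPPING = [(0.0, 0), (12.5, 180), (30.2, 450), (55.0, 900)]

def audio_to_text_offset(seconds, mapping=MAPPING):
    """Bookmark switching: return the text offset for the anchor at or
    before the given audio playback position."""
    times = [t for t, _ in mapping]
    i = max(bisect.bisect_right(times, seconds) - 1, 0)
    return mapping[i][1]
```

The reverse lookup (text offset to audio position), used for playing audio that corresponds to selected text, is symmetric.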
  • Publication number: 20120310622
    Abstract: A single device allows two or more users to converse in different languages. The translation device receives inputs from the users which are translated and displayed to the other user in the other user's selected language. In one embodiment, there are two display areas for right side up display of the conversation to each user. In a second embodiment, one display is changed from one language to another as it is passed from one user to the other.
    Type: Application
    Filed: April 26, 2012
    Publication date: December 6, 2012
    Applicant: ORTSBO, INC.
    Inventors: Aleksandar Zivkovic, Mark Charles Hale, Justin Earl Marek
  • Publication number: 20120310643
    Abstract: Techniques for presenting data input as a plurality of data chunks including a first data chunk and a second data chunk. The techniques include converting the plurality of data chunks to a textual representation comprising a plurality of text chunks including a first text chunk corresponding to the first data chunk and a second text chunk corresponding to the second data chunk, respectively, and providing a presentation of at least part of the textual representation such that the first text chunk is presented differently than the second text chunk to, when presented, assist a user in proofing the textual representation.
    Type: Application
    Filed: May 23, 2012
    Publication date: December 6, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: Martin Labsky, Jan Kleindienst, Tomas Macek, David Nahamoo, Jan Curin, Lars Koenig, Holger Quast
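As a rough illustration of presenting adjacent chunks differently to aid proofing, the sketch below alternates upper and lower case; this styling choice stands in for whatever visual distinction a real interface would use and is not taken from the patent:

```python
def render_chunks(chunks):
    """Alternate the presentation of adjacent text chunks so a user
    proofing the transcript can see where one chunk ends and the next
    begins (case alternation stands in for real styling)."""
    return " ".join(c.upper() if i % 2 == 0 else c.lower()
                    for i, c in enumerate(chunks))
```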
  • Publication number: 20120310644
    Abstract: A computer program product, for automatically editing a medical record transcription, resides on a computer-readable medium and includes computer-readable instructions for causing a computer to obtain a first medical transcription of a dictation, the dictation being from medical personnel and concerning a patient, analyze the first medical transcription for presence of a first trigger phrase associated with a first standard text block, determine that the first trigger phrase is present in the first medical transcription if an actual phrase in the first medical transcription corresponds with the first trigger phrase, and insert the first standard text block into the first medical transcription.
    Type: Application
    Filed: August 13, 2012
    Publication date: December 6, 2012
    Applicant: eScription Inc.
    Inventors: Roger S. Zimmerman, Paul Egerman, Robert G. Titemore, George Zavaliagkos
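The trigger-phrase mechanism can be sketched as a dictionary lookup over the transcription text. The trigger phrase and standard block below are invented examples, and appending the block at the end is a simplification of the insertion step:

```python
STANDARD_BLOCKS = {
    # trigger phrase -> standard text block (illustrative only)
    "normal chest exam": ("Lungs are clear to auscultation bilaterally. "
                          "No wheezes, rales, or rhonchi."),
}

def expand_triggers(transcription, blocks=STANDARD_BLOCKS):
    """Insert the standard text block when its trigger phrase is found
    in the transcription (here: appended at the end)."""
    out = transcription
    for trigger, block in blocks.items():
        if trigger in transcription.lower():
            out += " " + block
    return out
```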
  • Publication number: 20120310645
    Abstract: A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.
    Type: Application
    Filed: August 14, 2012
    Publication date: December 6, 2012
    Applicant: GOOGLE INC.
    Inventors: Alexander Gruenstein, William J. Byrne
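The two query results could be combined on the client along these lines; the merge policy (local first, deduplicated remote appended) is an assumption, since the abstract only requires displaying both result sets:

```python
def merge_results(local_results, remote_results):
    """Show low-latency local matches first, then append any remote
    matches not already found locally (the remote recognizer typically
    has broader coverage)."""
    seen = set(local_results)
    return list(local_results) + [r for r in remote_results if r not in seen]
```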
  • Publication number: 20120303368
    Abstract: The present invention discloses a number-assistant voice input system, a number-assistant voice input method for a voice input system and a number-assistant voice correcting method for a voice input system, which apply software to drive a voice input system of an electronic device to provide a voice input logic circuit module. The voice input logic circuit module defines the pronunciation of numbers 1 to 26 as the paths to respectively input letters A to Z in the voice input system and allows users to selectively input or correct a letter by reading a number from 1 to 26 instead of a letter from A to Z.
    Type: Application
    Filed: May 27, 2011
    Publication date: November 29, 2012
    Inventor: Ting Ma
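The core number-to-letter mapping is simple to state in code; this sketch covers only the letter lookup, not the surrounding voice input logic:

```python
def number_to_letter(n):
    """Map a spoken number 1-26 to the letter A-Z it stands for."""
    if not 1 <= n <= 26:
        raise ValueError("expected a number from 1 to 26")
    return chr(ord("A") + n - 1)

def spell_from_numbers(numbers):
    """Spell a word from a sequence of spoken numbers."""
    return "".join(number_to_letter(n) for n in numbers)
```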
  • Publication number: 20120296646
    Abstract: Concepts and technologies are described herein for multi-mode text input. In accordance with the concepts and technologies disclosed herein, content is received. The content can include one or more input indicators. The input indicators can indicate that user input can be used in conjunction with consumption or use of the content. The application is configured to analyze the content to determine context associated with the content and/or the client device executing the application. The application also is configured to determine, based upon the content and/or the contextual information, which input device to use to obtain input associated with use or consumption of the content. Input captured with the input device can be converted to text and used during use or consumption of the content.
    Type: Application
    Filed: May 17, 2011
    Publication date: November 22, 2012
    Applicant: Microsoft Corporation
    Inventors: Mohan Varthakavi, Jayaram Nanduri, Nikhil Kothari
  • Publication number: 20120296647
    Abstract: In an embodiment, an information processing apparatus includes: a converting unit; a selecting unit; a dividing unit; a generating unit; and a display processing unit. The converting unit recognizes a voice input from a user and converts it into a character string. The selecting unit selects characters from the character string according to designation of the user. The dividing unit converts the selected characters into phonetic characters and divides the phonetic characters into phonetic characters of sound units. The generating unit extracts similar character candidates corresponding to each of the divided phonetic characters of the sound units, from a similar character dictionary storing a plurality of phonetic characters of sound units similar in sound as the similar character candidates in association with each other, and generates correction character candidates for the selected characters. The display processing unit makes a display unit display the generated correction character candidates selectable by the user.
    Type: Application
    Filed: May 23, 2012
    Publication date: November 22, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Yuka Kobayashi, Tetsuro Chino, Kazuo Sumita, Hisayoshi Nagae, Satoshi Kamatani
  • Publication number: 20120290299
    Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.
    Type: Application
    Filed: May 13, 2011
    Publication date: November 15, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sara H. Basson, Rick Hamilton, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael Picheny, Bhuvana Ramabhadran, Tara N. Sainath
  • Publication number: 20120290290
    Abstract: Sentence simplification may be provided. A spoken phrase may be received and converted to a text phrase. An intent associated with the text phrase may be identified. The text phrase may then be reformatted according to the identified intent and a task may be performed according to the reformatted text phrase.
    Type: Application
    Filed: May 12, 2011
    Publication date: November 15, 2012
    Applicant: Microsoft Corporation
    Inventors: Gokhan Tur, Dilek Hakkani-Tur, Larry Paul Heck, Sarangarajan Parthasarathy
  • Publication number: 20120290298
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
    Type: Application
    Filed: May 9, 2011
    Publication date: November 15, 2012
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra
  • Publication number: 20120290300
    Abstract: The apparatus for foreign language study includes: a voice recognition device configured to recognize a speech entered by a user and convert the speech into a speech text; a speech intent recognition device configured to extract a user speech intent for the speech text using skill level information of the user and dialogue context information; and a feedback processing device configured to extract a different expression depending on the user speech intent and a speech situation of the user. According to the present invention, the intent of a learner's speech may be determined even though the learner's skill is low, and customized expressions for various situations may be provided to the learner.
    Type: Application
    Filed: October 15, 2010
    Publication date: November 15, 2012
    Applicant: POSTECH ACADEMY-INDUSTRY FOUNDATION
    Inventors: Sung Jin Lee, Cheong Jae Lee, Gary Geunbae Lee
  • Publication number: 20120290301
    Abstract: A system includes at least one wireless client device, a service manager, and a plurality of voice transcription servers. The service manager includes a resource management service and a profile management service. The client device communicates the presence of a voice transcription task to the resource management service. The resource management service surveys the plurality of voice transcription servers and selects one voice transcription server based on a set of predefined criteria. The resource management service then communicates an address of the selected server to the profile management service, which then transmits a trained voice profile or default profile to the selected server. The address of the selected server is then sent to the client device, which then transmits an audio stream to the server. Finally, the selected server transcribes the audio stream to a textual format.
    Type: Application
    Filed: July 30, 2012
    Publication date: November 15, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: Amarjit S. Bahl, Dalia Massoud, Dikran S. Meliksetian, Chen Shu, Michael Van Der Meulen, Nianjun Zhou
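The server-selection step could look like the sketch below; the selection criteria (language support, then lowest load) and the record fields are illustrative assumptions, since the abstract leaves the "predefined criteria" unspecified:

```python
def select_server(servers, language):
    """Pick a transcription server by predefined criteria: here, the
    least-loaded server supporting the task's language."""
    candidates = [s for s in servers if language in s["languages"]]
    if not candidates:
        raise RuntimeError("no suitable transcription server")
    return min(candidates, key=lambda s: s["load"])
```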
  • Publication number: 20120284024
    Abstract: A computerized communication device has a display screen, a mechanism for a user to select words or phrases displayed on the display screen, and software executing from a non-transitory physical medium, the software providing a function for providing audio signal output in a connected voice-telephone call from the text words or phrases selected by a user.
    Type: Application
    Filed: May 3, 2011
    Publication date: November 8, 2012
    Inventor: Padmanabhan Mahalingam
  • Publication number: 20120278066
    Abstract: A communication interface apparatus for a system and a plurality of users is provided. The communication interface apparatus for the system and the plurality of users includes a first process unit configured to receive voice information and face information from at least one user, and determine whether the received voice information is voice information of at least one registered user based on user models corresponding to the respective received voice information and face information; a second process unit configured to receive the face information, and determine whether the at least one user's attention is on the system based on the received face information; and a third process unit configured to receive the voice information, analyze the received voice information, and determine whether the received voice information is substantially meaningful to the system based on a dialog model that represents conversation flow on a situation basis.
    Type: Application
    Filed: November 9, 2010
    Publication date: November 1, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Nam-Hoon Kim, Chi-Youn Park, Jeong-Mi Cho, Jeong-su Kim
  • Publication number: 20120278071
    Abstract: A transcription system automates the control of the playback of the audio to accommodate the user's ability to transcribe the words spoken. In some examples, a delay between playback and typed input is estimated by processing the typed words using a wordspotting approach. The estimated delay is used as an input to an automated speed control, for example, to maintain a target or maximum delay between playback and typed input.
    Type: Application
    Filed: April 29, 2011
    Publication date: November 1, 2012
    Applicant: Nexidia Inc.
    Inventors: Jacob B. Garland, Marsal Gavalda
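The speed control described above could be as simple as a proportional controller on the estimated delay. A sketch, where the target delay, gain, and rate limits are illustrative constants rather than values from the patent:

```python
def playback_rate(estimated_delay, target_delay=2.0, gain=0.15,
                  min_rate=0.5, max_rate=1.5):
    """Proportional speed control: slow the audio down when the typist's
    estimated delay exceeds the target, speed it up when they are ahead.
    All constants are illustrative assumptions."""
    rate = 1.0 - gain * (estimated_delay - target_delay)
    return max(min_rate, min(max_rate, rate))
```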
  • Publication number: 20120278074
    Abstract: A computer-implemented method of multisensory speech detection is disclosed. The method comprises determining an orientation of a mobile device and determining an operating mode of the mobile device based on the orientation of the mobile device. The method further includes identifying speech detection parameters that specify when speech detection begins or ends based on the determined operating mode and detecting speech from a user of the mobile device based on the speech detection parameters.
    Type: Application
    Filed: July 10, 2012
    Publication date: November 1, 2012
    Applicant: Google Inc.
    Inventors: Dave Burke, Michael J. Lebeau, Konrad Gianno, Trausti Kristjansson, John Nicholas Jitkoff, Andrew W. Senior
  • Publication number: 20120278073
    Abstract: A mobile system is provided that includes speech-based and non-speech-based interfaces for telematics applications. The mobile system identifies and uses context, prior information, domain knowledge, and user specific profile data to achieve a natural environment for users that submit requests and/or commands in multiple domains. The invention creates, stores and uses extensive personal profile information for each user, thereby improving the reliability of determining the context and presenting the expected results for a particular question or command. The invention may organize domain specific behavior and information into agents that are distributable or updateable over a wide area network.
    Type: Application
    Filed: June 4, 2012
    Publication date: November 1, 2012
    Applicant: VoiceBox Technologies, Inc.
    Inventors: Chris Weider, Richard Kennewick, Mike Kennewick, Philippe Di Cristo, Robert A. Kennewick, Samuel Menaker, Lynn Elise Armstrong
  • Publication number: 20120278072
    Abstract: A remote healthcare system includes a healthcare staff terminal which includes an input part configured to input text to be transmitted to a patient by a healthcare staff member, and a first transmitter-receiver part configured to transmit the text and a qualifier of the healthcare staff member; a server which includes a second transmitter-receiver part configured to receive the text and the qualifier of the healthcare staff member transmitted from the healthcare staff terminal, an acoustic source database having an acoustic source of the healthcare staff member stored therein, and a converter configured to change the text into voice using the stored acoustic source of the healthcare staff member; and a patient terminal which includes a third transmitter-receiver part configured to receive the voice converted from the text and the text transmitted by the second transmitter-receiver part of the server, and an output part configured to output the voice to the patient who is managed by the healthcare staff member.
    Type: Application
    Filed: March 27, 2012
    Publication date: November 1, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jeong Je Park, Kwang Hyeon Lee
  • Publication number: 20120262533
    Abstract: A method is provided in one example and includes identifying a particular word recited by an active speaker in a conference involving a plurality of endpoints in a network environment; evaluating a profile associated with the active speaker in order to identify contextual information associated with the particular word; and providing augmented data associated with the particular word to at least some of the plurality of endpoints. In more specific examples, the active speaker is identified using a facial detection protocol, or a speech recognition protocol. Data from the active speaker can be converted from speech to text.
    Type: Application
    Filed: April 18, 2011
    Publication date: October 18, 2012
    Inventors: Satish K. Gannu, Leon A. Frazier, Didier R. Moretti
  • Publication number: 20120265527
    Abstract: An interactive voice recognition electronic device converts a received voice signal to a text, and searches a voice database to find a matched voice text of the converted text. The matched voice text is taken as a recognized voice text of the voice signal if the matched voice text exists in the voice database. The electronic device obtains a predetermined number of similar voice texts if no matched voice text exists in the voice database. The electronic device converts the predetermined number of similar voice texts to voice signals, outputs the converted voice signals in turn, and selects one of the similar voice texts as the recognized voice text according to the selection of the user. The electronic device obtains the associated answer text of the recognized voice text in the voice database and converts the answer text to voice signals.
    Type: Application
    Filed: August 9, 2011
    Publication date: October 18, 2012
    Applicants: HON HAI PRECISION INDUSTRY CO., LTD., FU TAI HUA INDUSTRY (SHENZHEN) CO., LTD.
    Inventors: Yu-Kai Xiong, Xin Lu, Shih-Fang Wong, Dong-Sheng Lv, Xin-Hua Li, Yu-Yong Zhang, Jian-Jian Zhu
  • Publication number: 20120265529
    Abstract: A method and system for transcription of spoken language into continuous text for a user comprising the steps of inputting spoken language of at least one user or of a communication partner of the at least one user into a mobile device of the respective user, wherein the input spoken language of the user is transported within a corresponding stream of voice over IP data packets to a transcription server; transforming the spoken language transported within the respective stream of voice over IP data packets into continuous text by means of a speech recognition algorithm run by said transcription server, wherein said speech recognition algorithm is selected depending on a natural language or dialect spoken in the area of the current position of said mobile device; and outputting said transformed continuous text forwarded by said transcription server to said mobile device of the respective user or to a user terminal of the respective user in real time.
    Type: Application
    Filed: October 27, 2010
    Publication date: October 18, 2012
    Inventors: Michaela Nachtrab, Robin Ribback
  • Publication number: 20120259633
    Abstract: A completely hands free exchange of messages, especially in portable devices, is provided through a combination of speech recognition, text-to-speech (TTS), and detection algorithms. An incoming message may be read aloud to a user and the user enabled to respond to the sender with a reply message through audio input upon determining whether the audio interaction mode is proper. Users may also be provided with options for responding in a different communication mode (e.g., a call) or perform other actions. Users may further be enabled to initiate a message exchange using natural language.
    Type: Application
    Filed: April 7, 2011
    Publication date: October 11, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Liane Aihara, Shane Landry, Lisa Stifelman, Madhusudan Chinthakunta, Anne Sullivan, Kathleen Lee
  • Publication number: 20120259634
    Abstract: There is provided a music playback device comprising a playback unit configured to playback music, an analysis unit configured to analyze lyrics of the music and extract a word or a phrase included in the lyrics, an acquisition unit configured to acquire an image using the word or the phrase extracted by the analysis unit, and a display control unit configured to, during playback of the music, cause a display device to display the image acquired by the acquisition unit.
    Type: Application
    Filed: February 16, 2012
    Publication date: October 11, 2012
    Applicant: Sony Corporation
    Inventor: Motoki Tsunokawa
  • Publication number: 20120259635
    Abstract: A system for the storing of client information in an independent repository is disclosed. Client data may be uploaded by the client or by those authorized by the client, or collected and stored by the repository. Data about the client file, such as the time of upload and modifications, are stored in a metadata file associated with the client file.
    Type: Application
    Filed: April 5, 2012
    Publication date: October 11, 2012
    Inventors: Gregory J. Ekchian, Jack A. Ekchian
  • Publication number: 20120257786
    Abstract: There is provided a method that includes (a) receiving image data, (b) processing the image data to yield first data, (c) obtaining second data from a repository, based on the first data, and (d) storing the first data and the second data as a record in a database. There is also provided a method that includes (a) receiving image data, (b) processing the image data to yield first data and second data, (c) matching the first data to a record that is stored in a database, and (d) updating the record to include the second data. There is also provided a system that performs the methods, and a storage medium that contains instructions that control a processor to perform the methods.
    Type: Application
    Filed: April 5, 2012
    Publication date: October 11, 2012
    Applicant: THE DUN & BRADSTREET CORPORATION
    Inventor: Daniel Scott Camper
  • Publication number: 20120259636
    Abstract: Some embodiments relate to a method of performing a search for content on the Internet, in which a user may speak a search query and speech recognition may be performed on the spoken query to generate a text search query to be provided to a plurality of search engines. This enables a user to speak the search query rather than having to type it, and also allows the user to provide the search query only once, rather than having to provide it separately to multiple different search engines.
    Type: Application
    Filed: June 19, 2012
    Publication date: October 11, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: Vladimir Sejnoha, William F. Ganong, III, Paul J. Vozila, Nathan M. Bodenstab, Yik-Cheung Tam
  • Publication number: 20120253795
    Abstract: An audio commenting and publishing system including a storage database, media content and a computing device all coupled together via a network. The computing device comprises a processor and an application executed by the processor configured to input audio data that a user wishes to associate with the media content from an audio recording mechanism or a memory device. The application is then able to store the audio data on the storage database and use the network address of the audio data along with the network address of the media content to publish the audio data and the media content such that a viewer is able to hear and access them concurrently at a network-accessible location.
    Type: Application
    Filed: March 30, 2012
    Publication date: October 4, 2012
    Inventor: Christopher C. Andrews
  • Publication number: 20120253804
    Abstract: According to one embodiment, a voice processor includes: a storage module; a converter; a character string converter; a similarity calculator; and an output module. The storage module stores therein first character string information and a first phoneme symbol corresponding thereto in association with each other. The converter converts an input voice into a second phoneme symbol. The character string converter converts the second phoneme symbol into second character string information in which content of the voice is described in a natural language. The similarity calculator calculates similarity between the input voice and a portion of the first character string information stored in the storage module using at least one of the second phoneme symbol converted by the converter and the second character string information converted by the character string converter. The output module outputs the first character string information based on the similarity calculated by the similarity calculator.
    Type: Application
    Filed: December 16, 2011
    Publication date: October 4, 2012
    Applicant: Kabushiki Kaisha Toshiba
    Inventors: Chikashi Sugiura, Hiroshi Fujimura, Akinori Kawamura, Takashi Sudo
  • Publication number: 20120253803
    Abstract: According to embodiments, a voice inputting unit converts voice into a digital signal. The state detecting unit includes an acceleration sensor, and detects movement and/or a state of an equipment main body. The holding unit stores movement or state pattern models of predetermined movement or a state of the equipment main body and predetermined voice recognition process patterns corresponding to the models. The pattern detecting unit detects whether or not movement and/or a state of the equipment main body from the state detecting unit matches the movement or state pattern models stored in the holding unit, and detects a voice recognition process pattern corresponding to the matched model. The voice recognition process executing unit executes the voice recognition process on the digital signal output from the voice inputting unit according to the detected voice recognition process pattern.
    Type: Application
    Filed: November 2, 2011
    Publication date: October 4, 2012
    Inventors: Motonobu Sugiura, Hiroshi Fujimura
  • Publication number: 20120253801
    Abstract: A system, computer-readable medium, and method for automatically determining a topic of a conversation and responding to the topic determination are provided. In the method, an active topic is defined as a first topic in response to execution of an application. The first topic includes first text defining a plurality of phrases, a probability of occurrence associated with each of the plurality of phrases, and a response associated with each of the plurality of phrases. Speech text recognized from a recorded audio signal is received. Recognition of the speech text is based at least partially on the probability of occurrence associated with each of the plurality of phrases of the first topic. A phrase of the plurality of phrases associated with the received speech text is identified. The response associated with the identified phrase is performed by the computing device.
    Type: Application
    Filed: March 28, 2011
    Publication date: October 4, 2012
    Inventors: Chris Santos-Lang, Sumit Rana, Binit Mohanty, Rajeev Gangwar
  • Publication number: 20120245935
    Abstract: An electronic device includes a voice processing unit, a wireless communication unit, and a combining unit. The voice processing unit receives speech signals. The wireless communication unit sends the speech signals to a server. The server converts the speech signals into a text message. The wireless communication unit receives the text message from the server. The combining unit combines the text message and the speech signals into a combined message. The wireless communication unit further sends the combined message to a recipient. A related server is also provided.
    Type: Application
    Filed: June 30, 2011
    Publication date: September 27, 2012
    Applicants: HON HAI PRECISION INDUSTRY CO., LTD., FU TAI HUA INDUSTRY (SHENZHEN) CO., LTD.
    Inventors: Shih-Fang Wong, Tsung-Jen Chuang, Bo Zhang
  • Publication number: 20120245937
    Abstract: Tags, such as XML tags, are inserted into email to separate email content from signature blocks, privacy notices and confidentiality notices, and to separate original email messages from replies and replies from further replies. The tags are detected by a system that renders email as speech, such as a voice command platform or a network-based virtual assistant or message center. The system can render an original email message in one voice mode and the reply in a different voice mode. The tags can be inserted to identify a voice memo in which a user responds to a particular portion of an email message.
    Type: Application
    Filed: April 3, 2012
    Publication date: September 27, 2012
    Applicant: Sprint Spectrum L.P.
    Inventors: Balaji S. Thenthiruperai, Elizabeth Roche, Brian Landers, Jesse Kates
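On the rendering side, the tag structure might be consumed like this; the tag names and sample email are invented, since the abstract specifies XML-style tags but not a schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical tagged email separating body, signature, and reply.
EMAIL = """<email>
  <body>Lunch is moved to noon tomorrow.</body>
  <signature>Jane Doe, Example Corp.</signature>
  <reply>Sounds good, see you then.</reply>
</email>"""

def sections_to_speak(xml_text):
    """Return (tag, text) pairs so a text-to-speech renderer can pick a
    different voice mode per section, or skip boilerplate entirely."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text.strip()) for child in root]
```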
  • Publication number: 20120245934
    Abstract: A method of automatic speech recognition. An utterance is received from a user in reply to a text message, via a microphone that converts the reply utterance into a speech signal. The speech signal is processed using at least one processor to extract acoustic data from the speech signal. An acoustic model is identified from a plurality of acoustic models, using a conversational context associated with the text message, to decode the acoustic data. The acoustic data is decoded using the identified acoustic model to produce a plurality of hypotheses for the reply utterance.
    Type: Application
    Filed: March 25, 2011
    Publication date: September 27, 2012
    Applicant: GENERAL MOTORS LLC
    Inventors: Gaurav Talwar, Xufang Zhao
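The context-driven model selection above reduces to a lookup from conversational context to acoustic model, with a fallback when no context-specific model exists. A minimal sketch, with hypothetical context keys and model names:

```python
# Sketch of selecting an acoustic model based on the conversational
# context of the text message being replied to.
def select_acoustic_model(models: dict, context: str, default: str = "general"):
    """Pick the model registered for this context, else fall back."""
    return models.get(context, models[default])

models = {
    "general": "generic_acoustic_model",
    "yes_no_question": "confirmation_model",   # tuned for short yes/no replies
    "address_request": "street_address_model",
}
print(select_acoustic_model(models, "yes_no_question"))  # confirmation_model
```

The patent's contexts would come from analyzing the original text message; here the mapping is hard-coded for illustration.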
  • Publication number: 20120245936
    Abstract: A system, device, and method for capturing and temporally synchronizing different aspects of a conversation are presented. The method includes receiving an audible statement, receiving a note temporally corresponding to an utterance in the audible statement, creating a first temporal marker comprising temporal information related to the note, transcribing the utterance into a transcribed text, creating a second temporal marker comprising temporal information related to the transcribed text, and temporally synchronizing the audible statement, the note, and the transcribed text. Temporally synchronizing comprises associating a time point in the audible statement with the note using the first temporal marker, associating the time point in the audible statement with the transcribed text using the second temporal marker, and associating the note with the transcribed text using the first temporal marker and second temporal marker.
    Type: Application
    Filed: March 26, 2012
    Publication date: September 27, 2012
    Inventor: Bryan Treglia
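The temporal-marker scheme above can be sketched as a pair of timestamped records joined on their time offsets into the audio. The `Marker` structure, field names, and tolerance are all illustrative assumptions, not the patent's format.

```python
# Sketch: temporal markers tie a note and a transcript segment back to
# the same time point in the recorded audible statement.
from dataclasses import dataclass

@dataclass
class Marker:
    time_s: float   # offset into the audible statement, in seconds
    payload: str    # the note text or the transcribed text

def synchronize(note: Marker, transcript: Marker, tolerance_s: float = 1.0):
    """Associate a note with a transcript segment when their markers
    point at (approximately) the same moment in the audio."""
    if abs(note.time_s - transcript.time_s) <= tolerance_s:
        return {"time_s": note.time_s, "note": note.payload,
                "text": transcript.payload}
    return None

note = Marker(12.4, "ask about pricing")
text = Marker(12.1, "so what does the premium tier cost")
print(synchronize(note, text))
```

The joined record gives all three associations the abstract describes: audio-to-note, audio-to-text, and note-to-text.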
  • Publication number: 20120239395
    Abstract: A method for entering text in a text input field using a non-keyboard type accessory includes selecting a character for entry into the text field presented by a portable computing device. The portable computing device determines whether text suggestions are available based on the character. If text suggestions are available, the portable computing device can determine the text suggestions and send them to the accessory, which in turn can display the suggestions on a display. A user operating the accessory can select one of the text suggestions, expressly reject the text suggestions, or ignore the text suggestions. If a text suggestion is selected, the accessory can send the selected text to the portable computing device for populating the text field.
    Type: Application
    Filed: March 14, 2011
    Publication date: September 20, 2012
    Applicant: APPLE INC.
    Inventor: Edwin W. Foo
  • Publication number: 20120239397
    Abstract: A method of searching a digital ink database is disclosed. The digital ink database is associated with a specific author. The method starts by receiving a computer text query from an input device. The computer text query is then mapped to a set of feature vectors using a handwriting model of that specific author. As a result, the set of feature vectors approximates features that would have been extracted had that specific author written the computer text query by hand. Finally, the set of feature vectors is used to search the digital ink database.
    Type: Application
    Filed: May 29, 2012
    Publication date: September 20, 2012
    Inventors: Jonathon Leigh Napper, Paul Lapstun
  • Publication number: 20120239396
    Abstract: A method and system for operating a remotely controlled device may use multimodal remote control commands that include a gesture command and a speech command. The gesture command may be interpreted from a gesture performed by a user, while the speech command may be interpreted from speech utterances made by the user. The gesture and speech utterances may be simultaneously received by the remotely controlled device in response to displaying a user interface configured to receive multimodal commands.
    Type: Application
    Filed: March 15, 2011
    Publication date: September 20, 2012
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Michael James Johnston, Marcelo Worsley
  • Publication number: 20120232897
    Abstract: A user can locate products by dialing a number from any phone and accessing an automatic voice recognition system. Reply is made to the user with information locating the product using a store's product location data converted to automatic voice responses. Smart phone and mobile web access to a product database is enabled using voice-to-text and text search. A taxonomy enables product search requests by product descriptions and/or product brand names, and enables synonyms and phonetic enhancements to the system. Search results are related to products and product categories with concise organization. Relevant advertisements, promotional offers and coupons are delivered based upon search and taxonomy elements. Search requests generate dynamic interior maps of a product's location inside the shopper's location, assisting a shopper to efficiently shop the location for listed items. Business intelligence of product categories enables rapid scaling across retail segments.
    Type: Application
    Filed: May 1, 2012
    Publication date: September 13, 2012
    Inventors: Nathan Pettyjohn, Matthew Kulig, Niarcas Jeffrey, Edward Saunders
  • Publication number: 20120232898
    Abstract: The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.
    Type: Application
    Filed: May 21, 2012
    Publication date: September 13, 2012
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Giuseppe Di Fabbrizio, Dilek Z. Hakkani-Tur, Mazin G. Rahim, Bernard S. Renger, Gokhan Tur
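The two-threshold decision in the abstract above (accept for training, re-prompt, or transfer to a human) can be sketched directly. The threshold values here are illustrative, not from the patent.

```python
# Sketch of the accept / re-prompt / transfer decision driven by the
# spoken language understanding classifier's confidence.
def route(classifier_confidence: float,
          accept_threshold: float = 0.8,
          reject_threshold: float = 0.3):
    if classifier_confidence >= accept_threshold:
        return "accept"           # use the utterance for training
    if classifier_confidence >= reject_threshold:
        return "reprompt"         # not understood well enough: ask again
    return "transfer_to_human"    # likely a task-specific utterance

print(route(0.9))   # accept
print(route(0.5))   # reprompt
print(route(0.1))   # transfer_to_human
```

Utterances routed to `"accept"` would feed the training set for the spoken dialog system, as the abstract describes.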
  • Publication number: 20120219271
    Abstract: A method and system for producing video-segments of a live-action event involving monitoring a live-action event for detection of event-segments, detecting one or more event-triggers with detectors, determining if an event-segment occurred based on the detected event-triggers, and editing one or more video feeds into a video-segment to encompass the event-segment.
    Type: Application
    Filed: May 19, 2011
    Publication date: August 30, 2012
    Applicant: ON DEMAND REAL TIME LLC
    Inventors: Douglas W. VUNIC, Eric HOFFERT, David GESSEL
  • Publication number: 20120221331
    Abstract: A system and method provides a natural language interface to world-wide web content. Either in advance or dynamically, webpage content is parsed using a parsing algorithm. A person using a telephone interface can provide speech information, which is converted to text and used to automatically fill in input fields on a webpage form. The form is then submitted to a database search and a response is generated. Information contained on the responsive webpage is extracted and converted to speech via a text-to-speech engine and communicated to the person.
    Type: Application
    Filed: May 7, 2012
    Publication date: August 30, 2012
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Srinivas BANGALORE, Mazin G. Rahim, Junlan Feng
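The form-filling step above (speech converted to text, then slotted into a webpage form's input fields) can be sketched with a naive keyword matcher. The field names and the "field value" phrasing convention are hypothetical; the patent's parsing algorithm is not specified in the abstract.

```python
# Sketch: map recognized speech text onto a webpage form's input fields
# before submitting the form as a database query.
import re

def fill_form(fields, speech_text):
    """Fill each form field whose name is mentioned as 'field value'
    in the recognized text, e.g. 'city Boston'."""
    filled = {}
    for field in fields:
        m = re.search(rf"{field}\s+(\w+)", speech_text, re.I)
        if m:
            filled[field] = m.group(1)
    return filled

print(fill_form(["city", "date"], "find flights city Boston date Friday"))
```

The resulting dictionary would be submitted as the form payload; the responsive page's text would then go back through a text-to-speech engine.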
  • Publication number: 20120221330
    Abstract: A voice activity detection (VAD) module analyzes a media file, such as an audio file or a video file, to determine whether one or more frames of the media file include speech. A speech recognizer generates feedback relating to an accuracy of the VAD determination. The VAD module leverages the feedback to improve subsequent VAD determinations. The VAD module also utilizes a look-ahead window associated with the media file to adjust estimated probabilities or VAD decisions for previously processed frames.
    Type: Application
    Filed: February 25, 2011
    Publication date: August 30, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Albert Joseph Kishan Thambiratnam, Weiwu Zhu, Frank Torsten Bernd Seide
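The look-ahead adjustment above can be sketched as smoothing each frame's speech probability with the frames that follow it before thresholding into speech/non-speech decisions. The window size, threshold, and averaging rule are illustrative assumptions.

```python
# Sketch: a look-ahead window lets the VAD revise a frame's decision
# using estimated probabilities from frames that come after it.
def smooth_with_lookahead(probs, window=2):
    """Average each frame's speech probability with up to `window`
    future frames, then threshold into speech/non-speech decisions."""
    smoothed = []
    for i in range(len(probs)):
        future = probs[i:i + window + 1]
        smoothed.append(sum(future) / len(future))
    return [p >= 0.5 for p in smoothed]

# An isolated low-probability frame surrounded by speech gets rescued.
frame_probs = [0.9, 0.2, 0.8, 0.9, 0.1, 0.1]
print(smooth_with_lookahead(frame_probs))
```

The recognizer feedback loop in the abstract would additionally tune such parameters over time; that part is omitted here.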
  • Publication number: 20120221332
    Abstract: Systems, methods, and non-transitory computer-readable media for referring to entities. The method includes receiving domain-specific training data of sentences describing a target entity in a context, extracting a speaker history and a visual context from the training data, selecting attributes of the target entity based on at least one of the speaker history, the visual context, and speaker preferences, generating a text expression referring to the target entity based on at least one of the selected attributes, the speaker history, and the context, and outputting the generated text expression. A weighted finite-state automaton can represent partial orderings of word pairs in the domain-specific training data. The weighted finite-state automaton can be speaker specific or speaker independent, and can include a set of weighted partial orderings of the training data for each possible realization.
    Type: Application
    Filed: May 7, 2012
    Publication date: August 30, 2012
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Giuseppe Di Fabbrizio, Srinivas Bangalore, Amanda Stent
  • Publication number: 20120215532
    Abstract: Broadly speaking, the embodiments disclosed herein describe an apparatus, system, and method that allows a user of a hearing assistance system to perceive consistent human speech. The consistent human speech can be based upon user specific preferences.
    Type: Application
    Filed: February 22, 2011
    Publication date: August 23, 2012
    Applicant: APPLE INC.
    Inventors: Edwin W. Foo, Gregory F. Hughes
  • Publication number: 20120215534
    Abstract: A vehicle communication system includes a computer processor in communication with a memory circuit, a transceiver in communication with the processor and operable to communicate with one or more wireless devices, and one or more storage locations storing one or more pieces of emergency contact information. In this illustrative system, the processor is operable to establish communication with a first wireless device through the transceiver. Upon detection of an emergency event by at least one vehicle based sensor system, the vehicle communication system is operable to contact an emergency operator. The vehicle communication system is further operable to display one or more of the one or more pieces of emergency contact information in a selectable manner. Upon selection of one of the one or more pieces of emergency contact information, the vehicle computing system places a call to a phone number associated with the selected emergency contact.
    Type: Application
    Filed: April 30, 2012
    Publication date: August 23, 2012
    Applicant: FORD GLOBAL TECHNOLOGIES, LLC
    Inventor: David Anthony Hatton
  • Publication number: 20120215533
    Abstract: A method of and system for error correction in multiple input modality search engines is presented. A method of processing input information based on an information type of the input information includes receiving input information for performing a search for identifying at least one item desired by a user and determining an information type associated with the input information. The method also includes forming a query input for identifying the at least one item desired by the user based on the input information and on the information type. The method further includes submitting the query input to at least one search engine system.
    Type: Application
    Filed: January 25, 2012
    Publication date: August 23, 2012
    Applicant: Veveo, Inc.
    Inventors: Murali Aravamudan, Pankaj Garg, Rakesh Barve, Ajit Rajasekharan
  • Publication number: 20120209605
    Abstract: Retrieving data from audio interactions associated with an organization. Retrieving the data comprises: receiving a corpus containing interactions; performing natural language processing on a text document representing an interaction from the corpus; extracting at least one keyphrase from the text document; assigning a rank to the at least one keyphrase; modeling relations between at least two keyphrases using the rank; and identifying topics relevant for the organization from the relations.
    Type: Application
    Filed: February 14, 2011
    Publication date: August 16, 2012
    Applicant: Nice Systems Ltd.
    Inventors: Eyal Hurvitz, Maya Gorodetsky, Ezra Daya, Oren Pereg
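The pipeline above (extract keyphrases, rank them, model relations between them) can be sketched with frequency-based ranking and co-occurrence relations. The scoring is illustrative; the patent's actual ranking and relation-modeling methods are not specified in the abstract.

```python
# Sketch: rank extracted keyphrases across a corpus of interactions and
# relate phrases that co-occur within the same interaction.
from collections import Counter
from itertools import combinations

def rank_keyphrases(interactions):
    """interactions: list of keyphrase lists, one list per interaction."""
    ranks = Counter(p for doc in interactions for p in set(doc))
    relations = Counter()
    for doc in interactions:
        for a, b in combinations(sorted(set(doc)), 2):
            relations[(a, b)] += 1
    return ranks, relations

calls = [["billing", "refund"], ["billing", "cancel"], ["billing", "refund"]]
ranks, relations = rank_keyphrases(calls)
print(ranks.most_common(1))              # [('billing', 3)]
print(relations[("billing", "refund")])  # 2
```

Strongly related high-rank phrases (here, "billing" with "refund") would surface as candidate topics relevant for the organization.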