Pattern Display Patents (Class 704/276)
  • Patent number: 9495955
    Abstract: Features are disclosed for generating acoustic models from an existing corpus of data. Methods for generating the acoustic models can include receiving at least one characteristic of a desired acoustic model, selecting training utterances corresponding to the characteristic from a corpus comprising audio data and corresponding transcription data, and generating an acoustic model based on the selected training utterances.
    Type: Grant
    Filed: January 2, 2013
    Date of Patent: November 15, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Frederick Victor Weber, Jeffrey Penrod Adams
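The selection step described in this abstract can be reduced to a metadata filter over a corpus of transcribed utterances. The sketch below is illustrative only, not Amazon's implementation; the `meta` field, the `(key, value)` encoding of a characteristic, and the stand-in "model" are all assumptions.

```python
def select_training_utterances(corpus, characteristic):
    """Pick corpus entries whose metadata matches the desired characteristic."""
    key, value = characteristic
    return [u for u in corpus if u.get("meta", {}).get(key) == value]

def build_acoustic_model(corpus, characteristic):
    """Stand-in for real training: record which transcripts would be used."""
    selected = select_training_utterances(corpus, characteristic)
    return {
        "characteristic": characteristic,
        "transcripts": [u["transcript"] for u in selected],
    }
```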
  • Patent number: 9412363
Abstract: A model-based approach for on-screen item selection and disambiguation is provided. An utterance may be received by a computing device in response to a display of a list of items for selection on a display screen. A disambiguation model may then be applied to the utterance. The disambiguation model may be utilized to determine whether the utterance is directed to at least one of the list of displayed items, extract referential features from the utterance and identify an item from the list corresponding to the utterance, based on the extracted referential features. The computing device may then perform an action which includes selecting the identified item associated with the utterance.
    Type: Grant
    Filed: March 3, 2014
    Date of Patent: August 9, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ruhi Sarikaya, Fethiye Asli Celikyilmaz, Zhaleh Feizollahi, Larry Paul Heck, Dilek Z. Hakkani-Tur
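Two of the referential features this abstract names, positional references and lexical overlap with item labels, can be sketched in a few lines. This is a toy reduction under my own assumptions (the `ORDINALS` table and overlap scoring are invented for illustration), not Microsoft's disambiguation model.

```python
ORDINALS = {"first": 0, "second": 1, "third": 2, "fourth": 3, "last": -1}

def resolve_selection(utterance, items):
    """Return the index of the displayed item the utterance most likely
    refers to, or None if the utterance does not target the list."""
    words = utterance.lower().split()
    # Positional reference: "play the second one"
    for w in words:
        if w in ORDINALS:
            return ORDINALS[w] % len(items)
    # Lexical overlap between the utterance and each item's label
    scores = []
    for i, item in enumerate(items):
        overlap = len(set(item.lower().split()) & set(words))
        scores.append((overlap, i))
    best_score, best_i = max(scores)
    return best_i if best_score > 0 else None
```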
  • Patent number: 9408572
    Abstract: Support structures for positioning sensors on a physiologic tunnel for measuring physical, chemical and biological parameters of the body and to produce an action according to the measured value of the parameters. The support structure includes a sensor fitted on the support structures using a special geometry for acquiring continuous and undisturbed data on the physiology of the body. Signals are transmitted to a remote station by wireless transmission such as by electromagnetic waves, radio waves, infrared, sound and the like or by being reported locally by audio or visual transmission. The physical and chemical parameters include brain function, metabolic function, hydrodynamic function, hydration status, levels of chemical compounds in the blood, and the like. The support structure includes patches, clips, eyeglasses, head mounted gear and the like, containing passive or active sensors positioned at the end of the tunnel with sensing systems positioned on and accessing a physiologic tunnel.
    Type: Grant
    Filed: April 15, 2015
    Date of Patent: August 9, 2016
    Assignee: GEELUX HOLDINGS, LTD.
    Inventor: Marcio Marc Abreu
  • Patent number: 9384736
Abstract: Techniques disclosed herein include systems and methods for managing user interface responses to user input including spoken queries and commands. This includes providing incremental user interface (UI) response based on multiple recognition results about user input that are received with different delays. Such techniques include providing an initial response to a user at an early time, before remote recognition results are available. Systems herein can respond incrementally by initiating an initial UI response based on first recognition results, and then modify the initial UI response after receiving secondary recognition results. Since an initial response begins immediately, instead of waiting for results from all recognizers, the delay perceived by the user before complete results are rendered is reduced.
    Type: Grant
    Filed: August 21, 2012
    Date of Patent: July 5, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Martin Labsky, Tomas Macek, Ladislav Kunc, Jan Kleindienst
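The incremental-response pattern above, render the fast local result immediately, then revise when the slower remote result arrives, can be sketched as a small state machine. The class and method names are hypothetical; real systems would drive this from asynchronous recognizer callbacks.

```python
class IncrementalUI:
    """Show a quick local recognition result immediately, then revise it
    when the slower (typically more accurate) remote result arrives."""

    def __init__(self):
        self.displayed = None   # what the user currently sees
        self.history = []       # every rendering, in order

    def _render(self, text):
        self.displayed = text
        self.history.append(text)

    def on_local_result(self, text):
        # Initial response: shown before any remote result exists.
        self._render(text)

    def on_remote_result(self, text):
        # Modify the initial response only if the remote recognizer disagrees.
        if text != self.displayed:
            self._render(text)
```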
  • Patent number: 9330720
    Abstract: Methods, systems and computer readable media for altering an audio output are provided. In some embodiments, the system may change the original frequency content of an audio data file to a second frequency content so that a recorded audio track will sound as if a different person had recorded it when it is played back. In other embodiments, the system may receive an audio data file and a voice signature, and it may apply the voice signature to the audio data file to alter the audio output of the audio data file. In that instance, the audio data file may be a textual representation of a recorded audio data file.
    Type: Grant
    Filed: April 2, 2008
    Date of Patent: May 3, 2016
    Assignee: Apple Inc.
    Inventor: Michael M. Lee
  • Patent number: 9301719
    Abstract: Support structures for positioning sensors on a physiologic tunnel for measuring physical, chemical and biological parameters of the body and to produce an action according to the measured value of the parameters. The support structure includes a sensor fitted on the support structures using a special geometry for acquiring continuous and undisturbed data on the physiology of the body. Signals are transmitted to a remote station by wireless transmission such as by electromagnetic waves, radio waves, infrared, sound and the like or by being reported locally by audio or visual transmission. The physical and chemical parameters include brain function, metabolic function, hydrodynamic function, hydration status, levels of chemical compounds in the blood, and the like. The support structure includes patches, clips, eyeglasses, head mounted gear and the like, containing passive or active sensors positioned at the end of the tunnel with sensing systems positioned on and accessing a physiologic tunnel.
    Type: Grant
    Filed: February 13, 2015
    Date of Patent: April 5, 2016
    Assignee: GEELUX HOLDINGS, LTD.
    Inventor: Marcio Marc Abreu
  • Patent number: 9286708
    Abstract: An information device includes an image receiving unit receiving an information terminal image having a specific region composed of pixels having a same feature value from an information terminal, the feature value being luminance or chromaticity; a specific region detecting unit detecting the specific region within the information terminal image, based on feature values of pixels composing the information terminal image received by the image receiving unit; an information device image creating unit creating an information device image related to a function provided to the information device; a composite image creating unit creating a composite image where the information device image created by the information device image creating unit is embedded in the specific region detected by the specific region detecting unit within the information terminal image; and a display control unit displaying the composite image created by the composite image creating unit on a display apparatus.
    Type: Grant
    Filed: February 10, 2015
    Date of Patent: March 15, 2016
    Assignee: JVC KENWOOD Corporation
    Inventor: Hiroaki Takanashi
  • Patent number: 9159338
    Abstract: Systems and methods of rendering a textual animation are provided. The methods include receiving an audio sample of an audio signal that is being rendered by a media rendering source. The methods also include receiving one or more descriptors for the audio signal based on at least one of a semantic vector, an audio vector, and an emotion vector. Based on the one or more descriptors, a client device may render the textual transcriptions of vocal elements of the audio signal in an animated manner. The client device may further render the textual transcriptions of the vocal elements of the audio signal to be substantially in synchrony to the audio signal being rendered by the media rendering source. In addition, the client device may further receive an identification of a song corresponding to the audio sample, and may render lyrics of the song in an animated manner.
    Type: Grant
    Filed: December 3, 2010
    Date of Patent: October 13, 2015
    Assignee: Shazam Entertainment Ltd.
    Inventors: Rahul Powar, Avery Li-Chun Wang
  • Patent number: 9076347
    Abstract: A system and methods for analyzing pronunciation, detecting errors and providing automatic feedback to help non-native speakers improve pronunciation of a foreign language is provided that employs publicly available, high accuracy third-party automatic speech recognizers available via the Internet to analyze and identify mispronunciations.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: July 7, 2015
    Assignee: Better Accent, LLC
    Inventors: Julia Komissarchik, Edward Komissarchik
  • Patent number: 9069391
    Abstract: A method for inputting a Korean character using a touch screen of a mobile device determines a vowel as a neutral vowel according to multi-touches centered around a consonant input key displayed on the touch screen. The method can minimize the number of character input keys arranged on the touch screen utilized in the mobile device, and can combine the Korean characters through the minimal touch action for inputting the Korean character.
    Type: Grant
    Filed: November 4, 2010
    Date of Patent: June 30, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Sung-Jae Hwang
  • Patent number: 9037470
    Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script. In yet still further aspects of the invention, the duration of a given interaction can be analyzed, either apart from or in combination with the script compliance analysis above, to seek to identify instances of agent non-compliance, of fraud, or of quality-analysis issues.
    Type: Grant
    Filed: June 25, 2014
    Date of Patent: May 19, 2015
    Assignee: West Business Solutions, LLC
    Inventors: Mark J. Pettay, Fonda J. Narke
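The two checks this abstract combines, script coverage from an ASR transcript and a duration sanity check, can be sketched as plain functions. This is a simplified reading under my own assumptions (substring matching, a fixed tolerance), not West's actual analysis.

```python
def script_compliance(transcript, required_phrases, ):
    """Return (coverage, missing) for required script phrases found in an
    ASR transcript of the agent's side of the call."""
    text = transcript.lower()
    missing = [p for p in required_phrases if p.lower() not in text]
    coverage = 1 - len(missing) / len(required_phrases)
    return coverage, missing

def flag_duration(duration_s, expected_s, tolerance=0.5):
    """Flag calls far shorter or longer than expected, a possible sign of
    non-compliance, fraud, or a quality-analysis issue."""
    return abs(duration_s - expected_s) > tolerance * expected_s
```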
  • Patent number: 9031828
    Abstract: Various embodiments described herein facilitate multi-lingual communications. The systems and methods of some embodiments may enable multi-lingual communications through different modes of communications including, for example, Internet-based chat, e-mail, text-based mobile phone communications, postings to online forums, postings to online social media services, and the like. Certain embodiments may implement communications systems and methods that translate text between two or more languages (e.g., spoken), while handling/accommodating for one or more of the following in the text: specialized/domain-related jargon, abbreviations, acronyms, proper nouns, common nouns, diminutives, colloquial words or phrases, and profane words or phrases.
    Type: Grant
    Filed: March 18, 2014
    Date of Patent: May 12, 2015
    Assignee: Machine Zone, Inc.
    Inventors: Gabriel Leydon, Francois Orsini, Nikhil Bojja, Shailen Karur
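One concrete way to "handle/accommodate" abbreviations and jargon before translation is to normalize them into plain language first. The table below and the function name are invented for illustration; a production system would use per-domain dictionaries and context.

```python
# Hypothetical chatspeak table; real systems would carry per-domain jargon.
CHAT_ABBREVIATIONS = {
    "brb": "be right back",
    "gg": "good game",
    "imo": "in my opinion",
}

def normalize_for_translation(message, custom_terms=None):
    """Expand abbreviations so a downstream machine-translation step sees
    plain language instead of chatspeak."""
    table = dict(CHAT_ABBREVIATIONS, **(custom_terms or {}))
    return " ".join(table.get(tok.lower(), tok) for tok in message.split())
```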
  • Patent number: 9026449
    Abstract: The invention relates to a communication system having a display unit (2) and a virtual being (3) that can be visually represented on the display unit (2) and that is designed for communication by means of natural speech with a natural person, wherein at least one interaction symbol (6, 7) that can be represented on the display unit (2) and by means of which the natural speech dialog between the virtual being (3) and the natural person is supported such that an achieved dialog state can be indicated and/or additional information depending on the dialog state achieved and/or information can be redundantly invoked. The invention further relates to a method for representing information of a communication between a virtual being and a natural person.
    Type: Grant
    Filed: May 15, 2009
    Date of Patent: May 5, 2015
    Assignee: Audi AG
    Inventors: Stefan Sellschopp, Valentin Nicolescu, Helmut Krcmar
  • Patent number: 8994522
    Abstract: The described method and system provide for HMI steering for a telematics-equipped vehicle based on likelihood to exceed eye glance guidelines. By determining whether a task is likely to cause the user to exceed eye glance guidelines, alternative HMI processes may be presented to a user to reduce ASGT and EORT and increase compliance with eye glance guidelines. By allowing a user to navigate through long lists of items through vocal input, T9 text input, or heuristic processing rather than through conventional presentation of the full list, a user is much more likely to comply with the eye glance guidelines. This invention is particularly useful in contexts where users may be searching for one item out of a plurality of potential items, for example, within the context of hands-free calling contacts, playing back audio files, or finding points of interest during GPS navigation.
    Type: Grant
    Filed: May 26, 2011
    Date of Patent: March 31, 2015
    Assignees: General Motors LLC, GM Global Technology Operations LLC
    Inventors: Steven C. Tengler, Bijaya Aryal, Scott P. Geisler, Michael A. Wuergler
  • Patent number: 8990093
    Abstract: Methods and arrangements for visually representing audio content in a voice application. A display is connected to a voice application, and an image is displayed on the display, the image comprising a main portion and at least one subsidiary portion, the main portion representing a contextual entity of the audio content and the at least one subsidiary portion representing at least one participatory entity of the audio content. The at least one subsidiary portion is displayed without text, and the image is changed responsive to changes in audio content in the voice application.
    Type: Grant
    Filed: August 29, 2012
    Date of Patent: March 24, 2015
    Assignee: International Business Machines Corporation
    Inventors: Amit Anil Nanavati, Nitendra Rajput
  • Patent number: 8983841
    Abstract: A network communication node includes an audio outputter that outputs an audible representation of data to be provided to a requester. The network communication node also includes a processor that determines a categorization of the data to be provided to the requester and that varies a pause between segments of the audible representation of the data in accordance with the categorization of the data to be provided to the requester.
    Type: Grant
    Filed: July 15, 2008
    Date of Patent: March 17, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Gregory Pulz, Steven Lewis, Charles Rajnai
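Varying the pause between audible segments by data category can be sketched as a lookup plus a schedule. The category names and millisecond values here are assumptions, not figures from the patent; the idea is only that a digit string is read with longer gaps than ordinary prose.

```python
# Hypothetical pause lengths (ms) per data category.
PAUSE_MS = {"digits": 600, "address": 400, "prose": 150}

def schedule_audio(segments, category):
    """Return (segment, pause_after_ms) pairs for the audible rendering,
    with the pause chosen from the data's categorization."""
    pause = PAUSE_MS.get(category, PAUSE_MS["prose"])
    return [(seg, pause) for seg in segments]
```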
  • Patent number: 8983849
    Abstract: Systems and methods for intelligent language models that can be used across multiple devices are provided. Some embodiments provide for a client-server system for integrating change events from each device running a local language processing system into a master language model. The change events can be integrated, not only into the master model, but also into each of the other local language models. As a result, some embodiments enable restoration to new devices as well as synchronization of usage across multiple devices. In addition, real-time messaging can be used on selected messages to ensure that high priority change events are updated quickly across all active devices. Using a subscription model driven by a server infrastructure, utilization logic on the client side can also drive selective language model updates.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: March 17, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Andrew Phillips, David Kay, Erland Unruh, Eric Jun Fu
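The change-event flow described above, integrate each device's edits into a master model and fan them out to every *other* device, can be sketched with unigram counts standing in for the language model. The class shape is my own; the patent's models, messaging, and subscription logic are far richer.

```python
class MasterLanguageModel:
    """Integrate word-usage change events from each device into a master
    unigram model and propagate them to the other registered devices."""

    def __init__(self):
        self.counts = {}    # master model: word -> count
        self.devices = {}   # device_id -> local model copy

    def register(self, device_id):
        self.devices[device_id] = {}

    def apply_change_event(self, device_id, word, delta):
        # Update the master model...
        self.counts[word] = self.counts.get(word, 0) + delta
        # ...and every other device's local model (the originator
        # already has the change).
        for dev, local in self.devices.items():
            if dev != device_id:
                local[word] = local.get(word, 0) + delta
```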
  • Patent number: 8972240
    Abstract: An “Interactive Word Lattice” provides a user interface for interacting with and selecting user-modifiable paths through a lattice-based representation of alternative suggested text segments in response to a user's text segment input, such as phrases, sentences, paragraphs, entire documents, etc. More specifically, the user input is provided to a trained paraphrase generation model that returns a plurality of alternative text segments having the same or similar meaning as the original user input. An interactive graphical lattice-based representation of the alternative text segments is then presented to the user. One or more words of each alternative text segment represents a “node” of the lattice, while each connection between nodes represents a lattice “edge.” Both nodes and edges are user modifiable. Each possible path through the lattice corresponds to a different alternative text segment. Users select a path through the lattice to choose an alternative to the original input.
    Type: Grant
    Filed: August 18, 2011
    Date of Patent: March 3, 2015
    Assignee: Microsoft Corporation
    Inventors: Christopher John Brockett, William Brennan Dolan
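The core claim, that each path through the lattice corresponds to one alternative text segment, can be illustrated by enumerating paths through a word DAG. This sketch uses single words as nodes and a plain adjacency dict; the patent's interactive editing of nodes and edges is out of scope here.

```python
def lattice_paths(edges, start, end):
    """Enumerate every alternative text segment encoded by a word lattice.
    `edges` maps each node (word) to the nodes reachable from it."""
    if start == end:
        return [[end]]
    paths = []
    for nxt in edges.get(start, []):
        for tail in lattice_paths(edges, nxt, end):
            paths.append([start] + tail)
    return paths
```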
  • Patent number: 8972259
    Abstract: A method and system for teaching non-lexical speech effects includes delexicalizing a first speech segment to provide a first prosodic speech signal and data indicative of the first prosodic speech signal is stored in a computer memory. The first speech segment is audibly played to a language student and the student is prompted to recite the speech segment. The speech uttered by the student in response to the prompt is recorded.
    Type: Grant
    Filed: September 9, 2010
    Date of Patent: March 3, 2015
    Assignee: Rosetta Stone, Ltd.
    Inventors: Joseph Tepperman, Theban Stanley, Kadri Hacioglu
  • Patent number: 8959024
    Abstract: Methods and arrangements for visually representing audio content in a voice application. A display is connected to a voice application, and an image is displayed on the display, the image comprising a main portion and at least one subsidiary portion, the main portion representing a contextual entity of the audio content and the at least one subsidiary portion representing at least one participatory entity of the audio content. The at least one subsidiary portion is displayed without text, and the image is changed responsive to changes in audio content in the voice application.
    Type: Grant
    Filed: August 24, 2011
    Date of Patent: February 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Amit Anil Nanavati, Nitendra Rajput
  • Patent number: 8949134
    Abstract: A diagnostic tool for speech recognition applications is provided, which enables an administrator to collect multiple recorded speech sessions. The administrator can then search for various failure points common to one or more of the recorded sessions in order to get a list of all sessions that have the same failure points. The invention allows the administrator to play back the session or replay any portion of the session to see the flow of the application and the recorded utterances. The invention provides the administrator with information about how to maximize the efficiency of the application, which enables the administrator to edit the application to avoid future failure points.
    Type: Grant
    Filed: September 13, 2004
    Date of Patent: February 3, 2015
    Assignee: Avaya Inc.
    Inventors: Jacob Levine, John Muller, Christopher Passaretti, Wu Chingfa
  • Publication number: 20150032460
    Abstract: A terminal and speech-recognized text edit method edit text input through a writing recognition or speech recognition function efficiently. The text edit method includes displaying at least one letter input through speech recognition; detecting one of touch and speech inputs; analyzing the detected input; and performing a certain operation corresponding to the at least one letter based on the analysis result. The terminal and speech-recognized text edit method are advantageous in editing misrecognized speech-input text efficiently through finger or pen gesture-based or speech recognition-based input.
    Type: Application
    Filed: July 24, 2013
    Publication date: January 29, 2015
    Applicant: Samsung Electronics Co., Ltd
    Inventors: Sangki Kang, Kyungtae Kim
  • Patent number: 8942987
    Abstract: A clear picture of who is speaking in a setting where there are multiple input sources (e.g., a conference room with multiple microphones) can be obtained by comparing input channels against each other. The data from each channel can not only be compared, but can also be organized into portions which logically correspond to statements by a user. These statements, along with information regarding who is speaking, can be presented in a user friendly format via an interactive timeline which can be updated in real time as new audio input data is received.
    Type: Grant
    Filed: March 21, 2014
    Date of Patent: January 27, 2015
    Assignee: Jefferson Audio Video Systems, Inc.
    Inventors: Matthew David Bader, Nathan David Cole
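Comparing channels against each other to decide who is speaking can be sketched with per-frame energy values: the loudest channel wins each frame, and consecutive frames from the same channel collapse into one statement on the timeline. Frame energies and the function names are assumptions for illustration.

```python
def active_channel(frames):
    """Given one energy sample per microphone channel for a time frame,
    pick the channel (speaker) with the strongest signal."""
    return max(range(len(frames)), key=lambda i: frames[i])

def build_timeline(frame_series, speakers):
    """Collapse consecutive frames won by the same channel into statements,
    yielding a speaker timeline that could be updated in real time."""
    timeline = []
    for frames in frame_series:
        who = speakers[active_channel(frames)]
        if timeline and timeline[-1] == who:
            continue  # same speaker still talking
        timeline.append(who)
    return timeline
```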
  • Patent number: 8892442
    Abstract: Disclosed herein are systems, methods, and computer readable-media for answering a communication notification. The method for answering a communication notification comprises receiving a notification of communication from a user, converting information related to the notification to speech, outputting the information as speech to the user, and receiving from the user an instruction to accept or ignore the incoming communication associated with the notification. In one embodiment, information related to the notification comprises one or more of a telephone number, an area code, a geographic origin of the request, caller id, a voice message, address book information, a text message, an email, a subject line, an importance level, a photograph, a video clip, metadata, an IP address, or a domain name. Another embodiment involves notification assigned an importance level and repeat attempts at notification if it is of high importance.
    Type: Grant
    Filed: February 17, 2014
    Date of Patent: November 18, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst J. Schroeter
  • Patent number: 8886540
    Abstract: A method and system for entering information into a software application resident on a mobile communication facility is provided. The method and system may include recording speech presented by a user using a mobile communication facility resident capture facility, transmitting the recording through a wireless communication facility to a speech recognition facility, transmitting information relating to the software application to the speech recognition facility, generating results utilizing the speech recognition facility using an unstructured language model based at least in part on the information relating to the software application and the recording, transmitting the results to the mobile communications facility, loading the results into the software application and simultaneously displaying the results as a set of words and as a set of application results based on those words.
    Type: Grant
    Filed: August 1, 2008
    Date of Patent: November 11, 2014
    Assignee: Vlingo Corporation
    Inventors: Joseph P. Cerra, John N. Nguyen, Michael S. Phillips, Han Shu, Alexandra Beth Mischke
  • Patent number: 8880397
    Abstract: Exemplary embodiments provide systems, devices and methods that allow creation and management of lists of items in an integrated manner on an interactive graphical user interface. A user may speak a plurality of list items in a natural unbroken manner to provide an audio input stream into an audio input device. Exemplary embodiments may automatically process the audio input stream to convert the stream into a text output, and may process the text output into one or more n-grams that may be used as list items to populate a list on a user interface.
    Type: Grant
    Filed: October 21, 2011
    Date of Patent: November 4, 2014
    Assignee: Wal-Mart Stores, Inc.
    Inventors: Dion Almaer, Bernard Paul Cousineau, Ben Galbraith
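The step of turning one unbroken spoken stream into discrete list items can be sketched by splitting the transcribed text on common spoken separators. The separator set is an assumption; the patent also mentions n-gram processing, which this toy version reduces to multi-word item spans.

```python
def text_to_list_items(transcript, separators=("and", "then", "also")):
    """Split a naturally spoken stream such as 'milk and eggs and two
    apples' into list items on common spoken separators."""
    items, current = [], []
    for word in transcript.lower().split():
        if word in separators:
            if current:
                items.append(" ".join(current))
            current = []
        else:
            current.append(word)
    if current:
        items.append(" ".join(current))
    return items
```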
  • Patent number: 8874443
    Abstract: Embodiments of a dialog system that employs a corpus-based approach to generate responses based on a given number of semantic constraint-value pairs are described. The system makes full use of the data from the user input to produce dialog system responses in combination with a template generator. The system primarily utilizes constraint values in order to realize efficiencies based on the more frequent tasks performed in real dialog systems although rhetorical or discourse aspects of the dialog could also be included in a similar way, that is, labeling the data with such information and performing a training process. The benefits of this system include higher quality user-aligned responses, broader coverage, faster response time, and shorter development cycles.
    Type: Grant
    Filed: August 27, 2008
    Date of Patent: October 28, 2014
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Laura Stoia, Junling Hu, Zhe Feng, Junkuo Cao
  • Patent number: 8862593
    Abstract: A system and method for creating, managing, and publishing audio microposts is provided. An audio micropost comprises a short audio segment recorded and/or captured based on voice, speech, and/or other sound, which may be shared with and/or published to subscribers and/or other users. The system may enable creating a discussion and playlist based on the audio microposts. The discussion may be generated by identifying and/or selecting an audio micropost that may pose a question and/or topic for a discussion and/or debate. The system may further enable granting the ability to participate in the discussion to a selected group of participants. The playlist of audio microposts may be created by adding individual posts into the playlist and/or by using hashtags and/or keywords to search for audio microposts of interest.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: October 14, 2014
    Assignee: Sowt International Ltd.
    Inventor: Hazem Zureiqat
  • Patent number: 8860883
    Abstract: A method and apparatus are disclosed for providing a video signature representative of a content of a video signal. A method and apparatus are further disclosed for providing an audio signature representative of a content of an audio signal. A method and apparatus for detecting lip sync are further disclosed and take advantage of the method and apparatus disclosed for providing a video signature and an audio signature.
    Type: Grant
    Filed: November 30, 2009
    Date of Patent: October 14, 2014
    Assignee: Miranda Technologies Partnership
    Inventor: Pascal Carrières
  • Patent number: 8856008
    Abstract: Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.
    Type: Grant
    Filed: September 18, 2013
    Date of Patent: October 7, 2014
    Assignee: Morphism LLC
    Inventor: James H. Stephens, Jr.
  • Publication number: 20140288939
    Abstract: An approach is provided for timing application information presentation based on audio patterns. The audio platform processes and/or facilitates a processing of one or more audio samples to determine a conversational state of one or more users. Next, the audio platform determines a timing for at least one presentation of application information on a device associated with at least one of the one or more users based, at least in part, on the conversational state.
    Type: Application
    Filed: March 20, 2013
    Publication date: September 25, 2014
    Applicant: NAVTEQ B.V.
    Inventors: Jerome Beaurepaire, Philippe Beaurepaire
  • Patent number: 8843377
    Abstract: The present disclosure relates to foreign language instruction and translation methods. A system is provided which utilizes tonal and rhythm visualization components to allow a person to “see” their words as they attempt to speak a foreign language or a specific regional dialect. The system is also applicable to foreign language translation systems and allows students to improve their pronunciation by responding to visual feedback which incorporates both color and shape. The system may comprise a step-by-step instruction method, along with recording and playback features. Certain embodiments incorporate statistical analysis of student progress, remote access for teacher consultation, and video games for enhancing student interest.
    Type: Grant
    Filed: April 21, 2008
    Date of Patent: September 23, 2014
    Assignee: Master Key, LLC
    Inventor: Kenneth R. Lemons
  • Patent number: 8838454
    Abstract: A method of processing a call in a voice-command platform includes a step of transferring the call from the voice-command platform to a second voice-command platform. The method continues with the step of transmitting, either directly or indirectly, grammar information from the voice command platform to the second voice-command platform for use by a voice command application executing in the second voice-command platform in processing the call. The grammar information could be logic defining application-level grammar or system-level grammar. Alternatively, the grammar information could be a network address (e.g., URI or URL) where the grammar is stored in a file, e.g., a VXML document. The features of this invention enhance the user experience by preserving and using grammars used initially in the first voice command platform in other, downstream, voice command platforms.
    Type: Grant
    Filed: December 10, 2004
    Date of Patent: September 16, 2014
    Assignee: Sprint Spectrum L.P.
    Inventor: Balaji S. Thenthiruperai
  • Patent number: 8825478
    Abstract: Audio content is converted to text using speech recognition software. The text is then associated with a distinct voice or a generic placeholder label if no distinction can be made. From the text and voice information, a word cloud is generated based on key words and key speakers. A visualization of the cloud displays as it is being created. Words grow in size in relation to their dominance. When it is determined that the predominant words or speakers have changed, the word cloud is complete. That word cloud continues to be displayed statically and a new word cloud display begins based upon a new set of predominant words or a new predominant speaker or set of speakers. This process may continue until the meeting is concluded. At the end of the meeting, the completed visualization may be saved to a storage device, sent to selected individuals, removed, or any combination of the preceding.
    Type: Grant
    Filed: January 10, 2011
    Date of Patent: September 2, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Susan Marie Cox, Janani Janakiraman, Fang Lu, Loulwa F Salem
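Two mechanics from this abstract, words growing in size with their dominance, and a new cloud starting when the predominant words change, can be sketched numerically. The point sizes and the 50% overlap threshold are arbitrary illustration values, not figures from the patent.

```python
from collections import Counter

def cloud_sizes(words, min_pt=12, max_pt=48):
    """Scale each word's font size linearly with its frequency (dominance)."""
    counts = Counter(words)
    top = max(counts.values())
    return {w: min_pt + (max_pt - min_pt) * c / top for w, c in counts.items()}

def topic_shifted(old_cloud, new_words, overlap=0.5):
    """Signal that the current cloud is complete (and a new one should
    begin) when the predominant words have mostly changed."""
    if not old_cloud:
        return False
    shared = set(old_cloud) & set(new_words)
    return len(shared) / len(old_cloud) < overlap
```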
  • Patent number: 8825484
    Abstract: A character input apparatus which makes it possible to suppress degradation of user-friendliness in a case where a visually disabled user inputs characters using an auto-complete function. In the character string input apparatus, a character string to be input as a portion following a character string input by a user is predicted based on the character string input by the user, and the character string input by the user is completed using the predicted character string as a portion complementary thereto. In a voice guidance mode, information associated with a key selected by the user is read aloud by voice. When the voice guidance mode is enabled, the character string input apparatus disables the auto-complete function and performs control such that a character string cannot be automatically completed.
    Type: Grant
    Filed: September 23, 2011
    Date of Patent: September 2, 2014
    Assignee: Canon Kabushiki Kaisha
    Inventors: Masayuki Yamada, Masayuki Sato
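    The control logic described here, suppress auto-complete whenever voice guidance is active, is simple enough to sketch. The class name, dictionary-based prediction, and prefix matching below are hypothetical; the patent only specifies that completion is disabled in voice guidance mode.

    ```python
    class CharacterInput:
        def __init__(self, dictionary):
            self.dictionary = dictionary      # candidate completion words
            self.voice_guidance = False       # keys are read aloud when True

        def complete(self, prefix):
            """Return a predicted completion, unless voice guidance disables it."""
            if self.voice_guidance:
                return prefix  # auto-complete suppressed in voice guidance mode
            for word in self.dictionary:
                if word.startswith(prefix):
                    return word
            return prefix
    ```

    The rationale is that a screen-reader user hears each typed key confirmed; silently injected characters would desynchronize what is heard from what is on screen.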
  • Patent number: 8798999
    Abstract: A dialog design tool includes a dialog manager including a system prompt generator and a response generator which allows a dialog designer to generate at least one response; a dialog synthesizer for structurally managing the input and response; and an output and display unit for outputting and displaying at least one dialog structure. At least one system prompt and at least one response are included in each state, and a linking unit may link a first state to a second state related to the first state, link the second state to a third state, and so on until certain system actions can be achieved. A dialog synthesizer includes a loop detecting unit which detects and identifies loops in the dialog structure. Thus, the dialog design tool facilitates the creation of natural language dialogs by creating data structures for voice user interfaces.
    Type: Grant
    Filed: November 1, 2011
    Date of Patent: August 5, 2014
    Assignee: Alpine Electronics, Inc.
    Inventors: Inci Ozkaragoz, Yan Wang, Benjamin Ao
  • Patent number: 8798995
    Abstract: Topics of potential interest to a user, useful for purposes such as targeted advertising and product recommendations, can be extracted from voice content produced by a user. A computing device can capture voice content, such as when a user speaks into or near the device. One or more sniffer algorithms or processes can attempt to identify trigger words in the voice content, which can indicate a level of interest of the user. For each identified potential trigger word, the device can capture adjacent audio that can be analyzed, on the device or remotely, to attempt to determine one or more keywords associated with that trigger word. The identified keywords can be stored and/or transmitted to an appropriate location accessible to entities such as advertisers or content providers who can use the keywords to attempt to select or customize content that is likely relevant to the user.
    Type: Grant
    Filed: September 23, 2011
    Date of Patent: August 5, 2014
    Assignee: Amazon Technologies, Inc.
    Inventor: Kiran K. Edara
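    The sniffer-and-adjacent-audio idea can be illustrated on already-transcribed tokens. The trigger list and the fixed context window below are assumptions for the sketch; the patent leaves both the trigger detection and the keyword analysis unspecified.

    ```python
    # Hypothetical interest-indicating trigger words.
    TRIGGERS = {"love", "want", "need", "buy"}

    def extract_keywords(tokens, window=3):
        """For each trigger word, collect nearby words as candidate keywords."""
        keywords = []
        for i, tok in enumerate(tokens):
            if tok.lower() in TRIGGERS:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                keywords.extend(t for j, t in enumerate(tokens[lo:hi], lo)
                                if j != i and t.lower() not in TRIGGERS)
        return keywords
    ```

    In the patented system the analysis may instead run on captured audio, locally or remotely, before any keywords are stored or transmitted.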
  • Patent number: 8793129
    Abstract: It is an object of the present invention to make the act of viewing an image interactive and more enriching. A microphone 18 inputs a voice signal of a voice uttered by a viewer who is viewing a display image on a display portion 17, and causes the voice signal to be stored in a buffer 19. A voice recognition portion 20 identifies at least one word from the viewer's voice based on the voice signal and acquires it as a keyword. A counter 21 calculates the number of incidences of the keyword. A display driver 16 causes information including any keyword whose number of incidences exceeds a threshold value, or information derived from that keyword, to be displayed together with the display image on the display portion 17.
    Type: Grant
    Filed: September 23, 2010
    Date of Patent: July 29, 2014
    Assignee: Casio Computer Co., Ltd.
    Inventors: Tetsuya Handa, Kimiyasu Mizuno, Takehiro Aibara, Hitoshi Amagai, Naotaka Uehara, Takayuki Kogane, Sumito Shinohara, Masato Nunokawa
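    The counter-and-threshold mechanism (element 21 feeding display driver 16) maps directly onto a small sketch. The class and threshold value are illustrative assumptions; only the count-exceeds-threshold rule comes from the abstract.

    ```python
    from collections import Counter

    class KeywordOverlay:
        def __init__(self, threshold=2):
            self.counts = Counter()     # incidences per recognized keyword
            self.threshold = threshold

        def hear(self, word):
            """Record one incidence of a recognized keyword."""
            self.counts[word] += 1

        def overlay_words(self):
            """Words spoken often enough to be displayed over the image."""
            return [w for w, n in self.counts.items() if n > self.threshold]
    ```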
  • Publication number: 20140201634
    Abstract: A computer-implemented system and method of determining a color palette to be associated with an audio selection is presented. The system and method includes receiving an audio selection, sending data associated with the audio selection to a music information server and receiving information about the audio selection, sending the information to a color palette server that selects one or more color palettes based on the information, receiving the color palettes from the color palette server, associating one of the color palettes with the audio selection, and sending the associated color palette and audio selection to one or more third parties using a mode of social media.
    Type: Application
    Filed: January 16, 2014
    Publication date: July 17, 2014
    Applicant: MARCUS THOMAS LLC
    Inventors: King Hill, Mark Bachmann, Jamie Venorsky, Jason Hutchison, Kevin Delsanter, Carolyn Fertig, Kara Gildone, Brian Klausner, Scott Chapin
  • Publication number: 20140191976
    Abstract: Various embodiments provide an interactive, shared, story-reading experience in which stories can be experienced from remote locations. Various embodiments enable augmentation or modification of audio and/or video associated with the story-reading experience. This can include augmentation and modification of a reader's voice, face, and/or other content associated with the story as the story is read.
    Type: Application
    Filed: January 7, 2013
    Publication date: July 10, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Alan W. Peevers, John C. Tang, Nizamettin Gok, Gina Danielle Venolia, Kori Inkpen Quinn, Nitin Khanna, Simon Andrew Longbottom, Kurt A. Thywissen
  • Patent number: 8775180
    Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script. In still further aspects of the invention, the duration of a given interaction can be analyzed, either apart from or in combination with the script compliance analysis above, to identify instances of agent non-compliance, fraud, or quality-analysis issues.
    Type: Grant
    Filed: November 26, 2012
    Date of Patent: July 8, 2014
    Assignee: West Corporation
    Inventors: Mark J. Pettay, Fonda J. Narke
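    One simple way to quantify "adequately followed the script", not specified in the abstract, is a coverage ratio over required phrases in the ASR transcript. The function and the substring-matching criterion below are hypothetical.

    ```python
    def script_coverage(script_phrases, transcript):
        """Fraction of required script phrases found in the ASR transcript."""
        text = transcript.lower()
        hits = sum(1 for p in script_phrases if p.lower() in text)
        return hits / len(script_phrases)
    ```

    A production system would likely use fuzzy matching to tolerate recognition errors, and could flag calls whose coverage or duration falls outside expected bounds.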
  • Patent number: 8768703
    Abstract: Methods and apparatus to present a video program to a visually impaired person are disclosed. An example method comprises detecting a text portion of a media stream including a video stream, the text portion not being consumable by a blind person, retrieving text associated with the text portion of the media stream, and converting the text to a first audio stream based on a first type of a first program in the media stream, and converting the text to a second audio stream based on a second type of a second program in the media stream.
    Type: Grant
    Filed: July 19, 2012
    Date of Patent: July 1, 2014
    Assignee: AT&T Intellectual Property, I, L.P.
    Inventors: Hisao M. Chang, Horst Schroeter
  • Publication number: 20140172432
    Abstract: A transmissive display device includes an image display section adapted to generate image light representing an image, allow a user to visually recognize the image light, and transmit an external sight, a sound acquisition section adapted to obtain a sound, a conversion section adapted to convert the sound into a character image expressing the sound as an image using characters, a specific direction setting section adapted to set a specific direction, and a display position setting section adapted to set an image display position, which is a position where character image light representing the character image is made to be visually recognized in a visual field of the user, based on the specific direction.
    Type: Application
    Filed: December 9, 2013
    Publication date: June 19, 2014
    Applicant: Seiko Epson Corporation
    Inventor: Kaori Sendai
  • Publication number: 20140172425
    Abstract: A system and method of creating a customized multi-media message for a recipient is disclosed. The multi-media message is created by a sender and contains an animated entity that delivers an audible message. The sender chooses the animated entity from a plurality of animated entities. The system receives a text message from the sender and receives a sender audio message associated with the text message. The sender audio message is associated with the chosen animated entity to create the multi-media message. The multi-media message is delivered by the animated entity using the sender audio message as its voice, wherein the mouth movements of the animated entity conform to the sender audio message.
    Type: Application
    Filed: August 27, 2013
    Publication date: June 19, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Joern OSTERMANN, Mehmet Reha CIVANLAR, Barbara Buda, Claudio Lande
  • Patent number: 8756057
    Abstract: A speech analysis system and method for analyzing speech. The system includes: a voice recognition system for converting inputted speech to text; an analytics system for generating feedback information by analyzing the inputted speech and text; and a feedback system for outputting the feedback information.
    Type: Grant
    Filed: November 2, 2005
    Date of Patent: June 17, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Steven Michael Miller, Anne R. Sand
  • Patent number: 8751228
    Abstract: Embodiments of an audio-to-video engine are disclosed. In operation, the audio-to-video engine generates facial movement (e.g., a virtual talking head) based on an input speech. The audio-to-video engine receives the input speech and represents it as a source feature vector. The audio-to-video engine then determines a Maximum A Posteriori (MAP) mixture sequence based on the source feature vector. The MAP mixture sequence may be a function of a refined Gaussian Mixture Model (GMM). The audio-to-video engine may then use the MAP mixture sequence to estimate video feature parameters. The video feature parameters are then interpreted as facial movement. The facial movement may be stored as data to a storage module and/or displayed as video on a display device.
    Type: Grant
    Filed: November 4, 2010
    Date of Patent: June 10, 2014
    Assignee: Microsoft Corporation
    Inventors: Lijuan Wang, Frank Kao-Ping Soong
  • Patent number: 8744856
    Abstract: A computer implemented method, system and computer program product for evaluating pronunciation. Known phonemes are stored in a computer memory. A spoken utterance corresponding to a target utterance, comprised of a sequence of target phonemes, is received and stored in a computer memory. The spoken utterance is segmented into a sequence of spoken phonemes, each corresponding to a target phoneme. For each spoken phoneme, a relative posterior probability is calculated that the spoken phoneme is the corresponding target phoneme. If the calculated probability is greater than a first threshold, an indication that the target phoneme was pronounced correctly is output. If the probability is less than the first threshold but greater than a second, lower threshold, an indication that pronunciation of the target phoneme was acceptable is output; if it is below both thresholds, an indication that the target phoneme was pronounced incorrectly is output.
    Type: Grant
    Filed: February 21, 2012
    Date of Patent: June 3, 2014
    Assignee: Carnegie Speech Company
    Inventor: Mosur K. Ravishankar
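    The two-threshold decision on the relative posterior probability can be sketched directly. The threshold values are illustrative assumptions; the patent defines only the ordering of the decisions, not the numbers.

    ```python
    def judge_phoneme(posterior, t_correct=0.6, t_acceptable=0.3):
        """Classify a spoken phoneme by its relative posterior probability."""
        if posterior >= t_correct:
            return "correct"
        if posterior >= t_acceptable:
            return "acceptable"
        return "incorrect"
    ```

    Running this per segmented phoneme yields per-phoneme feedback for the learner rather than a single utterance-level score.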
  • Publication number: 20140142954
    Abstract: A soundtrack creation method and user playback system for soundtracks synchronized to electronic text. Synchronization is achieved by maintaining a reading speed variable indicative of the user's reading speed. The system provides for multiple channels of audio to enable concurrent playback of two or more partially or entirely overlapping audio regions so as to create an audio output having, for example, sound effects, ambience, music or other audio features that are triggered to playback at specific portions in the electronic text to enhance the reading experience.
    Type: Application
    Filed: January 28, 2014
    Publication date: May 22, 2014
    Applicant: BOOKTRACK HOLDINGS LIMITED
    Inventors: PAUL CHARLES CAMERON, MARK STEVEN CAMERON, RUI ZHANG, ANDREW RUSSELL DAVENPORT, PAUL ANTHONY MCGRATH
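    Synchronizing audio to a reading-speed variable reduces to converting a word position into a playback time. The function below is a hypothetical minimal sketch assuming a constant words-per-minute estimate; the patented system maintains and updates the reading-speed variable as the user reads.

    ```python
    def trigger_time(word_index, words_per_minute):
        """Seconds after reading starts when audio anchored at word_index fires."""
        return word_index / (words_per_minute / 60.0)
    ```

    A scheduler can then start each audio region, possibly overlapping on separate channels, when elapsed reading time reaches its trigger time.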
  • Patent number: 8731943
    Abstract: Systems, methods and computer program products are provided for translating a natural language into music. Through systematic parsing, music compositions can be created. These compositions can be created by one or more persons who do not speak the same natural language.
    Type: Grant
    Filed: February 5, 2010
    Date of Patent: May 20, 2014
    Assignee: Little Wing World LLC
    Inventors: Nicolle Ruetz, David Warhol
  • Patent number: 8725518
    Abstract: A system for providing automatic quality management regarding a level of conformity to a specific accent, including a recording system, a statistical model database with statistical models representing speech data of different levels of conformity to a specific accent, a speech analysis system, and a quality management system. The recording system is adapted to record one or more samples of a speaker's speech and provide them to the speech analysis system for analysis, and the speech analysis system is adapted to provide a score for the speaker's speech samples to the quality management system by analyzing the recorded samples relative to the statistical models in the statistical model database.
    Type: Grant
    Filed: April 25, 2006
    Date of Patent: May 13, 2014
    Assignee: Nice Systems Ltd.
    Inventors: Moshe Waserblat, Barak Eilam
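    Scoring a sample against per-level statistical models amounts to picking the best-fitting model. In this hypothetical sketch the models are represented as scoring callables (e.g., log-likelihood functions); the patent does not specify the model form or scoring rule.

    ```python
    def accent_score(sample_features, models):
        """Return the conformity level whose model best fits the sample.

        models: dict mapping level name -> callable(features) -> log-likelihood.
        """
        return max(models, key=lambda level: models[level](sample_features))
    ```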