Pattern Display Patents (Class 704/276)
  • Publication number: 20140129235
    Abstract: Apparatus comprising a receiver configured to receive a first audio signal, a signal characteriser configured to determine at least one characteristic associated with the first audio signal, a comparator configured to compare the at least one characteristic against at least one characteristic associated with at least one further audio signal, and a display configured to display the at least one characteristic associated with at least one further audio signal dependent on the first audio signal characteristic.
    Type: Application
    Filed: June 17, 2011
    Publication date: May 8, 2014
    Applicant: Nokia Corporation
    Inventor: Mikko Veli Aimo Suvanto
  • Patent number: 8719038
    Abstract: Computerized apparatus for obtaining and displaying information, such as, for example, directions to a desired entity or organization. In one embodiment, the computerized apparatus is configured to receive user speech input and enable performance of various tasks, such as obtaining desired information relating to indoor entities, maps or directions, or any number of other topics. The obtained data may also, in various variants, be displayed in various formats and relative to other entities nearby.
    Type: Grant
    Filed: January 28, 2013
    Date of Patent: May 6, 2014
    Assignee: West View Research, LLC
    Inventor: Robert F. Gazdzinski
  • Patent number: 8719032
    Abstract: A clear picture of who is speaking in a setting where there are multiple input sources (e.g., a conference room with multiple microphones) can be obtained by comparing input channels against each other. The data from each channel can not only be compared, but can also be organized into portions which logically correspond to statements by a user. These statements, along with information regarding who is speaking, can be presented in a user friendly format via an interactive timeline which can be updated in real time as new audio input data is received.
    Type: Grant
    Filed: December 11, 2013
    Date of Patent: May 6, 2014
    Assignee: Jefferson Audio Video Systems, Inc.
    Inventors: Matthew David Bader, Nathan David Cole
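The channel-comparison idea in the abstract above can be sketched as follows. This is an illustrative reconstruction, not the patented method; the frame length and the RMS-energy measure are assumptions.

```python
# Sketch: attribute each audio frame to the microphone channel with the
# highest energy, then merge consecutive frames won by the same channel
# into "statements" suitable for a timeline view.

def rms(frame):
    """Root-mean-square energy of one frame (a list of samples)."""
    return (sum(s * s for s in frame) / len(frame)) ** 0.5

def diarize(channels, frame_seconds=0.5):
    """channels: {speaker_id: [frame, frame, ...]}, frames time-aligned.
    Returns [(speaker_id, start_sec, end_sec), ...]."""
    ids = list(channels)
    n_frames = len(channels[ids[0]])
    # Pick the loudest channel for each frame.
    winners = [max(ids, key=lambda c: rms(channels[c][i]))
               for i in range(n_frames)]
    # Merge runs of the same winner into statements.
    statements = []
    for i, who in enumerate(winners):
        if statements and statements[-1][0] == who:
            statements[-1][2] = (i + 1) * frame_seconds
        else:
            statements.append([who, i * frame_seconds, (i + 1) * frame_seconds])
    return [tuple(s) for s in statements]
```

New frames can be appended and re-merged as they arrive, matching the abstract's real-time timeline updates.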
  • Patent number: 8706494
    Abstract: Methods and systems for providing a network-accessible text-to-speech synthesis service are provided. The service accepts content as input. After extracting textual content from the input content, the service transforms the content into a format suitable for high-quality speech synthesis. Additionally, the service produces audible advertisements, which are combined with the synthesized speech. The audible advertisements themselves can be generated from textual advertisement content.
    Type: Grant
    Filed: August 29, 2011
    Date of Patent: April 22, 2014
    Assignee: Aeromee Development L.L.C.
    Inventor: James H. Stephens, Jr.
  • Patent number: 8707381
    Abstract: A synchronization process between captioning data and/or corresponding metatags and the associated media file parses the media file, correlates the caption information and/or metatags with segments of the media file, and provides a capability for textual search and selection of particular segments. A time-synchronized version of the captions is created that is synchronized to the moment that the speech is uttered in the recorded media. The caption data is leveraged to enable search engines to index not merely the title of a video, but the entirety of what was said during the video as well as any associated metatags relating to contents of the video. Further, because the entire media file is indexed, a search can request a particular scene or occurrence within the event recorded by the media file, and the exact moment within the media relevant to the search can be accessed and played for the requester.
    Type: Grant
    Filed: September 21, 2010
    Date of Patent: April 22, 2014
    Assignee: Caption Colorado L.L.C.
    Inventors: Richard T. Polumbus, Michael W. Homyack
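The claim that time-synchronized captions let a search jump to the exact moment in the media can be illustrated with a minimal inverted index; the cue format below is an assumption for the example.

```python
# Sketch: build an inverted index from time-stamped caption cues so a
# text search returns the playback positions where a word was spoken.

from collections import defaultdict

def build_index(cues):
    """cues: [(start_seconds, caption_text), ...] -> {word: [seconds, ...]}"""
    index = defaultdict(list)
    for start, text in cues:
        for word in text.lower().split():
            index[word.strip(".,!?")].append(start)
    return index

def search(index, word):
    """Return playback positions where `word` was spoken."""
    return index.get(word.lower(), [])
```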
  • Patent number: 8706485
    Abstract: The present invention pertains to a method and a communication device (100) for associating a contact record pertaining to a remote speaker (220) with a mnemonic image (191) based on attributes of the speaker (220). The method comprises receiving voice data of the speaker (220) in a communication session with a source device (200). A source determination representing the speaker (220) is registered, and the received voice data is then analyzed so that voice data characteristics can be extracted. Based on these voice data characteristics a mnemonic image (191) can be selected and associated with a contact record in which the source determination is stored. The mnemonic image (191) may be selected among images previously stored in the device, or derived through editing of such images.
    Type: Grant
    Filed: May 17, 2011
    Date of Patent: April 22, 2014
    Assignees: Sony Corporation, Sony Mobile Communications AB
    Inventor: Joakim Martensson
  • Patent number: 8706495
    Abstract: A speech recognition device (1) processes speech data (SD) of a dictation and thus establishes recognized text information (ETI) and link information (LI) of the dictation. In a synchronous playback mode of the speech recognition device (1), during the acoustic playback of the dictation a correction device (10) synchronously marks the word of the recognized text information (ETI) that relates, via the link information (LI), to the speech data (SD) just played back; the currently marked word indicates the position of an audio cursor (AC). When a user of the speech recognition device (1) recognizes an incorrect word, he positions a text cursor (TC) at the incorrect word and corrects it.
    Type: Grant
    Filed: January 17, 2013
    Date of Patent: April 22, 2014
    Assignee: Nuance Communications, Inc.
    Inventor: Wolfgang Gschwendtner
  • Patent number: 8700403
    Abstract: A method of statistical modeling is provided which includes constructing a statistical model and incorporating Gaussian priors during feature selection and during parameter optimization in the construction of the statistical model.
    Type: Grant
    Filed: November 3, 2005
    Date of Patent: April 15, 2014
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Lin Zhao
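A zero-mean Gaussian prior on model parameters corresponds to an L2 penalty added to the training objective. As a hedged illustration (logistic regression is chosen here for concreteness; the abstract does not specify the model), the penalized loss and one gradient step look like this:

```python
import math

def penalized_loss(w, xs, ys, sigma2=1.0):
    """Negative log-likelihood of logistic regression plus the Gaussian
    prior term sum(w_i^2) / (2 * sigma2)."""
    nll = 0.0
    for x, y in zip(xs, ys):
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))
        nll -= y * math.log(p) + (1 - y) * math.log(1 - p)
    prior = sum(wi * wi for wi in w) / (2.0 * sigma2)
    return nll + prior

def gradient_step(w, xs, ys, lr=0.1, sigma2=1.0):
    """One gradient-descent step on the penalized loss."""
    grad = [wi / sigma2 for wi in w]          # gradient of the prior term
    for x, y in zip(xs, ys):
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))
        for i, xi in enumerate(x):
            grad[i] += (p - y) * xi           # gradient of the NLL term
    return [wi - lr * g for wi, g in zip(w, grad)]
```

The prior shrinks weights toward zero, which is the regularizing effect the abstract's "Gaussian priors during parameter optimization" refers to.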
  • Patent number: 8682672
    Abstract: A system and method is described that permits synchronization of a transcript with an audio/video stream of a webcast. The system also permits a user to perform a search of the transcript and then to jump in the webcast audio/video stream to the point identified during the search.
    Type: Grant
    Filed: September 17, 2004
    Date of Patent: March 25, 2014
    Assignee: ON24, Inc.
    Inventors: Tommy Ha, Kamalaksha Ghosh
  • Patent number: 8674996
    Abstract: A system for controlling a rendering engine by using specialized commands. The commands are used to generate a production, such as a television show, at an end-user's computer that executes the rendering engine. In one embodiment, the commands are sent over a network, such as the Internet, to achieve broadcasts of video programs at very high compression and efficiency. Commands for setting and moving camera viewpoints, animating characters, and defining or controlling scenes and sounds are described. At a fine level of control, math models and coordinate systems can be used to make specifications. At a coarse level of control, the command language approaches the text format traditionally used in television or movie scripts. Simple names for objects within a scene are used to identify items, directions and paths. Commands are further simplified by having the rendering engine use defaults when specifications are left out.
    Type: Grant
    Filed: October 27, 2008
    Date of Patent: March 18, 2014
    Assignee: Quonsil PL. 3, LLC
    Inventor: Charles J. Kulas
  • Patent number: 8676590
    Abstract: A computer-implemented technique for transcribing audio data includes generating, along a vertical axis on a display of a client device, an image representing audio content. The technique further includes receiving, from a user of the client device, a selection of a portion of the image; and generating, via an audio module of the client device, an audio output corresponding to the selected portion of the image. The technique further includes receiving, from the user, a selection indicating a position along the vertical axis on the display to enter a text portion representing the audio output, wherein the position is aligned to the selected portion of the image. The technique further includes receiving, from the user, the text portion representing the audio output; and displaying, on the display, the text portion at the position, wherein the text portion extends along a horizontal axis on the display.
    Type: Grant
    Filed: September 26, 2012
    Date of Patent: March 18, 2014
    Assignee: Google Inc.
    Inventors: Jeffrey Scott Sorensen, Masayuki Nanzawa, Ravindran Rajakumar
  • Patent number: 8666749
    Abstract: The disclosure includes a system and method for generating audio snippets from a subset of audio tracks. In some embodiments an audio snippet is an audio summary of a group or collection of songs.
    Type: Grant
    Filed: January 17, 2013
    Date of Patent: March 4, 2014
    Assignee: Google Inc.
    Inventors: Amarnag Subramanya, Jennifer Gillenwater, Garth Griffin, Fernando Pereira, Douglas Eck
  • Patent number: 8655667
    Abstract: A software and/or hardware facility for inferring user context and delivering advertisements, such as coupons, using natural language and/or sentiment analysis is disclosed. The facility may infer context information based on a user's emotional state, attitude, needs, or intent from the user's interaction with or through a mobile device. The facility may then determine whether it is appropriate to deliver an advertisement to the user and select an advertisement for delivery. The facility may also determine an appropriate expiration time and/or discount amount for the advertisement.
    Type: Grant
    Filed: November 19, 2012
    Date of Patent: February 18, 2014
    Assignee: Microsoft Corporation
    Inventors: Raman Chandrasekar, Eric I-Chao Chang, Michael Tsang, Tian Bai
  • Patent number: 8655662
    Abstract: Disclosed herein are systems, methods, and computer-readable media for answering a communication notification. The method for answering a communication notification comprises receiving a notification of communication from a user, converting information related to the notification to speech, outputting the information as speech to the user, and receiving from the user an instruction to accept or ignore the incoming communication associated with the notification. In one embodiment, information related to the notification comprises one or more of a telephone number, an area code, a geographic origin of the request, caller ID, a voice message, address book information, a text message, an email, a subject line, an importance level, a photograph, a video clip, metadata, an IP address, or a domain name. Another embodiment involves assigning the notification an importance level and making repeat notification attempts if it is of high importance.
    Type: Grant
    Filed: November 29, 2012
    Date of Patent: February 18, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst Schroeter
  • Patent number: 8639512
    Abstract: A computer-implemented system and method for evaluating the performance of a user using a dictation system is provided. The system and method include receiving a text or transcription file generated from user audio. A performance metric, such as words per minute or error count, is generated based on the transcription file. The performance metric is provided to an administrator so the administrator can evaluate the performance of the user using the dictation system.
    Type: Grant
    Filed: April 21, 2009
    Date of Patent: January 28, 2014
    Assignee: nVoq Incorporated
    Inventors: Brian Marquette, Charles Corfield, Todd Espy
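The metrics named in the abstract are straightforward to compute. A minimal sketch, assuming the reference text is available and counting errors as word-level edit distance (the abstract does not specify how errors are measured):

```python
# Illustrative sketch: words-per-minute and a word-level error count
# derived from a transcript and the reference text being dictated.

def words_per_minute(transcript, duration_seconds):
    return len(transcript.split()) / (duration_seconds / 60.0)

def error_count(reference, transcript):
    """Word-level edit distance (substitutions + insertions + deletions)."""
    ref, hyp = reference.split(), transcript.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution/match
    return d[len(ref)][len(hyp)]
```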
  • Patent number: 8634708
    Abstract: The invention relates to a method for creating a new roundup of an audiovisual document previously recorded in a device. The document contains two parts, one being the roundup and the other composed of a plurality of reports. The roundup is itself divided into a plurality of parts. The device first searches for the associations between the roundup parts and the reports, and detects the reports that are not associated with roundup parts. Then, summaries are created for the reports not associated with the roundup, and incorporated into the initial roundup to create a new roundup. In this manner, the user can easily select any report from the roundup part associated with this report. The invention also relates to the receiver suitable for implementing the method.
    Type: Grant
    Filed: December 20, 2007
    Date of Patent: January 21, 2014
    Assignee: Thomson Licensing
    Inventors: Louis Chevallier, Claire-Helene Demarty, Lionel Oisel
  • Patent number: 8635075
    Abstract: A system is configured to enable a user to assert voice-activated commands. When the user issues a non-ambiguous command, the system activates a corresponding control. The area of activity on the user interface is visually highlighted to emphasize to the user that what they spoke caused an action. In one specific embodiment, the highlighting involves floating the text the user uttered to a visible user interface component.
    Type: Grant
    Filed: October 12, 2009
    Date of Patent: January 21, 2014
    Assignee: Microsoft Corporation
    Inventor: Felix Andrew
  • Patent number: 8630860
    Abstract: Techniques disclosed herein include systems and methods for open-domain voice-enabled searching that is speaker sensitive. Techniques include using speech information, speaker information, and information associated with a spoken query to enhance open voice search results. This includes integrating a textual index with a voice index to support the entire search cycle. Given a voice query, the system can execute two matching processes simultaneously. This can include a text matching process based on the output of speech recognition, as well as a voice matching process based on characteristics of a caller or user voicing a query. Characteristics of the caller can include output of voice feature extraction and metadata about the call. The system clusters callers according to these characteristics. The system can use specific voice and text clusters to modify speech recognition results, as well as modifying search results.
    Type: Grant
    Filed: March 3, 2011
    Date of Patent: January 14, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Shilei Zhang, Shenghua Bao, Wen Liu, Yong Qin, Zhiwei Shuang, Jian Chen, Zhong Su, Qin Shi, William F. Ganong, III
  • Publication number: 20140012586
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user, evaluating the speech data or a transcription of the candidate hotword using one or more predetermined criteria, generating a hotword suitability score for the candidate hotword based on that evaluation, and providing a representation of the hotword suitability score for display to the user.
    Type: Application
    Filed: August 6, 2012
    Publication date: January 9, 2014
    Applicant: GOOGLE INC.
    Inventors: Andrew E. Rubin, Johan Schalkwyk, Maria Carolina Parada San Martin
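The abstract does not disclose the predetermined criteria, so the scoring rules below (length, syllable count, collision with common words) are invented purely for illustration of how such a suitability score could be assembled.

```python
# Hypothetical hotword-suitability sketch; every criterion and weight
# here is an assumption, not Google's method.

COMMON_WORDS = {"ok", "hello", "yes", "no", "stop"}
VOWELS = set("aeiou")

def syllables(word):
    """Crude syllable estimate: count groups of consecutive vowels."""
    count, prev_vowel = 0, False
    for ch in word.lower():
        is_vowel = ch in VOWELS
        if is_vowel and not prev_vowel:
            count += 1
        prev_vowel = is_vowel
    return max(count, 1)

def hotword_score(candidate):
    """Return a 0..1 suitability score for a candidate hotword."""
    word = candidate.lower().strip()
    score = 0.0
    if 6 <= len(word) <= 16:        # long enough to be distinctive
        score += 0.4
    if syllables(word) >= 3:        # multi-syllable words misfire less
        score += 0.4
    if word not in COMMON_WORDS:    # avoid everyday words
        score += 0.2
    return score
```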
  • Patent number: 8626493
    Abstract: Sounds are inserted into audio content according to a pattern. A library stores humanly perceptible voice sounds. Pattern control information is received that is associated with a device recording the audio content. A pattern is retrieved, and the humanly perceptible voice sounds are inserted into the audio content according to the pattern to generate a signed audio recording.
    Type: Grant
    Filed: April 26, 2013
    Date of Patent: January 7, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Steven N. Tischer
  • Patent number: 8612228
    Abstract: A section corresponding to a given duration is sampled from sound data that indicates the voice of a player collected by a microphone, and a vocal tract cross-sectional area function of the sampled section is calculated. The vertical dimension of the mouth is calculated from a throat-side average cross-sectional area of the vocal tract cross-sectional area function, and the area of the mouth is calculated from a mouth-side average cross-sectional area. The transverse dimension of the mouth is calculated from the area of the mouth and the vertical dimension of the mouth.
    Type: Grant
    Filed: March 26, 2010
    Date of Patent: December 17, 2013
    Assignee: Namco Bandai Games Inc.
    Inventor: Hiroyuki Hiraishi
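The geometry below is a guess at how such a mapping could work: it assumes an elliptical mouth opening (area = π·(h/2)·(w/2)) and an invented scale factor from the throat-side average area to the mouth's vertical dimension. It is not the patented formula.

```python
import math

def mouth_dimensions(area_function, height_scale=2.0):
    """area_function: vocal tract cross-sectional areas in cm^2, ordered
    from the glottis (throat side) to the lips (mouth side).
    Returns (height_cm, width_cm).  The height mapping is invented."""
    half = len(area_function) // 2
    throat_avg = sum(area_function[:half]) / half
    mouth_avg = sum(area_function[half:]) / (len(area_function) - half)
    height = height_scale * math.sqrt(throat_avg)   # assumed mapping
    # Solve ellipse area = pi * (h/2) * (w/2) for the width w.
    width = 4.0 * mouth_avg / (math.pi * height)
    return height, width
```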
  • Patent number: 8600763
    Abstract: Whenever an event occurs on a computing system which will accept a response from a user of the system, the system automatically determines whether or not to enable speech interaction with the system for the event response. Whenever speech interaction is enabled with the system for the event response, the system provides a notification to the user which informs the user of the event and their options for responding thereto, where these options include responding verbally. Whenever the user responds within a prescribed period of time via a voice command (VC), the system attempts to recognize the VC. Whenever the VC is successfully recognized, the system responds appropriately to the VC.
    Type: Grant
    Filed: June 4, 2010
    Date of Patent: December 3, 2013
    Assignee: Microsoft Corporation
    Inventors: Alice Jane Bernheim Brush, Paul Johns, Jen Anderson, Connie Missimer, Seung Yang, Jean Ku
  • Patent number: 8588378
    Abstract: A computer-implemented voice mail method includes obtaining an electronic audio file of a recorded user message directed to a telephone user, automatically generating a transcript of the recorded user message, and identifying locations in the transcript in coordination with timestamps in the recorded user message so that successive portions of the transcript can be highlighted in coordination with playing of the recorded user message. The method also includes identifying one or more characteristics of the message using metadata relating to the recorded user message, and storing the recorded user message and information about the identified locations of the recorded user message.
    Type: Grant
    Filed: July 15, 2010
    Date of Patent: November 19, 2013
    Assignee: Google Inc.
    Inventors: Benedict Davies, Christian Brunschen
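Highlighting transcript portions in coordination with playback reduces to a timestamp lookup; a minimal sketch using binary search (the storage format of the word timings is an assumption):

```python
# Sketch: given per-word start times stored alongside the transcript,
# find which word to highlight at any playback position.

import bisect

def word_at(timestamps, words, position_seconds):
    """timestamps[i] is the second at which words[i] starts (sorted)."""
    i = bisect.bisect_right(timestamps, position_seconds) - 1
    return words[max(i, 0)]
```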
  • Patent number: 8583434
    Abstract: Computer-implemented methods and apparatus are provided to facilitate the recognition of the content of a body of speech data. In one embodiment, a method for analyzing verbal communication is provided, comprising acts of producing an electronic recording of a plurality of spoken words; processing the electronic recording to identify a plurality of word alternatives for each of the spoken words, each of the plurality of word alternatives being identified by comparing a portion of the electronic recording with a lexicon, and each of the plurality of word alternatives being assigned a probability of correctly identifying a spoken word; loading the word alternatives and the probabilities to a database for subsequent analysis; and examining the word alternatives and the probabilities to determine at least one characteristic of the plurality of spoken words.
    Type: Grant
    Filed: January 29, 2008
    Date of Patent: November 12, 2013
    Assignee: CallMiner, Inc.
    Inventor: Jeffrey A. Gallino
  • Patent number: 8560317
    Abstract: A vocabulary dictionary storing unit for storing a plurality of words in advance, a vocabulary dictionary managing unit for extracting recognition target words, a matching unit for calculating a degree of matching with the recognition target words based on an accepted voice, a result output unit for outputting, as a recognition result, a word having a best score from a result of calculating the degree of matching, and an extraction criterion information managing unit for changing extraction criterion information according to a result of monitoring by a monitor control unit are provided. The vocabulary dictionary storing unit further includes a scale information storing unit for storing scale information serving as a scale at the time of extracting the recognition target words, and an extraction criterion information storing unit for storing extraction criterion information indicating a criterion of the recognition target words at the time of extracting the recognition target words.
    Type: Grant
    Filed: September 18, 2006
    Date of Patent: October 15, 2013
    Assignee: Fujitsu Limited
    Inventor: Kenji Abe
  • Patent number: 8554566
    Abstract: Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.
    Type: Grant
    Filed: November 29, 2012
    Date of Patent: October 8, 2013
    Assignee: Morphism LLC
    Inventor: James H. Stephens, Jr.
  • Patent number: 8527279
    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving geographical information derived from a non-verbal user action associated with a first computing device. The non-verbal user action implies an interest of a user in a geographic location. The method also includes identifying a grammar associated with the geographic location using the derived geographical information and outputting a grammar indicator for use in selecting the identified grammar for voice recognition processing of vocal input from the user.
    Type: Grant
    Filed: August 23, 2012
    Date of Patent: September 3, 2013
    Assignee: Google Inc.
    Inventors: David P. Singleton, Debajit Ghosh
  • Publication number: 20130226593
    Abstract: An apparatus comprising: an audio source determiner configured to determine at least one audio source; a visualizer configured to generate a visual representation associated with the at least one audio source; and a controller configured to process an audio signal associated with the at least one audio source dependent on interaction with the visual representation.
    Type: Application
    Filed: November 12, 2010
    Publication date: August 29, 2013
    Applicant: Nokia Corporation
    Inventors: Birgir Magnusson, Koray Ozcan
  • Patent number: 8515753
    Abstract: The example embodiment of the present invention provides an acoustic model adaptation method for enhancing recognition performance for a non-native speaker's speech. In order to adapt acoustic models, first, pronunciation variations are examined by analyzing a non-native speaker's speech. Thereafter, based on variation pronunciation of a non-native speaker's speech, acoustic models are adapted in a state-tying step during a training process of acoustic models. When the present invention for adapting acoustic models and a conventional acoustic model adaptation scheme are combined, more-enhanced recognition performance can be obtained. The example embodiment of the present invention enhances recognition performance for a non-native speaker's speech while reducing the degradation of recognition performance for a native speaker's speech.
    Type: Grant
    Filed: March 30, 2007
    Date of Patent: August 20, 2013
    Assignee: Gwangju Institute of Science and Technology
    Inventors: Hong Kook Kim, Yoo Rhee Oh, Jae Sam Yoon
  • Patent number: 8494852
    Abstract: The subject matter of this specification can be implemented in, among other things, a computer-implemented method for correcting words in transcribed text including receiving speech audio data from a microphone. The method further includes sending the speech audio data to a transcription system. The method further includes receiving a word lattice transcribed from the speech audio data by the transcription system. The method further includes presenting one or more transcribed words from the word lattice. The method further includes receiving a user selection of at least one of the presented transcribed words. The method further includes presenting one or more alternate words from the word lattice for the selected transcribed word. The method further includes receiving a user selection of at least one of the alternate words. The method further includes replacing the selected transcribed word in the presented transcribed words with the selected alternate word.
    Type: Grant
    Filed: October 27, 2010
    Date of Patent: July 23, 2013
    Assignee: Google Inc.
    Inventors: Michael J. LeBeau, William J. Byrne, John Nicholas Jitkoff, Brandon M. Ballinger, Trausti Kristjansson
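The lattice-driven correction flow can be sketched with a simplified lattice (per-position alternatives with confidences rather than a true word graph); this is illustrative, not Google's implementation:

```python
# Sketch: show the best transcription, list ranked alternates for a
# selected word, and swap in the user's chosen replacement.

def best_path(lattice):
    """lattice: list of [(word, confidence), ...], one list per position."""
    return [max(alts, key=lambda a: a[1])[0] for alts in lattice]

def alternates(lattice, position):
    """All candidate words at `position`, best first."""
    return [w for w, _ in sorted(lattice[position], key=lambda a: -a[1])]

def replace_word(words, position, chosen):
    corrected = list(words)
    corrected[position] = chosen
    return corrected
```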
  • Patent number: 8494668
    Abstract: A character value of a sound signal is extracted for each unit portion, and degrees of similarity between the character values of the individual unit portions are calculated and arranged in a matrix. Each column of the matrix holds the degrees of similarity acquired by comparing, for each unit portion, the sound signal against a delayed copy of itself, the delay being an integral multiple of a unit portion's time length; the matrix has a plurality of such columns in association with different time differences. A repetition probability is calculated for each column, and a plurality of peaks in the distribution of repetition probabilities are identified. A loop region in the sound signal is then identified by collating a reference matrix with the degree-of-similarity matrix.
    Type: Grant
    Filed: February 19, 2009
    Date of Patent: July 23, 2013
    Assignee: Yamaha Corporation
    Inventors: Bee Suan Ong, Sebastian Streich, Takuya Fujishima, Keita Arimoto
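The lag-based similarity matrix can be sketched as follows: each column of the patent's matrix becomes the set of similarities at one time difference (lag), and a peak in mean similarity over lags suggests a repetition period. The choice of cosine similarity and of per-unit feature vectors is an assumption.

```python
# Sketch: mean self-similarity at each lag; the lag with the highest
# mean similarity is a candidate repetition (loop) period.

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
    return num / den if den else 0.0

def lag_similarity(features, max_lag):
    """features: one vector per unit portion.  Returns {lag: mean sim}."""
    out = {}
    for lag in range(1, max_lag + 1):
        pairs = [(features[i], features[i + lag])
                 for i in range(len(features) - lag)]
        out[lag] = sum(cosine(a, b) for a, b in pairs) / len(pairs)
    return out

def best_lag(features, max_lag):
    sims = lag_similarity(features, max_lag)
    return max(sims, key=sims.get)
```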
  • Patent number: 8484034
    Abstract: A first party creates and edits a phonetic-alphabet representation of its name. The phonetic representation is conveyed to a second party as “caller-identification” information by messages that set up a call between the parties. The phonetic representation of the name is displayed to the second party, converted to speech, and/or converted to an alphabet of a language of the second party and then displayed to the second party.
    Type: Grant
    Filed: March 31, 2008
    Date of Patent: July 9, 2013
    Assignee: Avaya Inc.
    Inventors: Paul Roller Michaelis, David Mohler, Charles Wrobel
  • Patent number: 8478590
    Abstract: The subject matter of this specification can be implemented in, among other things, a computer-implemented method for correcting words in transcribed text including receiving speech audio data from a microphone. The method further includes sending the speech audio data to a transcription system. The method further includes receiving a word lattice transcribed from the speech audio data by the transcription system. The method further includes presenting one or more transcribed words from the word lattice. The method further includes receiving a user selection of at least one of the presented transcribed words. The method further includes presenting one or more alternate words from the word lattice for the selected transcribed word. The method further includes receiving a user selection of at least one of the alternate words. The method further includes replacing the selected transcribed word in the presented transcribed words with the selected alternate word.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: July 2, 2013
    Assignee: Google Inc.
    Inventors: Michael J. LeBeau, William J. Byrne, John Nicholas Jitkoff, Brandon M. Ballinger, Trausti Kristjansson
  • Patent number: 8452604
    Abstract: Recognizable visual and/or audio artifacts, such as recognizable sounds, are introduced into visual and/or audio content in an identifying pattern to generate a signed visual and/or audio recording for distribution over a digital communications medium. A library of images and/or sounds may be provided, and the image and/or sounds from the library may be selectively inserted to generate the identifying pattern. The images and/or sounds may be inserted responsive to one or more parameters associated with creation of the visual and/or audio content. A representation of the identifying pattern may be generated and stored in a repository, e.g., an independent repository configured to maintain creative rights information. The stored pattern may be retrieved from the repository and compared to an unidentified visual and/or audio recording to determine an identity thereof.
    Type: Grant
    Filed: August 15, 2005
    Date of Patent: May 28, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Steven Tischer
  • Publication number: 20130124202
    Abstract: Provided in some embodiments is a method including receiving ordered script words indicative of dialogue words to be spoken, receiving audio data corresponding to at least a portion of the dialogue words to be spoken and including timecodes associated with dialogue words, generating a matrix of the ordered script words versus the dialogue words, aligning the matrix to determine hard alignment points that match consecutive sequences of ordered script words with corresponding sequences of dialogue words, partitioning the matrix of ordered script words into sub-matrices bounded by adjacent hard-alignment points and including the corresponding sub-sets of the script and dialogue words between the hard-alignment points, and aligning each of the sub-matrices.
    Type: Application
    Filed: May 28, 2010
    Publication date: May 16, 2013
    Inventor: Walter W. Chang
  • Publication number: 20130124212
    Abstract: A method includes receiving script data including script words for dialogue, receiving audio data corresponding to at least a portion of the dialogue, wherein the audio data includes timecodes associated with dialogue words, generating a sequential alignment of the script words to the dialogue words, matching at least some of the script words to corresponding dialogue words to determine hard alignment points, partitioning the sequential alignment of script words into alignment sub-sets, wherein the bounds of the alignment sub-sets are defined by adjacent hard-alignment points, and wherein the alignment sub-sets include a sub-set of the script words and a corresponding sub-set of dialogue words that occur between the hard-alignment points, determining corresponding timecodes for a sub-set of script words based on the timecodes associated with the corresponding sub-set of dialogue words, and generating time-aligned script data including the sub-set of script words and their corresponding timecodes.
    Type: Application
    Filed: May 28, 2010
    Publication date: May 16, 2013
    Inventors: Jerry R. Scoggins, II, Walter W. Chang, David A. Kuspa, Charles E. Van Winkle, Simon R. Hayhurst
  • Publication number: 20130124213
    Abstract: Provided in some embodiments is a computer implemented method that includes providing script data including script words indicative of dialogue words to be spoken, providing audio data corresponding to at least a portion of the dialogue words to be spoken, wherein the audio data includes timecodes associated with dialogue words, generating a sequential alignment of the script words to the dialogue words, matching at least some of the script words to corresponding dialogue words to determine alignment points, determining corresponding timecodes for unmatched script words using interpolation based on the timecodes associated with matching script words, and generating time-aligned script data including the script words and their corresponding timecodes.
    Type: Application
    Filed: May 28, 2010
    Publication date: May 16, 2013
    Inventors: Jerry R. Scoggins, II, Walter W. Chang
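The interpolation step in this family of abstracts (assigning timecodes to unmatched script words from the timecodes of matched neighbours) can be sketched as plain linear interpolation. Everything below, including the shape of the `matched` mapping, is an assumption for illustration, not the claimed method.

```python
def interpolate_timecodes(script_words, matched):
    """matched maps script-word index -> timecode (seconds) for words
    that were matched to recognized dialogue. Unmatched words between
    two matches get linearly interpolated timecodes; words before the
    first or after the last match are left as None."""
    times = [None] * len(script_words)
    for i, t in matched.items():
        times[i] = t
    anchors = sorted(matched)
    for lo, hi in zip(anchors, anchors[1:]):
        span = matched[hi] - matched[lo]
        for i in range(lo + 1, hi):
            times[i] = matched[lo] + span * (i - lo) / (hi - lo)
    return times
```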
  • Patent number: 8412531
    Abstract: The present invention provides a user interface for press-to-talk interaction via a touch-anywhere-to-speak module on a mobile computing device. Upon receiving an indication of a touch anywhere on the screen of a touch screen interface, the touch-anywhere-to-speak module activates the listening mechanism of a speech recognition module to accept audible user input and displays dynamic visual feedback of a measured sound level of the received audible input. The touch-anywhere-to-speak module may also provide a user with a convenient and more accurate speech recognition experience by applying data relative to the context of the touch (e.g., relative location on the visual interface) in correlation with the spoken audible input.
    Type: Grant
    Filed: June 10, 2009
    Date of Patent: April 2, 2013
    Assignee: Microsoft Corporation
    Inventors: Anne K. Sullivan, Lisa Stifelman, Kathleen J. Lee, Su Chuin Leong
  • Patent number: 8390669
    Abstract: The present disclosure describes a method for identifying individuals in a multimedia stream originating from a video conferencing terminal or a Multipoint Control Unit, including executing a face detection process on the multimedia stream; defining subsets including facial images of one or more individuals, where the subsets are ranked according to the probability that their respective individuals will appear in a video stream; comparing a detected face to the subsets in consecutive order, starting with the most probable subset, until a match is found; and storing an identity of the detected face as searchable metadata in a content database in response to the detected face matching a facial image in one of the subsets.
    Type: Grant
    Filed: December 15, 2009
    Date of Patent: March 5, 2013
    Assignee: Cisco Technology, Inc.
    Inventors: Jason Catchpole, Craig Cockerton
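The ranked-subset lookup in this abstract amounts to a best-first search over candidate galleries. The sketch below is a hypothetical rendering: `matches` stands in for whatever face-comparison backend is used, and the subset layout is assumed.

```python
def identify_face(face, ranked_subsets, matches):
    """ranked_subsets: subsets ordered most- to least-probable, each a
    list of (identity, reference_image) pairs. Returns the identity of
    the first match, so probable participants are checked first;
    None if no subset contains a match."""
    for subset in ranked_subsets:
        for identity, reference in subset:
            if matches(face, reference):
                return identity
    return None
```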
  • Patent number: 8392199
    Abstract: A clipping detection device calculates an amplitude distribution of an input signal for each predetermined period, calculates a deflection degree of the distribution on the basis of the calculated amplitude distribution, and then detects clipping of a communication signal on the basis of the calculated deflection degree of the distribution.
    Type: Grant
    Filed: May 21, 2009
    Date of Patent: March 5, 2013
    Assignee: Fujitsu Limited
    Inventors: Takeshi Otani, Masakiyo Tanaka, Yasuji Ota, Shusaku Ito
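The abstract does not spell out which "deflection degree" statistic is computed from the amplitude distribution. As a hedged stand-in, the sketch below flags a frame as clipped when an unusually large share of its samples sits in the topmost amplitude bin, which is where clipping piles up probability mass; the bin count and threshold are arbitrary illustrative choices, not values from the patent.

```python
def clipping_score(frame, bins=64):
    """Fraction of samples (normalized to [-1, 1]) whose magnitude
    falls in the top amplitude-histogram bin. A clean signal tapers
    off near full scale; a clipped one concentrates there."""
    top_edge = 1.0 - 1.0 / bins
    near_full = sum(1 for s in frame if abs(s) >= top_edge)
    return near_full / len(frame)

def is_clipped(frame, threshold=0.05):
    return clipping_score(frame) > threshold
```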
  • Publication number: 20130054249
    Abstract: Methods and arrangements for visually representing audio content in a voice application. A display is connected to a voice application, and an image is displayed on the display, the image comprising a main portion and at least one subsidiary portion, the main portion representing a contextual entity of the audio content and the at least one subsidiary portion representing at least one participatory entity of the audio content. The at least one subsidiary portion is displayed without text, and the image is changed responsive to changes in audio content in the voice application.
    Type: Application
    Filed: August 24, 2011
    Publication date: February 28, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Amit Anil Nanavati, Nitendra Rajput
  • Publication number: 20130054250
    Abstract: Methods and arrangements for visually representing audio content in a voice application. A display is connected to a voice application, and an image is displayed on the display, the image comprising a main portion and at least one subsidiary portion, the main portion representing a contextual entity of the audio content and the at least one subsidiary portion representing at least one participatory entity of the audio content. The at least one subsidiary portion is displayed without text, and the image is changed responsive to changes in audio content in the voice application.
    Type: Application
    Filed: August 29, 2012
    Publication date: February 28, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Amit Anil Nanavati, Nitendra Rajput
  • Patent number: 8380509
    Abstract: A speech recognition device (1) processes speech data (SD) of a dictation and establishes recognized text information (ETI) and link information (LI) of the dictation. In a synchronous playback mode of the speech recognition device (1), during acoustic playback of the dictation a correction device (10) synchronously marks the word of the recognized text information (ETI) that is related, via the link information (LI), to the speech data (SD) just played back; the currently marked word indicates the position of an audio cursor (AC). When a user of the speech recognition device (1) notices an incorrect word, he positions a text cursor (TC) at the incorrect word and corrects it. Cursor synchronization means (15) make it possible to synchronize the text cursor (TC) with the audio cursor (AC), or the audio cursor (AC) with the text cursor (TC), so that the positioning of the respective cursor (AC, TC) is simplified considerably.
    Type: Grant
    Filed: February 13, 2012
    Date of Patent: February 19, 2013
    Assignee: Nuance Communications Austria GmbH
    Inventor: Wolfgang Gschwendtner
  • Patent number: 8374873
    Abstract: Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.
    Type: Grant
    Filed: August 11, 2009
    Date of Patent: February 12, 2013
    Assignee: Morphism, LLC
    Inventor: James H. Stephens, Jr.
  • Patent number: 8374864
    Abstract: In one embodiment, a method includes receiving at a communication device an audio communication and a transcribed text created from the audio communication, and generating a mapping of the transcribed text to the audio communication independent of transcribing the audio. The mapping identifies locations of portions of the text in the audio communication. An apparatus for mapping the text to the audio is also disclosed.
    Type: Grant
    Filed: March 17, 2010
    Date of Patent: February 12, 2013
    Assignee: Cisco Technology, Inc.
    Inventor: Jim Kerr
  • Patent number: 8374879
    Abstract: Systems and methods are described for speech systems that utilize an interaction manager to manage interactions (also known as dialogues) from one or more applications. The interactions are managed properly even if multiple applications use different grammars. The interaction manager maintains an interaction list. An application wishing to utilize the speech system submits one or more interactions to the interaction manager. Interactions are normally processed in the order in which they are received. The exception is an interaction that an application configures to be processed immediately, which the interaction manager places at the front of the interaction list. If an application has designated an interaction to interrupt a currently processing interaction, the newly submitted interaction interrupts any interaction currently being processed and is therefore processed immediately.
    Type: Grant
    Filed: December 16, 2005
    Date of Patent: February 12, 2013
    Assignee: Microsoft Corporation
    Inventors: Stephen Russell Falcon, Clement Yip, Dan Banay, David Miller
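The interaction-list behaviour described above (FIFO processing, with "immediate" interactions jumping the queue) can be sketched in a few lines. The class and method names are invented for illustration, and interrupting an interaction that is already in flight is omitted.

```python
from collections import deque

class InteractionManager:
    """Queues interactions first-in first-out; an interaction
    submitted as immediate is placed at the front of the list."""
    def __init__(self):
        self._interactions = deque()

    def submit(self, interaction, immediate=False):
        if immediate:
            self._interactions.appendleft(interaction)
        else:
            self._interactions.append(interaction)

    def next_interaction(self):
        return self._interactions.popleft() if self._interactions else None
```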
  • Patent number: 8370148
    Abstract: Disclosed herein are systems, methods, and computer readable-media for answering a communication notification. The method for answering a communication notification comprises receiving a notification of communication from a user, converting information related to the notification to speech, outputting the information as speech to the user, and receiving from the user an instruction to accept or ignore the incoming communication associated with the notification. In one embodiment, information related to the notification comprises one or more of a telephone number, an area code, a geographic origin of the request, caller id, a voice message, address book information, a text message, an email, a subject line, an importance level, a photograph, a video clip, metadata, an IP address, or a domain name. In another embodiment, the notification is assigned an importance level, and notification attempts are repeated if it is of high importance.
    Type: Grant
    Filed: April 14, 2008
    Date of Patent: February 5, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst Schroeter
  • Patent number: 8370132
    Abstract: Apparatus and methods are provided for measuring perceptual quality of a signal transmitted over a communication network, such as a circuit-switching network, packet-switching network, or a combination thereof. In accordance with one embodiment, a distributed apparatus is provided for measuring perceptual quality of a signal transmitted over a communication network. The distributed apparatus includes communication ports located at various locations in the network. The distributed apparatus may also include a signal processor including a processor for providing non-intrusive measurement of the perceptual quality of the signal. The distributed apparatus may further include recorders operatively connected to the communication ports and to the signal processor, wherein at least one of the recorders processes the signal at one of the communication ports and the recorder sends the signal to the signal processor to measure the perceptual quality of the signal.
    Type: Grant
    Filed: November 21, 2005
    Date of Patent: February 5, 2013
    Assignee: Verizon Services Corp.
    Inventor: Adrian E. Conway
  • Patent number: 8355918
    Abstract: A method (10) in a speech recognition application callflow can include the steps of assigning (11) an individual option and a pre-built grammar to a same prompt, treating (15) the individual option as a valid output of the pre-built grammar if the individual option is a potential valid match to a recognition phrase (12) or an annotation (13) in the pre-built grammar, and treating (14) the individual option as an independent grammar from the pre-built grammar if the individual option fails to be a potential valid match to the recognition phrase or the annotation in the pre-built grammar.
    Type: Grant
    Filed: January 5, 2012
    Date of Patent: January 15, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Ciprian Agapi, Felipe Gomez, James R. Lewis, Vanessa V. Michelini
  • Patent number: 8346560
    Abstract: A state oriented dialog design apparatus and method facilitates creation of natural language dialogs and provides data structures for voice user interfaces. The dialog design apparatus may include inputting means for receiving a user's prompt; response generating means for generating at least one response; dialog structure generating means for structurally managing the user's input and response; and output means for outputting and displaying at least one dialog structure. A state used in the dialog design apparatus and method may include at least one system prompt and at least one response, and a linking unit may link a first state to a second state related to the first state, link the second state to a third state, and so on until certain system actions are achieved. A loop detecting unit in the dialog design apparatus and method detects and identifies loops in the dialog structure.
    Type: Grant
    Filed: May 1, 2009
    Date of Patent: January 1, 2013
    Assignee: Alpine Electronics, Inc.
    Inventors: Inci Ozkaragoz, Yan Wang, Benjamin Ao
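If the linked states are viewed as a directed graph, the loop detecting unit described above reduces to cycle detection. A minimal depth-first sketch, assuming states are represented as a name-to-successors mapping (the representation is an assumption), could look like:

```python
def find_loops(states):
    """states: dict mapping state name -> list of linked next states.
    Returns cycles found by depth-first search; the same cycle may be
    reported once per entry state, which is fine for small dialog
    graphs a designer would inspect."""
    loops = []
    def dfs(node, path, on_path):
        for nxt in states.get(node, []):
            if nxt in on_path:                 # back edge: a loop
                loops.append(path[path.index(nxt):] + [nxt])
            else:
                dfs(nxt, path + [nxt], on_path | {nxt})
    for start in states:
        dfs(start, [start], {start})
    return loops
```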