Pattern Display Patents (Class 704/276)
-
Publication number: 20140142954
Abstract: A soundtrack creation method and user playback system for soundtracks synchronized to electronic text. Synchronization is achieved by maintaining a reading speed variable indicative of the user's reading speed. The system provides for multiple channels of audio to enable concurrent playback of two or more partially or entirely overlapping audio regions so as to create an audio output having, for example, sound effects, ambience, music or other audio features that are triggered to playback at specific portions in the electronic text to enhance the reading experience.
Type: Application
Filed: January 28, 2014
Publication date: May 22, 2014
Applicant: BOOKTRACK HOLDINGS LIMITED
Inventors: PAUL CHARLES CAMERON, MARK STEVEN CAMERON, RUI ZHANG, ANDREW RUSSELL DAVENPORT, PAUL ANTHONY MCGRATH
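A minimal Python sketch of the reading-speed idea described above (an illustration, not Booktrack's patented implementation; the function name and words-per-minute figure are assumptions): given a reading speed variable, the system can estimate when the reader will reach the text position that triggers an audio region.

```python
def trigger_time_seconds(trigger_word_index, start_word_index, words_per_minute):
    """Seconds after the reader passes `start_word_index` at which the audio
    region anchored at `trigger_word_index` should begin playback, assuming
    a constant reading speed in words per minute."""
    words_to_read = trigger_word_index - start_word_index
    return words_to_read / (words_per_minute / 60.0)

# An audio region anchored 150 words ahead, at a reading speed of
# 300 words per minute, should start 30 seconds from now.
print(trigger_time_seconds(150, 0, 300))  # 30.0
```

Updating the reading-speed variable (e.g., from page turns or scrolling) would shift all pending trigger times accordingly.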
-
Patent number: 8731943
Abstract: Systems, methods and computer program products are provided for translating a natural language into music. Through systematic parsing, music compositions can be created. These compositions can be created by one or more persons who do not speak the same natural language.
Type: Grant
Filed: February 5, 2010
Date of Patent: May 20, 2014
Assignee: Little Wing World LLC
Inventors: Nicolle Ruetz, David Warhol
-
Patent number: 8725518
Abstract: A system for providing automatic quality management regarding a level of conformity to a specific accent, including a recording system, a statistical model database with statistical models representing speech data of different levels of conformity to a specific accent, a speech analysis system, and a quality management system. The recording system is adapted to record one or more samples of a speaker's speech and provide them to the speech analysis system for analysis, and the speech analysis system is adapted to provide a score for the speaker's speech samples to the quality management system by analyzing the recorded speech samples relative to the statistical models in the statistical model database.
Type: Grant
Filed: April 25, 2006
Date of Patent: May 13, 2014
Assignee: Nice Systems Ltd.
Inventors: Moshe Waserblat, Barak Eilam
-
Publication number: 20140129235
Abstract: Apparatus comprising a receiver configured to receive a first audio signal, a signal characteriser configured to determine at least one characteristic associated with the first audio signal, a comparator configured to compare the at least one characteristic against at least one characteristic associated with at least one further audio signal, and a display configured to display the at least one characteristic associated with at least one further audio signal dependent on the first audio signal characteristic.
Type: Application
Filed: June 17, 2011
Publication date: May 8, 2014
Applicant: Nokia Corporation
Inventor: Mikko Veli Aimo Suvanto
-
Patent number: 8719032
Abstract: A clear picture of who is speaking in a setting where there are multiple input sources (e.g., a conference room with multiple microphones) can be obtained by comparing input channels against each other. The data from each channel can not only be compared, but can also be organized into portions which logically correspond to statements by a user. These statements, along with information regarding who is speaking, can be presented in a user friendly format via an interactive timeline which can be updated in real time as new audio input data is received.
Type: Grant
Filed: December 11, 2013
Date of Patent: May 6, 2014
Assignee: Jefferson Audio Video Systems, Inc.
Inventors: Matthew David Bader, Nathan David Cole
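The channel-comparison idea above can be sketched in a few lines of Python (an illustrative simplification, not the patented method; per-frame energies and the frame data are assumptions): for each time frame, the channel with the strongest signal is a crude proxy for which microphone's speaker is talking.

```python
def dominant_channel_per_frame(frames):
    """frames: one tuple per time frame, one energy value per microphone
    channel. Returns the index of the loudest channel in each frame, a
    simple proxy for 'who is speaking' when speakers sit near distinct mics."""
    return [max(range(len(frame)), key=lambda ch: frame[ch]) for frame in frames]

# Three frames from a two-microphone room: a speaker near mic 0 talks,
# then a speaker near mic 1 takes over.
print(dominant_channel_per_frame([(0.9, 0.2), (0.8, 0.3), (0.1, 0.7)]))  # [0, 0, 1]
```

Runs of the same dominant channel would then be grouped into the per-speaker "statements" the abstract describes and rendered on a timeline.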
-
Patent number: 8719038
Abstract: Computerized apparatus for obtaining and displaying information, such as for example directions to a desired entity or organization. In one embodiment, the computerized apparatus is configured to receive user speech input and enable performance of various tasks, such as obtaining desired information relating to indoor entities, maps or directions, or any number of other topics. The obtained data may also, in various variants, be displayed in various formats and relative to other entities nearby.
Type: Grant
Filed: January 28, 2013
Date of Patent: May 6, 2014
Assignee: West View Research, LLC
Inventor: Robert F. Gazdzinski
-
Patent number: 8706494
Abstract: Methods and systems for providing a network-accessible text-to-speech synthesis service are provided. The service accepts content as input. After extracting textual content from the input content, the service transforms the content into a format suitable for high-quality speech synthesis. Additionally, the service produces audible advertisements, which are combined with the synthesized speech. The audible advertisements themselves can be generated from textual advertisement content.
Type: Grant
Filed: August 29, 2011
Date of Patent: April 22, 2014
Assignee: Aeromee Development L.L.C.
Inventor: James H. Stephens, Jr.
-
Patent number: 8706485
Abstract: The present invention pertains to a method and a communication device (100) for associating a contact record pertaining to a remote speaker (220) with a mnemonic image (191) based on attributes of the speaker (220). The method comprises receiving voice data of the speaker (220) in a communication session with a source device (200). A source determination representing the speaker (220) is registered, and the received voice data is then analyzed so that voice data characteristics can be extracted. Based on these voice data characteristics a mnemonic image (191) can be selected and associated with a contact record in which the source determination is stored. The mnemonic image (191) may be selected among images previously stored in the device, or derived through editing of such images.
Type: Grant
Filed: May 17, 2011
Date of Patent: April 22, 2014
Assignees: Sony Corporation, Sony Mobile Communications AB
Inventor: Joakim Martensson
-
Patent number: 8706495
Abstract: A speech recognition device (1) processes speech data (SD) of a dictation and thus establishes recognized text information (ETI) and link information (LI) of the dictation. In a synchronous playback mode of the speech recognition device (1), during acoustic playback of the dictation a correction device (10) synchronously marks the word of the recognized text information (ETI) that the link information (LI) relates to the speech data (SD) currently being played back, the marked word indicating the position of an audio cursor (AC). When a user of the speech recognition device (1) recognizes an incorrect word, he positions a text cursor (TC) at the incorrect word and corrects it.
Type: Grant
Filed: January 17, 2013
Date of Patent: April 22, 2014
Assignee: Nuance Communications, Inc.
Inventor: Wolfgang Gschwendtner
-
Patent number: 8707381
Abstract: A synchronization process between captioning data and/or corresponding metatags and the associated media file parses the media file, correlates the caption information and/or metatags with segments of the media file, and provides a capability for textual search and selection of particular segments. A time-synchronized version of the captions is created that is synchronized to the moment that the speech is uttered in the recorded media. The caption data is leveraged to enable search engines to index not merely the title of a video, but the entirety of what was said during the video as well as any associated metatags relating to contents of the video. Further, because the entire media file is indexed, a search can request a particular scene or occurrence within the event recorded by the media file, and the exact moment within the media relevant to the search can be accessed and played for the requester.
Type: Grant
Filed: September 21, 2010
Date of Patent: April 22, 2014
Assignee: Caption Colorado L.L.C.
Inventors: Richard T. Polumbus, Michael W. Homyack
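The indexing step described above can be sketched as a simple inverted index from caption words to timestamps (an illustrative sketch, not the patented process; the caption data and function names are assumptions): searching a word returns the exact moments to seek to in the media.

```python
def build_index(timed_captions):
    """timed_captions: list of (timestamp_seconds, caption_text) pairs.
    Returns an inverted index mapping each spoken word to the list of
    timestamps at which it was uttered."""
    index = {}
    for ts, text in timed_captions:
        for word in text.lower().split():
            index.setdefault(word, []).append(ts)
    return index

captions = [(12.0, "Welcome to the show"), (95.5, "Now the weather report")]
index = build_index(captions)
print(index["weather"])  # [95.5] -- the exact moment to seek to and play
```

A real system would also normalize punctuation and store segment boundaries, but the principle (index everything said, not just the title) is the same.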
-
Patent number: 8700403
Abstract: A method of statistical modeling is provided which includes constructing a statistical model and incorporating Gaussian priors during feature selection and during parameter optimization for the construction of the statistical model.
Type: Grant
Filed: November 3, 2005
Date of Patent: April 15, 2014
Assignee: Robert Bosch GmbH
Inventors: Fuliang Weng, Lin Zhao
-
Patent number: 8682672
Abstract: A system and method is described that permits synchronization of a transcript with an audio/video stream of a webcast. The system also permits a user to perform a search of the transcript and then to jump in the webcast audio/video stream to the point identified during the search.
Type: Grant
Filed: September 17, 2004
Date of Patent: March 25, 2014
Assignee: ON24, Inc.
Inventors: Tommy Ha, Kamalaksha Ghosh
-
Patent number: 8676590
Abstract: A computer-implemented technique for transcribing audio data includes generating, along a vertical axis on a display of a client device, an image representing audio content. The technique further includes receiving, from a user of the client device, a selection of a portion of the image; and generating, via an audio module of the client device, an audio output corresponding to the selected portion of the image. The technique further includes receiving, from the user, a selection indicating a position along the vertical axis on the display to enter a text portion representing the audio output, wherein the position is aligned to the selected portion of the image. The technique further includes receiving, from the user, the text portion representing the audio output; and displaying, on the display, the text portion at the position, wherein the text portion extends along a horizontal axis on the display.
Type: Grant
Filed: September 26, 2012
Date of Patent: March 18, 2014
Assignee: Google Inc.
Inventors: Jeffrey Scott Sorensen, Masayuki Nanzawa, Ravindran Rajakumar
-
Patent number: 8674996
Abstract: A system for controlling a rendering engine by using specialized commands. The commands are used to generate a production, such as a television show, at an end-user's computer that executes the rendering engine. In one embodiment, the commands are sent over a network, such as the Internet, to achieve broadcasts of video programs at very high compression and efficiency. Commands for setting and moving camera viewpoints, animating characters, and defining or controlling scenes and sounds are described. At a fine level of control, math models and coordinate systems can be used to make specifications. At a coarse level of control, the command language approaches the text format traditionally used in television or movie scripts. Simple names for objects within a scene are used to identify items, directions and paths. Commands are further simplified by having the rendering engine use defaults when specifications are left out.
Type: Grant
Filed: October 27, 2008
Date of Patent: March 18, 2014
Assignee: Quonsil PL. 3, LLC
Inventor: Charles J. Kulas
-
Patent number: 8666749
Abstract: The disclosure includes a system and method for generating audio snippets from a subset of audio tracks. In some embodiments an audio snippet is an audio summary of a group or collection of songs.
Type: Grant
Filed: January 17, 2013
Date of Patent: March 4, 2014
Assignee: Google Inc.
Inventors: Amarnag Subramanya, Jennifer Gillenwater, Garth Griffin, Fernando Pereira, Douglas Eck
-
Patent number: 8655662
Abstract: Disclosed herein are systems, methods, and computer-readable media for answering a communication notification. The method for answering a communication notification comprises receiving a notification of communication from a user, converting information related to the notification to speech, outputting the information as speech to the user, and receiving from the user an instruction to accept or ignore the incoming communication associated with the notification. In one embodiment, information related to the notification comprises one or more of a telephone number, an area code, a geographic origin of the request, caller id, a voice message, address book information, a text message, an email, a subject line, an importance level, a photograph, a video clip, metadata, an IP address, or a domain name. Another embodiment involves assigning the notification an importance level and repeating notification attempts if it is of high importance.
Type: Grant
Filed: November 29, 2012
Date of Patent: February 18, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Horst Schroeter
-
Patent number: 8655667
Abstract: A software and/or hardware facility for inferring user context and delivering advertisements, such as coupons, using natural language and/or sentiment analysis is disclosed. The facility may infer context information based on a user's emotional state, attitude, needs, or intent from the user's interaction with or through a mobile device. The facility may then determine whether it is appropriate to deliver an advertisement to the user and select an advertisement for delivery. The facility may also determine an appropriate expiration time and/or discount amount for the advertisement.
Type: Grant
Filed: November 19, 2012
Date of Patent: February 18, 2014
Assignee: Microsoft Corporation
Inventors: Raman Chandrasekar, Eric I-Chao Chang, Michael Tsang, Tian Bai
-
Patent number: 8639512
Abstract: A computer-implemented system and method for evaluating the performance of a user using a dictation system is provided. The system and method include receiving a text or transcription file generated from user audio. A performance metric, such as, for example, words/minute or errors, is generated based on the transcription file. The performance metric is provided to an administrator so the administrator can evaluate the performance of the user using the dictation system.
Type: Grant
Filed: April 21, 2009
Date of Patent: January 28, 2014
Assignee: nVoq Incorporated
Inventors: Brian Marquette, Charles Corfield, Todd Espy
-
Patent number: 8634708
Abstract: The invention relates to a method for creating a new roundup of an audiovisual document previously recorded in a device. The document contains two parts, one being the roundup and the other composed of a plurality of reports. The roundup is itself divided into a plurality of parts. The device first searches for the associations between the roundup parts and the reports, and detects the reports that are not associated with roundup parts. Then, summaries are created for the reports not associated with the roundup, and incorporated into the initial roundup to create a new roundup. In this manner, the user can easily select any report from the roundup part associated with this report. The invention also relates to the receiver suitable for implementing the method.
Type: Grant
Filed: December 20, 2007
Date of Patent: January 21, 2014
Assignee: Thomson Licensing
Inventors: Louis Chevallier, Claire-Helene Demarty, Lionel Oisel
-
Patent number: 8635075
Abstract: A system is configured to enable a user to assert voice-activated commands. When the user issues a non-ambiguous command, the system activates a corresponding control. The area of activity on the user interface is visually highlighted to emphasize to the user that what they spoke caused an action. In one specific embodiment, the highlighting involves floating text the user uttered to a visible user interface component.
Type: Grant
Filed: October 12, 2009
Date of Patent: January 21, 2014
Assignee: Microsoft Corporation
Inventor: Felix Andrew
-
Patent number: 8630860
Abstract: Techniques disclosed herein include systems and methods for open-domain voice-enabled searching that is speaker sensitive. Techniques include using speech information, speaker information, and information associated with a spoken query to enhance open voice search results. This includes integrating a textual index with a voice index to support the entire search cycle. Given a voice query, the system can execute two matching processes simultaneously. This can include a text matching process based on the output of speech recognition, as well as a voice matching process based on characteristics of a caller or user voicing a query. Characteristics of the caller can include output of voice feature extraction and metadata about the call. The system clusters callers according to these characteristics. The system can use specific voice and text clusters to modify speech recognition results, as well as modifying search results.
Type: Grant
Filed: March 3, 2011
Date of Patent: January 14, 2014
Assignee: Nuance Communications, Inc.
Inventors: Shilei Zhang, Shenghua Bao, Wen Liu, Yong Qin, Zhiwei Shuang, Jian Chen, Zhong Su, Qin Shi, William F. Ganong, III
-
Publication number: 20140012586
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining hotword suitability. In one aspect, a method includes receiving speech data that encodes a candidate hotword spoken by a user, evaluating the speech data or a transcription of the candidate hotword using one or more predetermined criteria, generating a hotword suitability score for the candidate hotword based on that evaluation, and providing a representation of the hotword suitability score for display to the user.
Type: Application
Filed: August 6, 2012
Publication date: January 9, 2014
Applicant: GOOGLE INC.
Inventors: Andrew E. Rubin, Johan Schalkwyk, Maria Carolina Parada San Martin
-
Patent number: 8626493
Abstract: Sounds are inserted into audio content according to a pattern. A library stores humanly perceptible voice sounds. Pattern control information is received that is associated with a device recording the audio content. A pattern is retrieved and washing machine sounds are inserted into the audio content according to the pattern. The humanly perceptible voice sounds are inserted into the audio content according to the pattern to generate a signed audio recording.
Type: Grant
Filed: April 26, 2013
Date of Patent: January 7, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Steven N. Tischer
-
Patent number: 8612228
Abstract: A section corresponding to a given duration is sampled from sound data that indicates the voice of a player collected by a microphone, and a vocal tract cross-sectional area function of the sampled section is calculated. The vertical dimension of the mouth is calculated from a throat-side average cross-sectional area of the vocal tract cross-sectional area function, and the area of the mouth is calculated from a mouth-side average cross-sectional area. The transverse dimension of the mouth is calculated from the area of the mouth and the vertical dimension of the mouth.
Type: Grant
Filed: March 26, 2010
Date of Patent: December 17, 2013
Assignee: Namco Bandai Games Inc.
Inventor: Hiroyuki Hiraishi
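The arithmetic chain above can be sketched as follows (an illustrative sketch only; the mapping from throat-side area to vertical dimension is an assumption, not the patent's actual formula): the final step, transverse dimension from mouth area and vertical dimension, follows directly from area = vertical x transverse.

```python
def mouth_shape(throat_side_avg_area, mouth_side_avg_area):
    """Derives (vertical, transverse) mouth dimensions from two averaged
    regions of a vocal tract cross-sectional area function.
    The square-root mapping for the vertical dimension is a placeholder
    assumption; the area/vertical division mirrors the abstract's last step."""
    vertical = throat_side_avg_area ** 0.5          # assumed mapping
    mouth_area = mouth_side_avg_area                # per the abstract
    transverse = mouth_area / vertical              # area = vertical * transverse
    return vertical, transverse

print(mouth_shape(4.0, 6.0))  # (2.0, 3.0)
```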
-
Patent number: 8600763
Abstract: Whenever an event occurs on a computing system which will accept a response from a user of the system, the system automatically determines whether or not to enable speech interaction with the system for the event response. Whenever speech interaction is enabled with the system for the event response, the system provides a notification to the user which informs the user of the event and their options for responding thereto, where these options include responding verbally. Whenever the user responds within a prescribed period of time via a voice command (VC), the system attempts to recognize the VC. Whenever the VC is successfully recognized, the system responds appropriately to the VC.
Type: Grant
Filed: June 4, 2010
Date of Patent: December 3, 2013
Assignee: Microsoft Corporation
Inventors: Alice Jane Bernheim Brush, Paul Johns, Jen Anderson, Connie Missimer, Seung Yang, Jean Ku
-
Patent number: 8588378
Abstract: A computer-implemented voice mail method includes obtaining an electronic audio file of a recorded user message directed to a telephone user, automatically generating a transcript of the recorded user message, and identifying locations in the transcript in coordination with timestamps in the recorded user message so that successive portions of the transcript can be highlighted in coordination with playing of the recorded user message. The method also includes identifying one or more characteristics of the message using metadata relating to the recorded user message, and storing the recorded user message and information about the identified locations of the recorded user message.
Type: Grant
Filed: July 15, 2010
Date of Patent: November 19, 2013
Assignee: Google Inc.
Inventors: Benedict Davies, Christian Brunschen
-
Patent number: 8583434
Abstract: Computer-implemented methods and apparatus are provided to facilitate the recognition of the content of a body of speech data. In one embodiment, a method for analyzing verbal communication is provided, comprising acts of producing an electronic recording of a plurality of spoken words; processing the electronic recording to identify a plurality of word alternatives for each of the spoken words, each of the plurality of word alternatives being identified by comparing a portion of the electronic recording with a lexicon, and each of the plurality of word alternatives being assigned a probability of correctly identifying a spoken word; loading the word alternatives and the probabilities to a database for subsequent analysis; and examining the word alternatives and the probabilities to determine at least one characteristic of the plurality of spoken words.
Type: Grant
Filed: January 29, 2008
Date of Patent: November 12, 2013
Assignee: CallMiner, Inc.
Inventor: Jeffrey A. Gallino
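The word-alternatives structure described above can be sketched as a list of candidate sets with probabilities (an illustration, not CallMiner's implementation; the sample words and probabilities are assumptions): each spoken-word position carries several hypotheses, and later analysis can pick the most probable path or query the alternatives.

```python
# One candidate list per spoken-word position, each entry (word, probability).
alternatives = [
    [("please", 0.7), ("police", 0.3)],
    [("hold", 0.9), ("old", 0.1)],
]

def best_transcript(alternatives):
    """Greedy reading: take the highest-probability candidate at each position."""
    return " ".join(max(alts, key=lambda wp: wp[1])[0] for alts in alternatives)

print(best_transcript(alternatives))  # please hold
```

Keeping the lower-probability alternatives in the database is what lets a later search for "police" still find this recording.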
-
Patent number: 8560317
Abstract: A vocabulary dictionary storing unit for storing a plurality of words in advance, a vocabulary dictionary managing unit for extracting recognition target words, a matching unit for calculating a degree of matching with the recognition target words based on an accepted voice, a result output unit for outputting, as a recognition result, a word having a best score from a result of calculating the degree of matching, and an extraction criterion information managing unit for changing extraction criterion information according to a result of monitoring by a monitor control unit are provided. The vocabulary dictionary storing unit further includes a scale information storing unit for storing scale information serving as a scale at the time of extracting the recognition target words, and an extraction criterion information storing unit for storing extraction criterion information indicating a criterion of the recognition target words at the time of extracting the recognition target words.
Type: Grant
Filed: September 18, 2006
Date of Patent: October 15, 2013
Assignee: Fujitsu Limited
Inventor: Kenji Abe
-
Patent number: 8554566
Abstract: Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.
Type: Grant
Filed: November 29, 2012
Date of Patent: October 8, 2013
Assignee: Morphism LLC
Inventor: James H. Stephens, Jr.
-
Patent number: 8527279
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving geographical information derived from a non-verbal user action associated with a first computing device. The non-verbal user action implies an interest of a user in a geographic location. The method also includes identifying a grammar associated with the geographic location using the derived geographical information and outputting a grammar indicator for use in selecting the identified grammar for voice recognition processing of vocal input from the user.
Type: Grant
Filed: August 23, 2012
Date of Patent: September 3, 2013
Assignee: Google Inc.
Inventors: David P. Singleton, Debajit Ghosh
-
Publication number: 20130226593
Abstract: An apparatus comprising: an audio source determiner configured to determine at least one audio source; a visualizer configured to generate a visual representation associated with the at least one audio source; and a controller configured to process an audio signal associated with the at least one audio source dependent on interaction with the visual representation.
Type: Application
Filed: November 12, 2010
Publication date: August 29, 2013
Applicant: Nokia Corporation
Inventors: Birgir Magnusson, Koray Ozcan
-
Patent number: 8515753
Abstract: The example embodiment of the present invention provides an acoustic model adaptation method for enhancing recognition performance for a non-native speaker's speech. In order to adapt acoustic models, first, pronunciation variations are examined by analyzing a non-native speaker's speech. Thereafter, based on the pronunciation variations of a non-native speaker's speech, acoustic models are adapted in a state-tying step during a training process of acoustic models. When the present invention for adapting acoustic models and a conventional acoustic model adaptation scheme are combined, further enhanced recognition performance can be obtained. The example embodiment of the present invention enhances recognition performance for a non-native speaker's speech while reducing the degradation of recognition performance for a native speaker's speech.
Type: Grant
Filed: March 30, 2007
Date of Patent: August 20, 2013
Assignee: Gwangju Institute of Science and Technology
Inventors: Hong Kook Kim, Yoo Rhee Oh, Jae Sam Yoon
-
Patent number: 8494852
Abstract: The subject matter of this specification can be implemented in, among other things, a computer-implemented method for correcting words in transcribed text including receiving speech audio data from a microphone. The method further includes sending the speech audio data to a transcription system. The method further includes receiving a word lattice transcribed from the speech audio data by the transcription system. The method further includes presenting one or more transcribed words from the word lattice. The method further includes receiving a user selection of at least one of the presented transcribed words. The method further includes presenting one or more alternate words from the word lattice for the selected transcribed word. The method further includes receiving a user selection of at least one of the alternate words. The method further includes replacing the selected transcribed word in the presented transcribed words with the selected alternate word.
Type: Grant
Filed: October 27, 2010
Date of Patent: July 23, 2013
Assignee: Google Inc.
Inventors: Michael J. LeBeau, William J. Byrne, John Nicholas Jitkoff, Brandon M. Ballinger, Trausti Kristjansson
-
Patent number: 8494668
Abstract: Character value of a sound signal is extracted for each unit portion, and degrees of similarity between the character values of the individual unit portions are calculated and arranged in a matrix configuration. The matrix has arranged in each column the degrees of similarity acquired by comparing, for each of the unit portions, the sound signal and a delayed sound signal obtained by delaying the sound signal by a time difference equal to an integral multiple of a time length of the unit portion, and it has a plurality of the columns in association with different time differences. Repetition probability is calculated for each of the columns corresponding to the different time differences in the matrix. A plurality of peaks in a distribution of the repetition probabilities are identified. The loop region in the sound signal is identified by collating a reference matrix with the degree of similarity matrix.
Type: Grant
Filed: February 19, 2009
Date of Patent: July 23, 2013
Assignee: Yamaha Corporation
Inventors: Bee Suan Ong, Sebastian Streich, Takuya Fujishima, Keita Arimoto
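The lag-matrix structure described above can be sketched in Python (an illustrative simplification, not Yamaha's patented method; the similarity function and feature values are assumptions): column j holds, for each unit portion, the similarity between the signal and a copy of itself delayed by (j+1) unit portions, so a loop of period p shows up as a column of high similarities at lag p.

```python
def lag_similarity_matrix(features, max_lag):
    """features: one character value per unit portion of the signal.
    Entry [i][j] is the similarity between unit i and the unit (j+1)
    portions earlier (0.0 where the delayed copy runs off the start)."""
    def sim(a, b):
        return 1.0 / (1.0 + abs(a - b))  # assumed similarity measure
    n = len(features)
    return [[sim(features[i], features[i - (j + 1)]) if i - (j + 1) >= 0 else 0.0
             for j in range(max_lag)] for i in range(n)]

# A signal repeating with period 2 scores perfect similarity at lag 2
# (column index 1), which is where the repetition-probability peak appears.
m = lag_similarity_matrix([1, 5, 1, 5, 1, 5], max_lag=3)
print(m[4][1])  # 1.0
```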
-
Arrangement for creating and using a phonetic-alphabet representation of a name of a party to a call
Patent number: 8484034
Abstract: A first party creates and edits a phonetic-alphabet representation of its name. The phonetic representation is conveyed to a second party as "caller-identification" information by messages that set up a call between the parties. The phonetic representation of the name is displayed to the second party, converted to speech, and/or converted to an alphabet of a language of the second party and then displayed to the second party.
Type: Grant
Filed: March 31, 2008
Date of Patent: July 9, 2013
Assignee: Avaya Inc.
Inventors: Paul Roller Michaelis, David Mohler, Charles Wrobel
-
Patent number: 8478590
Abstract: The subject matter of this specification can be implemented in, among other things, a computer-implemented method for correcting words in transcribed text including receiving speech audio data from a microphone. The method further includes sending the speech audio data to a transcription system. The method further includes receiving a word lattice transcribed from the speech audio data by the transcription system. The method further includes presenting one or more transcribed words from the word lattice. The method further includes receiving a user selection of at least one of the presented transcribed words. The method further includes presenting one or more alternate words from the word lattice for the selected transcribed word. The method further includes receiving a user selection of at least one of the alternate words. The method further includes replacing the selected transcribed word in the presented transcribed words with the selected alternate word.
Type: Grant
Filed: September 30, 2011
Date of Patent: July 2, 2013
Assignee: Google Inc.
Inventors: Michael J. LeBeau, William J. Byrne, John Nicholas Jitkoff, Brandon M. Ballinger, Trausti Kristjansson
-
Patent number: 8452604
Abstract: Recognizable visual and/or audio artifacts, such as recognizable sounds, are introduced into visual and/or audio content in an identifying pattern to generate a signed visual and/or audio recording for distribution over a digital communications medium. A library of images and/or sounds may be provided, and the images and/or sounds from the library may be selectively inserted to generate the identifying pattern. The images and/or sounds may be inserted responsive to one or more parameters associated with creation of the visual and/or audio content. A representation of the identifying pattern may be generated and stored in a repository, e.g., an independent repository configured to maintain creative rights information. The stored pattern may be retrieved from the repository and compared to an unidentified visual and/or audio recording to determine an identity thereof.
Type: Grant
Filed: August 15, 2005
Date of Patent: May 28, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Steven Tischer
-
Publication number: 20130124213
Abstract: Provided in some embodiments is a computer implemented method that includes providing script data including script words indicative of dialogue words to be spoken, providing audio data corresponding to at least a portion of the dialogue words to be spoken, wherein the audio data includes timecodes associated with dialogue words, generating a sequential alignment of the script words to the dialogue words, matching at least some of the script words to corresponding dialogue words to determine alignment points, determining corresponding timecodes for unmatched script words using interpolation based on the timecodes associated with matching script words, and generating time-aligned script data including the script words and their corresponding timecodes.
Type: Application
Filed: May 28, 2010
Publication date: May 16, 2013
Inventors: Jerry R. Scoggins, II, Walter W. Chang
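The interpolation step described above can be sketched in Python (an illustrative sketch, not Adobe's patented method; the function name and sample timecodes are assumptions): script words matched to timecoded dialogue words act as anchors, and words between two anchors receive linearly interpolated timecodes.

```python
def interpolate_timecodes(script_len, anchors):
    """anchors: dict mapping script-word index -> known timecode (seconds)
    for words matched to timecoded dialogue words. Unmatched indices lying
    between two anchors get linearly interpolated timecodes; indices outside
    the anchored span stay None."""
    times = dict(anchors)
    keys = sorted(anchors)
    for lo, hi in zip(keys, keys[1:]):
        span = hi - lo
        for i in range(lo + 1, hi):
            frac = (i - lo) / span
            times[i] = anchors[lo] + frac * (anchors[hi] - anchors[lo])
    return [times.get(i) for i in range(script_len)]

# Script words 0 and 4 were matched at 10.0 s and 18.0 s; words 1-3
# are spaced evenly between those anchors.
print(interpolate_timecodes(5, {0: 10.0, 4: 18.0}))  # [10.0, 12.0, 14.0, 16.0, 18.0]
```

Linear interpolation assumes a roughly constant speaking rate between anchors, which is why denser matching yields better time-aligned scripts.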
-
Publication number: 20130124202
Abstract: Provided in some embodiments is a method including receiving ordered script words that are indicative of dialogue words to be spoken, receiving audio data corresponding to at least a portion of the dialogue words to be spoken and including timecodes associated with dialogue words, generating a matrix of the ordered script words versus the dialogue words, aligning the matrix to determine hard alignment points that include matching consecutive sequences of ordered script words with corresponding sequences of dialogue words, partitioning the matrix of ordered script words into sub-matrices bounded by adjacent hard-alignment points and including corresponding sub-sets of the script and dialogue words between the hard-alignment points, and aligning each of the sub-matrices.
Type: Application
Filed: May 28, 2010
Publication date: May 16, 2013
Inventor: Walter W. Chang
-
Publication number: 20130124212
Abstract: A method includes receiving script data including script words for dialogue, receiving audio data corresponding to at least a portion of the dialogue, wherein the audio data includes timecodes associated with dialogue words, generating a sequential alignment of the script words to the dialogue words, matching at least some of the script words to corresponding dialogue words to determine hard alignment points, partitioning the sequential alignment of script words into alignment sub-sets, wherein the bounds of the alignment sub-sets are defined by adjacent hard-alignment points, and wherein each alignment sub-set includes a sub-set of the script words and a corresponding sub-set of dialogue words that occur between the hard-alignment points, determining corresponding timecodes for the sub-set of script words in an alignment sub-set based on the timecodes associated with the sub-set of dialogue words, and generating time-aligned script data including the sub-set of script words and their corresponding timecodes.
Type: Application
Filed: May 28, 2010
Publication date: May 16, 2013
Inventors: Jerry R. Scoggins, II, Walter W. Chang, David A. Kuspa, Charles E. Van Winkle, Simon R. Hayhurst
-
Patent number: 8412531
Abstract: The present invention provides a user interface for press-to-talk interaction via a touch-anywhere-to-speak module on a mobile computing device. Upon receiving an indication of a touch anywhere on the screen of a touch screen interface, the touch-anywhere-to-speak module activates the listening mechanism of a speech recognition module to accept audible user input and displays dynamic visual feedback of a measured sound level of the received audible input. The touch-anywhere-to-speak module may also provide a user with a convenient and more accurate speech recognition experience by utilizing and applying data relative to the context of the touch (e.g., its relative location on the visual interface) in correlation with the spoken audible input.
Type: Grant
Filed: June 10, 2009
Date of Patent: April 2, 2013
Assignee: Microsoft Corporation
Inventors: Anne K. Sullivan, Lisa Stifelman, Kathleen J. Lee, Su Chuin Leong
-
Patent number: 8392199
Abstract: A clipping detection device calculates an amplitude distribution of an input signal for each predetermined period, calculates a deflection degree of the distribution on the basis of the calculated amplitude distribution, and then detects clipping of a communication signal on the basis of the calculated deflection degree of the distribution.
Type: Grant
Filed: May 21, 2009
Date of Patent: March 5, 2013
Assignee: Fujitsu Limited
Inventors: Takeshi Otani, Masakiyo Tanaka, Yasuji Ota, Shusaku Ito
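The general approach, an amplitude histogram per frame plus a skew statistic, can be sketched in a few lines. The specific "deflection degree" statistic is not defined in the abstract, so the top-bin mass used below is only an illustrative stand-in: a clipped signal piles probability mass at full scale.

```python
import math

def clipping_score(frame, bins=64):
    """Histogram-based clipping indicator for one signal frame.

    Builds the amplitude distribution of |x| over [0, 1] and returns
    the fraction of samples landing in the top amplitude bin. This is
    an assumed proxy for the patent's 'deflection degree', which the
    abstract does not spell out.
    """
    hist = [0] * bins
    for x in frame:
        b = min(int(abs(x) * bins), bins - 1)
        hist[b] += 1
    return hist[-1] / len(frame)

n = 8000
# a clean 440 Hz tone at 60% of full scale, and the same tone driven
# to 2x full scale and hard-limited (i.e., clipped)
clean = [0.6 * math.sin(2 * math.pi * 440 * i / n) for i in range(n)]
clipped = [max(-1.0, min(1.0, 2.0 * math.sin(2 * math.pi * 440 * i / n)))
           for i in range(n)]
```

A threshold on the score (frames above it flagged as clipped) would complete the detector; a real system would also need frame segmentation and smoothing across frames.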
-
Patent number: 8390669
Abstract: The present disclosure discloses a method for identifying individuals in a multimedia stream originating from a video conferencing terminal or a Multipoint Control Unit, including executing a face detection process on the multimedia stream; defining subsets including facial images of one or more individuals, where the subsets are ranked according to a probability that their respective one or more individuals will appear in a video stream; comparing a detected face to the subsets in consecutive order starting with a most probable subset, until a match is found; and storing an identity of the detected face as searchable metadata in a content database in response to the detected face matching a facial image in one of the subsets.
Type: Grant
Filed: December 15, 2009
Date of Patent: March 5, 2013
Assignee: Cisco Technology, Inc.
Inventors: Jason Catchpole, Craig Cockerton
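The ranked-subset search loop reads naturally as code. Everything below is an illustrative sketch: the matcher, threshold, and data shapes are assumptions, and a real system would compare face embeddings rather than toy feature tuples.

```python
def identify_face(detected, ranked_subsets, matcher, threshold=0.8):
    """Compare a detected face against subsets of known facial images,
    ordered from most to least probable, stopping at the first match.

    matcher(a, b) is assumed to return a similarity score in [0, 1];
    the matched identity would then be stored as searchable metadata.
    """
    for subset in ranked_subsets:  # already ranked by appearance probability
        for identity, image in subset:
            if matcher(detected, image) >= threshold:
                return identity
    return None  # unknown face

# toy matcher: exact equality of tiny feature tuples
similarity = lambda a, b: 1.0 if a == b else 0.0
subsets = [
    [("alice", (1, 2)), ("bob", (3, 4))],  # frequent participants, tried first
    [("carol", (5, 6))],                   # rare participants, tried last
]
who = identify_face((5, 6), subsets, similarity)
```

The payoff of the ranking is that most lookups terminate inside the first, most probable subset instead of scanning the whole gallery.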
-
Publication number: 20130054249
Abstract: Methods and arrangements for visually representing audio content in a voice application. A display is connected to a voice application, and an image is displayed on the display, the image comprising a main portion and at least one subsidiary portion, the main portion representing a contextual entity of the audio content and the at least one subsidiary portion representing at least one participatory entity of the audio content. The at least one subsidiary portion is displayed without text, and the image is changed responsive to changes in audio content in the voice application.
Type: Application
Filed: August 24, 2011
Publication date: February 28, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Amit Anil Nanavati, Nitendra Rajput
-
Publication number: 20130054250
Abstract: Methods and arrangements for visually representing audio content in a voice application. A display is connected to a voice application, and an image is displayed on the display, the image comprising a main portion and at least one subsidiary portion, the main portion representing a contextual entity of the audio content and the at least one subsidiary portion representing at least one participatory entity of the audio content. The at least one subsidiary portion is displayed without text, and the image is changed responsive to changes in audio content in the voice application.
Type: Application
Filed: August 29, 2012
Publication date: February 28, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Amit Anil Nanavati, Nitendra Rajput
-
Patent number: 8380509
Abstract: A speech recognition device (1) processes speech data (SD) of a dictation and establishes recognized text information (ETI) and link information (LI) of the dictation. In a synchronous playback mode of the speech recognition device (1), during acoustic playback of the dictation a correction device (10) synchronously marks the word of the recognized text information (ETI) that is related by link information (LI) to the speech data (SD) just played back; the currently marked word indicates the position of an audio cursor (AC). When a user of the speech recognition device (1) recognizes an incorrect word, he positions a text cursor (TC) at the incorrect word and corrects it. Cursor synchronization means (15) makes it possible to synchronize the text cursor (TC) with the audio cursor (AC), or the audio cursor (AC) with the text cursor (TC), so that positioning the respective cursor (AC, TC) is simplified considerably.
Type: Grant
Filed: February 13, 2012
Date of Patent: February 19, 2013
Assignee: Nuance Communications Austria GmbH
Inventor: Wolfgang Gschwendtner
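One plausible reading of the link information is a table of word start times, which makes both cursor synchronizations simple lookups. The class name, data layout, and use of a binary search are assumptions for illustration; the patent describes the mechanism only abstractly.

```python
import bisect

class CursorSync:
    """Link information as a sorted list: word i of the recognized
    text starts at start_times[i] seconds in the dictation audio."""

    def __init__(self, start_times):
        self.start_times = start_times

    def audio_to_text(self, audio_pos):
        """Index of the word being played back at audio_pos
        (i.e., move the text cursor to the audio cursor)."""
        return bisect.bisect_right(self.start_times, audio_pos) - 1

    def text_to_audio(self, word_index):
        """Playback position for the word under the text cursor
        (i.e., move the audio cursor to the text cursor)."""
        return self.start_times[word_index]

# four recognized words starting at these offsets in the dictation
sync = CursorSync([0.0, 0.4, 0.9, 1.6])
```

At playback position 1.0 s the third word (index 2) is current, and jumping the audio cursor to the word at index 3 seeks to 1.6 s.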
-
Patent number: 8374879
Abstract: Systems and methods are described for speech systems that utilize an interaction manager to manage interactions—also known as dialogues—from one or more applications. The interactions are managed properly even if multiple applications use different grammars. The interaction manager maintains an interaction list. An application wishing to utilize the speech system submits one or more interactions to the interaction manager. Interactions are normally processed in the order in which they are received. An exception to this rule is an interaction that is configured by an application to be processed immediately, which causes the interaction manager to place the interaction at the front of the interaction list. If an application has designated an interaction to interrupt a currently processing interaction, then the newly submitted interaction will interrupt any interaction currently being processed and, therefore, will be processed immediately.
Type: Grant
Filed: December 16, 2005
Date of Patent: February 12, 2013
Assignee: Microsoft Corporation
Inventors: Stephen Russell Falcon, Clement Yip, Dan Banay, David Miller
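The queue discipline described here (FIFO by default, front-of-list for immediate interactions) can be sketched with a deque. The class and method names are illustrative assumptions, and the interrupt-the-current-interaction behavior is omitted for brevity.

```python
from collections import deque

class InteractionManager:
    """Maintains the interaction list: interactions are processed in
    arrival order unless an application flags one as immediate, which
    places it at the front of the list."""

    def __init__(self):
        self.interactions = deque()

    def submit(self, interaction, immediate=False):
        if immediate:
            self.interactions.appendleft(interaction)
        else:
            self.interactions.append(interaction)

    def next_interaction(self):
        return self.interactions.popleft() if self.interactions else None

mgr = InteractionManager()
mgr.submit("confirm-address")
mgr.submit("read-email")
mgr.submit("emergency-alert", immediate=True)  # jumps the queue
order = [mgr.next_interaction() for _ in range(3)]
```

The immediate flag reorders only the pending list; actually interrupting an in-progress dialogue would additionally require preempting whatever the speech system is currently rendering.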
-
Patent number: 8374864
Abstract: In one embodiment, a method includes receiving at a communication device an audio communication and a transcribed text created from the audio communication, and generating a mapping of the transcribed text to the audio communication independent of transcribing the audio. The mapping identifies locations of portions of the text in the audio communication. An apparatus for mapping the text to the audio is also disclosed.
Type: Grant
Filed: March 17, 2010
Date of Patent: February 12, 2013
Assignee: Cisco Technology, Inc.
Inventor: Jim Kerr
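One simple way to map text to audio without re-transcribing is to spread the words over the audio duration in proportion to their length. This is a crude illustrative stand-in; the abstract does not disclose how the patented mapping is actually computed.

```python
def estimate_word_positions(text, audio_duration):
    """Map each word of a transcript to an estimated start position in
    the audio, proportional to cumulative character length. Purely an
    assumed heuristic: no speech recognition is involved, matching the
    'independent of transcribing the audio' constraint.
    """
    words = text.split()
    total = sum(len(w) for w in words)
    positions, elapsed = [], 0
    for w in words:
        positions.append((w, audio_duration * elapsed / total))
        elapsed += len(w)
    return positions

# a 6-second clip whose transcript is three words
pos = estimate_word_positions("hello out there", 6.0)
```

Such a proportional estimate is only approximate (it ignores pauses and speaking rate), but it is cheap and needs nothing beyond the text and the audio length.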
-
Patent number: 8374873
Abstract: Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.
Type: Grant
Filed: August 11, 2009
Date of Patent: February 12, 2013
Assignee: Morphism, LLC
Inventor: James H. Stephens, Jr.
-
Patent number: 8370148
Abstract: Disclosed herein are systems, methods, and computer-readable media for answering a communication notification. The method for answering a communication notification comprises receiving a notification of a communication from a user, converting information related to the notification to speech, outputting the information as speech to the user, and receiving from the user an instruction to accept or ignore the incoming communication associated with the notification. In one embodiment, information related to the notification comprises one or more of a telephone number, an area code, a geographic origin of the request, caller ID, a voice message, address book information, a text message, an email, a subject line, an importance level, a photograph, a video clip, metadata, an IP address, or a domain name. In another embodiment, a notification is assigned an importance level and notification attempts are repeated if the notification is of high importance.
Type: Grant
Filed: April 14, 2008
Date of Patent: February 5, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Horst Schroeter