Speech Patents (Class 434/185)
  • Patent number: 9285513
    Abstract: A display apparatus at least comprises a touchscreen, and a symmetric diffusion film (SDF) disposed above the touchscreen. The SDF comprises at least two different materials, including a first material mixed with a second material. The first material has a first refractive index and the second material has a second refractive index, and the first refractive index is different from the second refractive index.
    Type: Grant
    Filed: August 26, 2013
    Date of Patent: March 15, 2016
    Assignee: INNOLUX CORPORATION
    Inventors: Minoru Shibazaki, Kazuyuki Hashimoto, Yoshitaka Haruyama
  • Patent number: 9240188
    Abstract: A system and method for expressive language development; a method for detecting autism in a natural language environment using a microphone, sound recorder, and a computer programmed with software for the specialized purpose of processing recordings captured by the microphone and sound recorder combination. The computer is programmed to execute a method that includes segmenting an audio signal captured by the microphone and sound recorder combination into a plurality of recording segments. The method further includes determining which of the plurality of recording segments correspond to a key child. The method also includes extracting acoustic parameters of the key child recordings and comparing the acoustic parameters of the key child recordings to known acoustic parameters for children. The method returns a determination of a likelihood of autism.
    Type: Grant
    Filed: January 23, 2009
    Date of Patent: January 19, 2016
    Assignee: Lena Foundation
    Inventors: Terrance D. Paul, Dongxin D. Xu, Sharmistha S. Gray, Umit Yapanel, Jill S. Gilkerson, Jeffrey A. Richards
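The detection step in this abstract reduces to comparing a key child's extracted acoustic parameters against known reference values for children. A minimal Python sketch of that comparison, with invented parameter names, values, and reference ranges (none of them are from the patent):

```python
# Hypothetical sketch of the comparison step: segment-level acoustic
# parameters for a "key child" are checked against reference ranges for
# typically developing children, yielding a crude likelihood score.

def autism_likelihood(child_params, reference_ranges):
    """Return the fraction of acoustic parameters that fall outside
    the reference range for typically developing children."""
    outside = 0
    for name, value in child_params.items():
        low, high = reference_ranges[name]
        if not (low <= value <= high):
            outside += 1
    return outside / len(child_params)

# Illustrative numbers only.
params = {"mean_pitch_hz": 410.0, "utterance_rate_per_min": 2.1}
ranges = {"mean_pitch_hz": (250.0, 400.0), "utterance_rate_per_min": (3.0, 12.0)}
print(autism_likelihood(params, ranges))  # both outside their range -> 1.0
```

The patent's actual parameters and decision rule are not public in this abstract; a real system would use statistically derived norms rather than fixed ranges.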
  • Patent number: 9177551
    Abstract: Disclosed are systems, methods and computer-readable media for enabling speech processing in a user interface of a device. The method includes receiving an indication of a field in a user interface of a device, the indication also signaling that speech will follow; receiving the speech from the user at the device, the speech being associated with the field; transmitting the speech as a request to a public, common network node that receives and processes speech; processing the transmitted speech and returning text associated with the speech to the device; and inserting the text into the field. Upon a second indication from the user, the system processes the text in the field as programmed by the user interface. The present disclosure provides a speech mash-up application for a user interface of a mobile or desktop device that does not require expensive speech processing technologies.
    Type: Grant
    Filed: May 28, 2008
    Date of Patent: November 3, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Jay Wilpon, Giuseppe Di Fabbrizio, Benjamin J. Stern
  • Patent number: 9082400
    Abstract: Techniques for generating a video sequence of a person based on a text sequence are disclosed herein. Based on the received text sequence, a processing device generates the video sequence of a person to simulate visual and audible emotional expressions of the person, including using an audio model of the person's voice to generate an audio portion of the video sequence. The emotional expressions in the visual portion of the video sequence are simulated based on a priori knowledge about the person. For instance, the a priori knowledge can include photos or videos of the person captured in real life.
    Type: Grant
    Filed: May 4, 2012
    Date of Patent: July 14, 2015
    Assignee: SEYYER, INC.
    Inventors: Behrooz Rezvani, Ali Rouhi
  • Publication number: 20150118661
    Abstract: The present disclosure relates to computing technologies for diagnosis and therapy of language-related disorders. Such technologies enable computer-generated diagnosis and computer-generated therapy delivered over a network to at least one computing device. The diagnosis and therapy are customized for each patient through a comprehensive analysis of the patient's production and reception errors, as obtained from the patient over the network, together with a set of correct responses at each phase of evaluation and therapy.
    Type: Application
    Filed: October 29, 2014
    Publication date: April 30, 2015
    Inventors: Pau-San Haruta, Charisse Si-Fei Haruta, Kieran Bing-Fei Haruta
  • Publication number: 20150111183
    Abstract: An information processing apparatus is disclosed having a storage unit storing a plurality of training text items each including a word, a word string, or a sentence. The information processing apparatus presents a training text item among the plurality of training text items stored in the storage unit as voice output or character string display and calculates the speaking speed based on a voice signal that is input after presenting the training text item. The information processing apparatus compares the calculated speaking speed with a preset target speaking speed and reports the comparison result.
    Type: Application
    Filed: December 26, 2014
    Publication date: April 23, 2015
    Applicant: TERUMO KABUSHIKI KAISHA
    Inventors: Miyuki KOYAMA, Toshihide TANAKA, Tadashi SAMESHIMA
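The speaking-speed comparison in this publication can be sketched in a few lines: compute a speed from the length of the training text and the duration of the input voice signal, then report how it compares with the target. The function names, units, and tolerance below are invented for illustration:

```python
# Illustrative sketch (not from the publication) of the speaking-speed
# comparison: speed is units (e.g. morae or words) per minute, compared
# against a preset target with a tolerance band.

def speaking_speed(num_units, duration_sec):
    """Speaking speed in units per minute."""
    return num_units / duration_sec * 60.0

def compare_with_target(speed, target, tolerance=0.1):
    """Report whether the speed is within +/- tolerance of the target."""
    if speed > target * (1 + tolerance):
        return "too fast"
    if speed < target * (1 - tolerance):
        return "too slow"
    return "on target"

speed = speaking_speed(num_units=300, duration_sec=60.0)  # 300 per minute
print(compare_with_target(speed, target=400.0))  # prints "too slow"
```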
  • Publication number: 20150099247
    Abstract: Apparatus for aiding learning by a person comprises a cover or shield (1) for concealing from the person a part of the person's body, and a webcam (6) and a screen (10) for visually displaying to the person, during concealment of the concealed body part, images of a part of the person's body not in direct view of the person. The apparatus may be used in the learning of a skill, such as hand-writing. In another embodiment, the shield is a collar worn to conceal part of the wearer's body, and the webcam and screen display the concealed body part in real time to the wearer. This apparatus can be used in many applications, such as to learn sports activities or to correct body image, posture or movement.
    Type: Application
    Filed: March 15, 2013
    Publication date: April 9, 2015
    Inventor: Jacklyn Bryant
  • Publication number: 20150072321
    Abstract: Interactive electronic training systems and methods are described herein. Certain embodiments provide preprogrammed video, audio, and/or textual presentations of training materials which provide information related to skills/information to be trained. A scenario including real or animated actors is presented, simulating an interaction. The training system presents related queries for the trainee who audibly responds. The training system stores a score based in part on a comparison of the trainee's response with an answer stored in training system memory. Optionally, the scores are substantially immediately presented by the system to the trainee.
    Type: Application
    Filed: April 17, 2014
    Publication date: March 12, 2015
    Applicant: Breakthrough Performance Tech, LLC
    Inventor: Martin L. Cohen
  • Publication number: 20150072322
    Abstract: Systems, methods, and other embodiments associated with producing an immersive training content module (ITCM) are described. One example system includes a capture logic to acquire information from which the ITCM may be produced. An ITCM may include a set of nodes, a set of measures, a logic to control transitions between nodes during a training session, and a logic to establish values for measures during the training sessions. Therefore, the example system may also include an assessment definition logic to define a set of measures to be included in the ITCM and an interaction logic to define a set of interactions to be included in the ITCM. The ITCM may be written to a computer-readable medium.
    Type: Application
    Filed: November 13, 2014
    Publication date: March 12, 2015
    Inventors: Stacy L. WILLIAMS, Marc BUCHNER
  • Publication number: 20150064666
    Abstract: The present disclosure relates to a control terminal, comprising: a data communication unit for receiving a first user voice by data communication with a first audio device and receiving a second user voice by data communication with a second audio device; a turn information generating unit for generating turn information, which is voice unit information, by using the first and second user voices; and a metalanguage processing unit for determining a conversation pattern of the first and second users by using the turn information, and outputting a reminder message corresponding to a reminder event to the first user when the conversation pattern corresponds to a preset reminder event occurrence condition.
    Type: Application
    Filed: October 7, 2013
    Publication date: March 5, 2015
    Applicant: KOREA ADVANCED INSTITUTE OF SCIENCE AND TECHNOLOGY
    Inventors: June Hwa Song, In Seok Hwang, Chung Kuk Yoo, Chan You Hwang, Young Ki Lee, John Dong Jun Kim, Dong Sun Jennifer Yim, Chul Hong Min
  • Patent number: 8972259
    Abstract: A method and system for teaching non-lexical speech effects includes delexicalizing a first speech segment to provide a first prosodic speech signal, and data indicative of the first prosodic speech signal is stored in a computer memory. The first speech segment is audibly played to a language student and the student is prompted to recite the speech segment. The speech uttered by the student in response to the prompt is recorded.
    Type: Grant
    Filed: September 9, 2010
    Date of Patent: March 3, 2015
    Assignee: Rosetta Stone, Ltd.
    Inventors: Joseph Tepperman, Theban Stanley, Kadri Hacioglu
  • Publication number: 20150037765
    Abstract: A system and method for distributing and analyzing a set of tests includes a network, a test system, a manager, and a set of users connected to the network. The method includes the steps of receiving a set of challenges, a set of predetermined responses, and a set of parameters, generating a test message, sending the test message to each user, sending the set of challenges and the set of predetermined answers in response to the test message, receiving a set of audio responses, a set of text responses, a set of video responses, and a set of selected responses from the set of predetermined responses, analyzing the set of audio responses, the set of text responses, the set of video responses, and the set of selected responses, and calculating a set of scores.
    Type: Application
    Filed: August 1, 2014
    Publication date: February 5, 2015
    Applicant: SPEETRA, INC.
    Inventors: Pawan Jaggi, Abhijeet Sangwan
  • Patent number: 8949129
    Abstract: A method and apparatus are provided for processing a set of communicated signals associated with a set of muscles, such as the muscles near the larynx of the person, or any other muscles the person uses to achieve a desired response. The method includes the steps of attaching a single integrated sensor, for example, near the throat of the person proximate to the larynx and detecting an electrical signal through the sensor. The method further includes the steps of extracting features from the detected electrical signal and continuously transforming them into speech sounds without the need for further modulation. The method also includes comparing the extracted features to a set of prototype features and selecting a prototype feature of the set of prototype features providing a smallest relative difference.
    Type: Grant
    Filed: August 12, 2013
    Date of Patent: February 3, 2015
    Assignee: Ambient Corporation
    Inventors: Michael Callahan, Thomas Coleman
  • Publication number: 20150004571
    Abstract: Non-limiting embodiments of systems and apparatuses are provided in which one or more processors are communicatively coupled to a network. The one or more processors are configured to present to a first user a plurality of exemplary recorded presentations, record a presentation by the first user, and present to a second user the recorded presentation by the first user. The one or more processors are also configured to receive feedback from the second user on the recorded presentation by the first user and present the feedback to the first user. Non-limiting embodiments of processor-readable non-transitory storage mediums and methods performed by the systems, apparatuses, and mediums are also provided.
    Type: Application
    Filed: July 1, 2014
    Publication date: January 1, 2015
    Inventors: Paul IRONSIDE, Nathan SMALLEY, Jonathan PALAY
  • Patent number: 8918718
    Abstract: A system for enhancing reading performance operates on a network-connected server with software executing from a non-transitory medium at the server providing an interactive interface for a user connected to the server via a browser link. There is a data repository coupled to the server. The interactive interface provides a word search exercise for the user for improving the user's reading performance, displays a passage comprising a first number of words and a search list with a second number of words that each appear at least once in the passage, the second number smaller than the first number, and when the user clicks on a word in the passage that appears in the search list, that word is indicated in the list as found, until all the words in the search list have been indicated as found.
    Type: Grant
    Filed: February 27, 2012
    Date of Patent: December 23, 2014
    Assignee: John Burgess Reading Performance System
    Inventor: John Burgess
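The word-search bookkeeping described in this entry is simple enough to sketch directly: track which search-list words have been clicked in the passage and report completion. The class name, passage, and search list below are invented for illustration:

```python
# Minimal sketch of the word-search exercise state: a passage, a smaller
# search list of words that appear in it, and a "found" set updated on
# each click.

class WordSearchExercise:
    def __init__(self, passage, search_list):
        self.words = passage.lower().split()
        self.search_list = set(w.lower() for w in search_list)
        self.found = set()

    def click(self, word):
        """Mark a clicked word as found if it appears in the search list."""
        word = word.lower()
        if word in self.search_list and word in self.words:
            self.found.add(word)

    def complete(self):
        """True once every search-list word has been found."""
        return self.found == self.search_list

ex = WordSearchExercise("the quick brown fox jumps", ["quick", "fox"])
ex.click("quick")
ex.click("fox")
print(ex.complete())  # True
```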
  • Publication number: 20140356822
    Abstract: In exemplary implementations of this invention, a display screen and speakers present an audiovisual display of an animated character to a human user during a conversational period of a coaching session. The virtual character asks questions, listens to the user, and engages in mirroring and backchanneling. A camera and microphone gather audiovisual data regarding behavior of the user. After the conversational period, the display screen and speakers display feedback to the user regarding the user's behavior. For example, the feedback may include a plot of the user's smiles over time, or information regarding prosody of the user's speech. The feedback may also include playing a video of the user that was recorded during the conversational period. The feedback may also include a timeline of the human user's behavior. The virtual coaching may be provided over the Internet.
    Type: Application
    Filed: June 3, 2014
    Publication date: December 4, 2014
    Applicant: MASSACHUSETTS INSTITUTE OF TECHNOLOGY
    Inventors: Mohammed Ehasanul Hoque, Rosalind Picard
  • Publication number: 20140342324
    Abstract: A real-time wireless system for recording natural tongue movements in the 3D oral space. By attaching a small magnetic tracer to the tongue, either temporarily or semi-permanently, and placing an array of magnetic sensors around the mouth, the tracer can be localized with sub-millimeter precision. The system can also combine the tracer localization with acoustic, video, and flow data via additional sensors to form a comprehensive audiovisual biofeedback mechanism for diagnosing speech impairments and improving speech therapy. Additionally, the system can record tongue trajectories and create an indexed library of such traces. The indexed library can be used as a tongue tracking silent speech interface. The library can synthesize words, phrases, or execute commands tied to the individual patterns of magnetic field variations or tongue trajectories.
    Type: Application
    Filed: May 19, 2014
    Publication date: November 20, 2014
    Applicant: Georgia Tech Research Corporation
    Inventors: Maysam Ghovanloo, Jacob Block
  • Patent number: 8888494
    Abstract: One or more embodiments present a script to a user in an interactive script environment. A digital representation of a manuscript is analyzed. This digital representation includes a set of roles and a set of information associated with each role in the set of roles. An active role in the set of roles that is associated with a given user is identified based on the analyzing. At least a portion of the manuscript is presented to the given user via a user interface. The portion includes at least a subset of information in the set of information. Information within the set of information that is associated with the active role is presented in a visually different manner than information within the set of information that is associated with a non-active role, which is a role that is associated with a user other than the given user.
    Type: Grant
    Filed: June 27, 2011
    Date of Patent: November 18, 2014
    Inventor: Randall Lee Threewits
  • Patent number: 8851895
    Abstract: A method and system for teaching and practicing articulation of targeted phonemes for use in the treatment of various speech sound disorders including problems with articulation, phonological processes, voice and resonance. A student chooses cards intentionally or randomly from a deck and places them on spaces provided in a story template. The cards include words or images that contain the phoneme to be practiced and which are appropriate for completing the story. The student then reads the story orally or just the words on the cards inserted into the spaces thereby drilling on the phoneme to be practiced. The student also has an opportunity to use learned phonemes in conversational speech in response to questions.
    Type: Grant
    Filed: January 29, 2008
    Date of Patent: October 7, 2014
    Inventor: Elizabeth M. Morrison
  • Publication number: 20140287385
    Abstract: This invention relates to an enunciation device comprising a hollow device shaped in a manner to fit a mouth. This enunciation device assists with improving enunciation and slowing speech, and strengthens mouth, tongue, and jaw muscles so that a user can build confidence while speaking.
    Type: Application
    Filed: November 21, 2012
    Publication date: September 25, 2014
    Inventor: Daniel G. Floyd
  • Publication number: 20140272827
    Abstract: Various of the disclosed embodiments relate to systems and methods for managing a vocal performance. In some embodiments, a central hosting server may maintain a repository of speech text, waveforms, and metadata supplied by a plurality of development team members. The central hosting server may facilitate modification of the metadata and collaborative commentary procedures so that the development team members may generate higher quality voice assets more efficiently.
    Type: Application
    Filed: March 14, 2013
    Publication date: September 18, 2014
    Applicant: TOYTALK, INC.
    Inventors: Oren M. Jacobs, Martin Reddy, Lucas R.A. Ives
  • Publication number: 20140248592
    Abstract: The present invention is directed to methods and devices for teaching the proper configuration of the oral articulators, particularly the tongue, corresponding to particular speech sounds by providing intraoral tactile feedback. Intraoral tactile feedback is achieved by placing nodes in the oral cavity of the patient in locations corresponding to the proper lingual position required to produce a target sound. These nodes facilitate identification of the appropriate lingual position corresponding to a target speech sound by providing tactile differentiation when the target sound is properly produced.
    Type: Application
    Filed: May 9, 2014
    Publication date: September 4, 2014
    Applicant: ARTICULATE TECHNOLOGIES, INC.
    Inventors: David A. PENAKE, Alexey SALAMINI, Gordy ROGERS, Joe WATSON
  • Patent number: 8825492
    Abstract: The language-based video game places a player avatar into a game environment contained within a display field following a story narrative or an adventure for completing an objective. The gameplay reinforces pronunciation and writing of a given language. The display field includes a minor head graphic, as can be highlighted text in the given language, interactive text objects, and can include a control icon and a progress icon. The minor head graphic is a representation of a human head, or portion thereof, animated to show pronunciation of the highlighted text. As the player progresses through the game, the player encounters the interactive text objects that, upon activation, transform into useful objects for overcoming challenges present in the game environment, the interactive text object being the same as, substantially the same as, or corresponding to the highlighted text. Avatar movement and interactions are controlled through a control scheme via an interface.
    Type: Grant
    Filed: October 28, 2013
    Date of Patent: September 2, 2014
    Inventor: Yousef A. E. S. M. Buhadi
  • Publication number: 20140234811
    Abstract: A method of supporting vocabulary and language learning by positioning at least one microphone so as to capture speech in the listening environment of a learner. The microphone is monitored to develop a speech signal. The speech signal is analyzed to determine at least one characteristic of the speech or vocalization, wherein the characteristic indicates a qualitative or quantitative feature of the speech. The determined characteristic is compared to a preselected standard or such characteristic is tracked to show growth over time and the comparison or growth is reported to the person associated with the speech signal or person who potentially can affect the language environment of the learner.
    Type: Application
    Filed: April 28, 2014
    Publication date: August 21, 2014
    Applicant: LENA Foundation
    Inventor: Terrance D. Paul
  • Patent number: 8805673
    Abstract: Systems and methods include transmitting from a server to a client device a list of common phrases of a language and voice recordings associated with each of the phrases, wherein the voice recordings provide region-specific pronunciations of the phrases. Users at the client device can search over a communication network for common phrases and listen to how certain phrases of a language are spoken in different regions of the world. The users at the client device can also upload to the server their own voice recordings of phrases in their own region-specific pronunciations. Using the present systems and methods, the users can familiarize themselves with how a particular language, such as English, is spoken in different regions of the world prior to their international travel or business meeting.
    Type: Grant
    Filed: July 14, 2011
    Date of Patent: August 12, 2014
    Assignee: GlobalEnglish Corporation
    Inventor: Sam Neff
  • Publication number: 20140220520
    Abstract: An intraoral method, biofeedback system and kit are provided for supplying intraoral feedback representative of a speaker's pronunciation during sound production, which feedback may be used for training and enhancing a speaker's pronunciation accuracy.
    Type: Application
    Filed: September 7, 2012
    Publication date: August 7, 2014
    Applicant: Articulate Technologies Inc.
    Inventors: Alexey Salamini, Adrienne E. Penake, David A. Penake, Gordy T. Rogers
  • Patent number: 8798986
    Abstract: Portable, real time voice translation systems, and associated methods of use, are provided. The systems include a translation system for use on a single unit, portable computing device and operable for accessing a multilanguage database, selecting a source language from a plurality of source languages and a destination language from a plurality of destination languages, inputting a source phrase, transmitting the source phrase to a speech recognition module, a translation engine, and a template look-up engine for finding the phrase template associated with the destination phrase from among the multiple languages. The spoken translation is then output in the selected destination language. The translation system has a total time between the input of the source phrase and output of the destination phrase that is no slower than 0.010 seconds, and a communications interface operable for communicating with a second computer system.
    Type: Grant
    Filed: December 26, 2012
    Date of Patent: August 5, 2014
    Assignee: NewTalk, Inc.
    Inventors: Robert H. Clemons, Bruce W. Nash, Martha P. Robinson, Craig A. Robinson
  • Patent number: 8775363
    Abstract: A system and method of processing one or more sensor logs includes receiving a sensor log and identifying a set of entries in the sensor log having a predefined sequence of sensor identifiers. The set of entries may define a velocity event. The method can also provide for calculating an in-home gait velocity for the velocity event. In one example, the method also provides for identifying another set of entries in the sensor log having a sensor identifier that corresponds to a dwell sensor mounted in a doorway, wherein the other set of entries define a dwell event. The method may also provide for calculating an in-home dwell time for the dwell event.
    Type: Grant
    Filed: December 21, 2009
    Date of Patent: July 8, 2014
    Assignee: Intel-GE Care Innovations LLC
    Inventors: Barry R. Greene, Adrian Burns
  • Patent number: 8775184
    Abstract: Techniques for evaluating one or more spoken language skills of a speaker are provided. The techniques include identifying one or more temporal locations of interest in a speech passage spoken by a speaker, computing one or more acoustic parameters, wherein the one or more acoustic parameters capture one or more properties of one or more acoustic-phonetic features of the one or more locations of interest, and combining the one or more acoustic parameters with an output of an automatic speech recognizer to modify an output of a spoken language skill evaluation.
    Type: Grant
    Filed: January 16, 2009
    Date of Patent: July 8, 2014
    Assignee: International Business Machines Corporation
    Inventors: Om D. Deshmukh, Ashish Verma
  • Patent number: 8768697
    Abstract: In some embodiments, a method includes measuring a disparity between two speech samples by segmenting both a reference speech sample and a student speech sample into speech units. A duration disparity can be determined for units that are not adjacent to each other in the reference speech sample. A duration disparity can also be determined for the corresponding units in the student speech sample. A difference can then be calculated between the student speech sample duration disparity and the reference speech sample duration disparity.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: July 1, 2014
    Assignee: Rosetta Stone, Ltd.
    Inventors: Joseph Tepperman, Theban Stanley, Kadri Hacioglu
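The disparity measure in this abstract is concrete: take the duration difference between two non-adjacent units in the reference sample, take the same difference for the corresponding units in the student sample, and subtract. A short Python sketch with invented unit durations (in milliseconds, to keep the arithmetic exact):

```python
# Hedged sketch of the duration-disparity difference described above.
# Unit indices and durations are illustrative, not from the patent.

def duration_disparity(durations, i, j):
    """Duration disparity between units i and j of one speech sample."""
    return durations[i] - durations[j]

def disparity_difference(ref_durations, student_durations, i, j):
    """Difference between the student's and the reference's duration
    disparity for the same (non-adjacent) pair of units."""
    return (duration_disparity(student_durations, i, j)
            - duration_disparity(ref_durations, i, j))

ref = [120, 300, 80, 250]      # unit durations in ms (reference speaker)
student = [100, 280, 200, 220]  # corresponding units (student)
print(disparity_difference(ref, student, 0, 2))  # (100-200) - (120-80) = -140
```

A value near zero would suggest the student's relative timing for that pair of units matches the reference speaker's.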
  • Patent number: 8740622
    Abstract: The present invention is directed to methods and devices for teaching the proper configuration of the oral articulators, particularly the tongue, corresponding to particular speech sounds by providing intraoral tactile feedback. Intraoral tactile feedback is achieved by placing nodes in the oral cavity of the patient in locations corresponding to the proper lingual position required to produce a target sound. These nodes facilitate identification of the appropriate lingual position corresponding to a target speech sound by providing tactile differentiation when the target sound is properly produced.
    Type: Grant
    Filed: January 21, 2009
    Date of Patent: June 3, 2014
    Assignee: Articulate Technologies, Inc.
    Inventors: David A. Penake, Alexey Salamini, Gordy Rogers, Joe Watson
  • Patent number: 8744856
    Abstract: A computer implemented method, system and computer program product for evaluating pronunciation. Known phonemes are stored in a computer memory. A spoken utterance corresponding to a target utterance, comprised of a sequence of target phonemes, is received and stored in a computer memory. The spoken utterance is segmented into a sequence of spoken phonemes, each corresponding to a target phoneme. For each spoken phoneme, a relative posterior probability is calculated that the spoken phoneme is the corresponding target phoneme. If the calculated probability is greater than a first threshold, an indication that the target phoneme was pronounced correctly is output; if less than a first threshold, an indication that the target phoneme was pronounced incorrectly is output. If the probability is less than a first threshold and greater than a second threshold, an indication that pronunciation of the target phoneme was acceptable is output.
    Type: Grant
    Filed: February 21, 2012
    Date of Patent: June 3, 2014
    Assignee: Carnegie Speech Company
    Inventor: Mosur K. Ravishankar
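Reading the abstract's three outcomes as a two-threshold decision on the relative posterior probability gives a very small classifier. The threshold values and probabilities below are invented; the abstract does not specify them:

```python
# Sketch of the two-threshold phoneme-pronunciation decision: above the
# upper threshold -> correct; between the thresholds -> acceptable;
# below the lower threshold -> incorrect.

def rate_phoneme(posterior, upper=0.7, lower=0.4):
    """Classify a spoken phoneme from its relative posterior probability
    of being the corresponding target phoneme."""
    if posterior > upper:
        return "correct"
    if posterior > lower:
        return "acceptable"
    return "incorrect"

print(rate_phoneme(0.85))  # correct
print(rate_phoneme(0.55))  # acceptable
print(rate_phoneme(0.20))  # incorrect
```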
  • Patent number: 8744847
    Abstract: Certain aspects and embodiments of the present invention are directed to systems and methods for monitoring and analyzing the language environment and the development of a key child. A key child's language environment and language development can be monitored without placing artificial limitations on the key child's activities or requiring a third party observer. The language environment can be analyzed to identify phones or speech sounds spoken by the key child, independent of content. The number and type of phones is analyzed to automatically assess the key child's expressive language development. The assessment can result in a standard score, an estimated developmental age, or an estimated mean length of utterance.
    Type: Grant
    Filed: April 25, 2008
    Date of Patent: June 3, 2014
    Assignee: LENA Foundation
    Inventors: Terrance Paul, Dongxin Xu, Jeffrey A. Richards
  • Publication number: 20140141393
    Abstract: This invention relates to an enunciation device comprising a hollow device shaped in a manner to fit a mouth. This enunciation device assists with improving enunciation and slowing speech, and strengthens mouth, tongue, and jaw muscles so that a user can build confidence while speaking.
    Type: Application
    Filed: November 21, 2012
    Publication date: May 22, 2014
    Inventor: Daniel G. Floyd
  • Patent number: 8719035
    Abstract: Techniques are disclosed for recognizing user personality in accordance with a speech recognition system. For example, a technique for recognizing a personality trait associated with a user interacting with a speech recognition system includes the following steps/operations. One or more decoded spoken utterances of the user are obtained. The one or more decoded spoken utterances are generated by the speech recognition system. The one or more decoded spoken utterances are analyzed to determine one or more linguistic attributes (morphological and syntactic filters) that are associated with the one or more decoded spoken utterances. The personality trait associated with the user is then determined based on the analyzing step/operation.
    Type: Grant
    Filed: March 26, 2008
    Date of Patent: May 6, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Osamuyimen Thompson Stewart, Liwei Dai
  • Patent number: 8719019
    Abstract: Speaker identification techniques are described. In one or more implementations, sample data is received at a computing device of one or more user utterances captured using a microphone. The sample data is processed by the computing device to identify a speaker of the one or more user utterances. The processing involves use of a feature set that includes features obtained using a filterbank having filters spaced linearly at higher frequencies and logarithmically at lower frequencies, features that model the speaker's vocal tract transfer function, and features that indicate a vibration rate of the vocal folds of the speaker of the sample data.
    Type: Grant
    Filed: April 25, 2011
    Date of Patent: May 6, 2014
    Assignee: Microsoft Corporation
    Inventors: Hoang T. Do, Ivan J. Tashev, Alejandro Acero, Jason S. Flaks, Robert N. Heitkamp, Molly R. Suver
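The last feature the abstract mentions, the vibration rate of the vocal folds, is classically estimated by autocorrelation peak picking. A minimal sketch of that idea (not the patented implementation; the function name and search band are invented for illustration):

```python
import math

def estimate_pitch(samples, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate vocal-fold vibration rate (F0) by finding the lag at which
    the signal's autocorrelation peaks, within a plausible speech range."""
    n = len(samples)
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    best_lag, best_corr = 0, 0.0
    for lag in range(lag_min, min(lag_max, n - 1)):
        corr = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return sample_rate / best_lag if best_lag else 0.0

# A pure 200 Hz sinusoid should come back as roughly 200 Hz.
sr = 8000
tone = [math.sin(2 * math.pi * 200 * t / sr) for t in range(800)]
f0 = estimate_pitch(tone, sr)
```

Real systems refine this with windowing and normalization, but the lag-search core is the same.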
  • Patent number: 8712773
    Abstract: The present invention relates to a method for modeling common-language speech recognition, by a computer, under the influence of multiple dialects, and concerns the technical field of speech recognition by a computer. In this method, a triphone standard common-language model is first generated based on training data of the standard common language, and first and second monophone dialectal-accented common-language models are generated based on development data of dialectal-accented common languages of a first kind and a second kind, respectively. Then a temporary merged model is obtained by merging the first dialectal-accented common-language model into the standard common-language model according to a first confusion matrix, obtained by recognizing the development data of the first dialectal-accented common language using the standard common-language model.
    Type: Grant
    Filed: October 29, 2009
    Date of Patent: April 29, 2014
    Assignees: Sony Computer Entertainment Inc., Tsinghua University
    Inventors: Fang Zheng, Xi Xiao, Linquan Liu, Zhan You, Wenxiao Cao, Makoto Akabane, Ruxin Chen, Yoshikazu Takahashi
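The confusion matrix driving the merge step is built by tallying which phone the standard-language model recognizes for each reference phone in the dialectal development data. A minimal sketch of that tally (illustrative only; the aligned phone pairs and function name are hypothetical):

```python
from collections import Counter, defaultdict

def confusion_matrix(ref_phones, hyp_phones):
    """Count how often each reference phone is recognized as each
    hypothesis phone, then normalize rows into confusion probabilities."""
    counts = defaultdict(Counter)
    for ref, hyp in zip(ref_phones, hyp_phones):
        counts[ref][hyp] += 1
    probs = {}
    for ref, row in counts.items():
        total = sum(row.values())
        probs[ref] = {hyp: c / total for hyp, c in row.items()}
    return probs

# Hypothetical aligned pairs: a dialect merges "sh" toward "s" one time in three.
ref = ["sh", "sh", "sh", "s", "s"]
hyp = ["sh", "s",  "sh", "s", "s"]
cm = confusion_matrix(ref, hyp)
```

High off-diagonal entries identify which dialectal phones should be merged into which standard-model phones.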
  • Patent number: 8660850
    Abstract: A method for editing timed and annotated data includes acquiring a multimedia data stream; performing a decoding operation upon the multimedia data stream, wherein the decoded data stream comprises a textual data stream; synchronizing the multimedia data stream and the decoded data stream by performing a time stamping operation upon the data streams; editing the decoded data stream; and realigning the time stamp data of the edited decoded data stream in order to synchronize the edited decoded data with the multimedia data stream.
    Type: Grant
    Filed: February 13, 2012
    Date of Patent: February 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: Alexander Faisman, Grabarnik Genady, Dimitri Kanevsky, Larisa Shwartz
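The realignment step in this abstract amounts to shifting the time stamps of everything after an edit point by the duration the edit added or removed. A minimal sketch under that assumption (data layout and names are invented for illustration):

```python
def realign(words, edit_start, duration_delta):
    """Shift the time stamp of every word at or after edit_start by
    duration_delta, keeping the edited text in sync with the media stream."""
    return [
        (text, start + duration_delta if start >= edit_start else start)
        for text, start in words
    ]

# Words as (text, start_time_seconds); a 0.5 s cut was made at t = 2.0.
words = [("hello", 0.0), ("edited", 2.0), ("world", 3.1)]
adjusted = realign(words, edit_start=2.0, duration_delta=-0.5)
```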
  • Publication number: 20140051042
    Abstract: According to one embodiment, a speech learning apparatus includes a detection unit, a first calculation unit, a generation unit, an addition unit and a speech synthesis unit. The first calculation unit calculates a score indicating a degree of emphasis of a keyword based on a type of the marker and a manner of selecting the keyword. The generation unit generates a synthesis parameter to determine a degree of reading of the keyword in accordance with the score and the type of the marker. The addition unit adds to the keyword a tag for reading the keyword in accordance with the synthesis parameter. The speech synthesis unit generates synthesized speech obtained by synthesizing speech of the keyword in accordance with the tag.
    Type: Application
    Filed: August 14, 2013
    Publication date: February 20, 2014
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kouichirou MORI, Masahiro MORITA
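The tag-addition step described above can be pictured as mapping the emphasis score to prosody settings in an SSML-style markup tag. A hedged sketch only: the tag format, thresholds, and attribute values below are invented, not taken from the patent publication:

```python
def emphasis_tag(keyword, score):
    """Wrap a keyword in a hypothetical prosody tag whose reading style
    scales with the emphasis score (0.0 - 1.0)."""
    if score >= 0.8:
        rate, volume = "slow", "loud"      # strongly emphasized keyword
    elif score >= 0.4:
        rate, volume = "medium", "medium"  # mild emphasis
    else:
        rate, volume = "fast", "soft"      # de-emphasized
    return f'<prosody rate="{rate}" volume="{volume}">{keyword}</prosody>'

tag = emphasis_tag("deadline", 0.9)
```

The speech synthesis unit would then read the tagged keyword according to these attributes.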
  • Patent number: 8632341
    Abstract: A product and method for providing a read along educational activity to a user are provided. The product is a single data storage medium containing both book information and associated, coordinated audio narration information. The book information comprises visual images of a book, including pages of a book, moving images of the pages being turned, text on the pages of the book, and pictures or illustrations related to the book or story. The user views the book information on a video screen, as if it were an actual book, while listening to the associated and coordinated audio narration. The product also comprises supplemental features to enhance the read along activity and assist in learning.
    Type: Grant
    Filed: February 21, 2003
    Date of Patent: January 21, 2014
    Assignee: Disney Enterprises, Inc.
    Inventor: Luigi-Theo Calabrese
  • Publication number: 20140004486
    Abstract: Devices, systems, and methods for enriching communications may include communications circuitry configured to process one or more verbal communications signals being transmitted between a computing device and a remote computing device, the one or more verbal communications signals relating to a conversation between a user of the computing device and a user of the remote computing device, a conversation dynamics engine configured to generate at least one suggested conversation topic by analyzing the one or more verbal communications signals, and a display configured to present the at least one suggested conversation topic to the user of the computing device.
    Type: Application
    Filed: June 27, 2012
    Publication date: January 2, 2014
    Inventors: Richard P. Crawford, Margaret E. Morris, Muki Hasteen-Izora, Nancy Vuckovic
  • Patent number: 8596640
    Abstract: A method of play, wherein play emphasizes storytelling and story recounting abilities, and wherein the method comprises: (a) providing a gaming device which provides for: a plurality of subject elements, wherein each subject element from the plurality of subject elements comprises a topic for a particular story; and a plurality of situational elements, wherein each situational element from the plurality of situational elements qualifies the subject element; (b) pairing a subject element from the plurality of subject elements with a situational element from the plurality of situational elements; and (c) providing a story based on the paired subject element and selected situational element.
    Type: Grant
    Filed: October 31, 2012
    Date of Patent: December 3, 2013
    Inventor: Jacob G. R. Kramlich
  • Patent number: 8570439
    Abstract: A broadcasting processing apparatus and a control method thereof are provided. The apparatus includes a signal receiver which receives an image signal having image content; a user selection unit which selects movie content from the image content; a storage unit; and a controller which determines whether a received image signal is a film image signal if the movie content is selected through the user selection unit, and stores the image signal in the storage unit if the image signal corresponds to the film image signal.
    Type: Grant
    Filed: February 7, 2008
    Date of Patent: October 29, 2013
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: An-na Lee
  • Patent number: 8521529
    Abstract: An input signal is converted to a feature-space representation. The feature-space representation is projected onto a discriminant subspace using a linear discriminant analysis transform to enhance the separation of feature clusters. Dynamic programming is used to find global changes to derive optimal cluster boundaries. The cluster boundaries are used to identify the segments of the audio signal.
    Type: Grant
    Filed: April 18, 2005
    Date of Patent: August 27, 2013
    Assignee: Creative Technology Ltd
    Inventors: Michael M. Goodwin, Jean Laroche
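The dynamic-programming step this abstract refers to can be illustrated by the classic formulation: given features projected onto a discriminant axis, choose segment boundaries that minimize total within-segment scatter. A small sketch of that formulation (an illustration of the general technique, not the patented method):

```python
def segment_cost(xs):
    """Within-segment scatter: sum of squared deviations from the mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def best_boundaries(xs, k):
    """Dynamic programming over all splits of xs into k contiguous
    segments, minimizing total within-segment scatter; returns the
    k - 1 boundary indices."""
    n = len(xs)
    INF = float("inf")
    # cost[j][i]: best cost of splitting xs[:i] into j segments.
    cost = [[INF] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for p in range(j - 1, i):
                c = cost[j - 1][p] + segment_cost(xs[p:i])
                if c < cost[j][i]:
                    cost[j][i], back[j][i] = c, p
    # Recover boundary indices by backtracking.
    bounds, i = [], n
    for j in range(k, 0, -1):
        i = back[j][i]
        bounds.append(i)
    return sorted(bounds[:-1])

# Two well-separated feature clusters along one discriminant axis.
xs = [0.1, 0.2, 0.0, 5.1, 4.9, 5.0]
bounds = best_boundaries(xs, 2)
```

The recovered boundaries then delimit the segments of the audio signal.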
  • Patent number: 8489397
    Abstract: A machine-readable medium and a network device are provided for speech-to-text translation. Speech packets are received at a broadband telephony interface and stored in a buffer. The speech packets are processed and textual representations thereof are displayed as words on a display device. Speech processing is activated and deactivated in response to a command from a subscriber.
    Type: Grant
    Filed: September 11, 2012
    Date of Patent: July 16, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Charles David Caldwell, John Bruce Harlow, Robert J. Sayko, Norman Shaye
  • Patent number: 8480401
    Abstract: Acoustical voice-feedback systems include headsets connected through a connector or central hub for vocalizing and sound formation assistance. A user's voice is conveyed directly into his or her own ear as well as the ears of each connected user. A central hub allows selective connection of one or more headsets to each other, or allows an instructor to selectively connect an instructor headset to one or more instructee headsets.
    Type: Grant
    Filed: March 2, 2010
    Date of Patent: July 9, 2013
    Assignee: Harebrain, Inc.
    Inventors: Steven Swain, Jeffrey Waffensmith
  • Patent number: 8447604
    Abstract: Provided in some embodiments is a method including receiving ordered script words that are indicative of dialogue words to be spoken, receiving audio data corresponding to at least a portion of the dialogue words to be spoken and including timecodes associated with dialogue words, generating a matrix of the ordered script words versus the dialogue words, aligning the matrix to determine hard alignment points that match consecutive sequences of ordered script words with corresponding sequences of dialogue words, partitioning the matrix of ordered script words into sub-matrices bounded by adjacent hard-alignment points and including the corresponding sub-sets of the script and dialogue words between the hard-alignment points, and aligning each of the sub-matrices.
    Type: Grant
    Filed: May 28, 2010
    Date of Patent: May 21, 2013
    Assignee: Adobe Systems Incorporated
    Inventor: Walter W. Chang
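The "hard alignment points" idea above — anchoring on runs of consecutive words that match exactly between script and recognized dialogue, then aligning the gaps between anchors separately — can be sketched with the standard library's sequence matcher. A minimal illustration only, not the patented algorithm; the function name and `min_run` threshold are invented:

```python
from difflib import SequenceMatcher

def hard_alignment_points(script_words, dialogue_words, min_run=2):
    """Find matching consecutive word runs ("hard alignment points")
    between the script and the recognized dialogue; sub-ranges between
    anchors can then be aligned independently."""
    sm = SequenceMatcher(a=script_words, b=dialogue_words, autojunk=False)
    return [(m.a, m.b, m.size) for m in sm.get_matching_blocks()
            if m.size >= min_run]

script = "to be or not to be that is the question".split()
dialogue = "to be or not to be um that is question".split()
anchors = hard_alignment_points(script, dialogue)
```

Here the six-word opening run anchors both sequences, and the disfluency "um" falls into a gap to be aligned separately.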
  • Patent number: 8374876
    Abstract: A system and a method for speech generation which assist the speech of those with a disability or a medical condition such as cerebral palsy, motor neurone disease or a dysarthria following a stroke. The system has a user interface having a multiplicity of states each of which correspond to a sound and a selector for making a selection of a state or a combination of states. The system also has a processor for processing the selected state or combination of states and an audio output for outputting the sound or combination of sounds. The sounds associated with the states can be phonemes or phonics and the user interface is typically a manually operable device such as a mouse, trackball, joystick or other device that allows a user to distinguish between states by manipulating the interface to a number of positions.
    Type: Grant
    Filed: February 1, 2007
    Date of Patent: February 12, 2013
    Assignee: The University of Dundee
    Inventors: Rolf Black, Annula Waller, Eric Abel, Iain Murray, Graham Pullin
  • Patent number: 8364466
    Abstract: The teachings described herein generally relate to a multilingual electronic translation of a source phrase to a destination language selected from multiple languages, and this can be accomplished through the use of a network environment. The electronic translation can occur as a spoken translation, can be in real-time, and can mimic the voice of the user of the system.
    Type: Grant
    Filed: June 16, 2012
    Date of Patent: January 29, 2013
    Assignee: NewTalk, Inc.
    Inventors: Bruce W. Nash, Craig A. Robinson, Martha P. Robinson, Robert H. Clemons
  • Publication number: 20120322035
    Abstract: Speech data from the operation of a speech recognition application is recorded over the course of one or more language learning sessions. The operation of the speech recognition application during each language learning session corresponds to a user speaking, and the speech recognition application generating text data. The text data may be a recognition of what the user spoke. The speech data may comprise the text data, and confidence values that are an indication of an accuracy of the recognition. The speech data from each language learning session may be analyzed to determine an overall performance level of the user.
    Type: Application
    Filed: August 28, 2012
    Publication date: December 20, 2012
    Inventors: Luc Julia, Jerome Dubreuil, Jehen Bing
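The analysis this abstract describes — averaging recognizer confidence values across sessions into an overall performance level — can be sketched in a few lines. The data layout and function name below are invented for illustration:

```python
def session_performance(sessions):
    """Average recognition confidence per session, plus an overall level
    across all sessions."""
    per_session = [sum(conf for _, conf in words) / len(words)
                   for words in sessions]
    overall = sum(per_session) / len(per_session)
    return per_session, overall

# (recognized word, recognizer confidence) pairs for two practice sessions.
sessions = [
    [("hello", 0.9), ("world", 0.7)],
    [("good", 0.6), ("morning", 0.8)],
]
per_session, overall = session_performance(sessions)
```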