Time Element Patents (Class 704/267)
  • Patent number: 11848002
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech synthesis. The methods, systems, and apparatus include actions of obtaining an audio representation of speech of a target speaker, obtaining input text for which speech is to be synthesized in a voice of the target speaker, generating a speaker vector by providing the audio representation to a speaker encoder engine that is trained to distinguish speakers from one another, generating an audio representation of the input text spoken in the voice of the target speaker by providing the input text and the speaker vector to a spectrogram generation engine that is trained using voices of reference speakers to generate audio representations, and providing the audio representation of the input text spoken in the voice of the target speaker for output.
    Type: Grant
    Filed: July 19, 2022
    Date of Patent: December 19, 2023
    Assignee: Google LLC
    Inventors: Ye Jia, Zhifeng Chen, Yonghui Wu, Jonathan Shen, Ruoming Pang, Ron J. Weiss, Ignacio Lopez Moreno, Fei Ren, Yu Zhang, Quan Wang, Patrick An Phu Nguyen
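As an illustration of the pipeline described in the abstract for patent 11848002 (speaker encoder → speaker vector → conditioned spectrogram generation), here is a minimal sketch; the function names, the 80-dimensional frames, and the toy models are assumptions, not the patented implementation.

```python
# Hypothetical sketch of a speaker-conditioned synthesis flow (illustrative only).
import numpy as np

def speaker_encoder(reference_audio: np.ndarray) -> np.ndarray:
    """Stand-in for a speaker encoder trained to distinguish speakers: maps audio
    to a fixed-size speaker embedding. Here it just pools toy spectral frames."""
    frames = reference_audio.reshape(-1, 80)   # pretend these are 80-dim mel frames
    return frames.mean(axis=0)                 # d-vector-like speaker embedding

def spectrogram_generator(text: str, speaker_vec: np.ndarray) -> np.ndarray:
    """Stand-in for a spectrogram generation engine conditioned on the speaker
    embedding (one toy frame per input character)."""
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    base = rng.normal(size=(len(text), speaker_vec.size))
    return base + speaker_vec                  # condition frames on speaker identity

reference = np.random.randn(400 * 80)                   # audio of the target speaker
embedding = speaker_encoder(reference)
mel = spectrogram_generator("hello world", embedding)   # spectrogram in the target voice
print(mel.shape)
```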
  • Patent number: 11741942
    Abstract: A method, computer program product, and computer system for text-to-speech synthesis is disclosed. Synthetic speech data for an input text may be generated. The synthetic speech data may be compared to recorded reference speech data corresponding to the input text. Based on, at least in part, the comparison of the synthetic speech data to the recorded reference speech data, at least one feature indicative of at least one difference between the synthetic speech data and the recorded reference speech data may be extracted. A speech gap filling model may be generated based on, at least in part, the at least one feature extracted. A speech output may be generated based on, at least in part, the speech gap filling model.
    Type: Grant
    Filed: August 3, 2022
    Date of Patent: August 29, 2023
    Assignee: Telepathy Labs, Inc.
    Inventors: Piero Perucci, Martin Reber, Vijeta Avijeet
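A minimal sketch of the comparison step described for patent 11741942: extract a per-band difference feature between synthetic speech and recorded reference speech for the same text. The mel-frame representation, truncation-based alignment, and the mean-difference feature are assumptions.

```python
# Illustrative comparison of synthetic vs. reference speech features (assumed layout).
import numpy as np

def difference_features(synthetic_mel: np.ndarray, reference_mel: np.ndarray) -> np.ndarray:
    """Align to the shorter utterance and return the mean per-band deviation."""
    n = min(len(synthetic_mel), len(reference_mel))
    return (reference_mel[:n] - synthetic_mel[:n]).mean(axis=0)

synthetic = np.random.randn(120, 80)   # frames x mel bands for the synthesized text
reference = np.random.randn(118, 80)   # recorded reference speech for the same text
gap_features = difference_features(synthetic, reference)  # could drive a gap filling model
print(gap_features.shape)              # (80,)
```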
  • Patent number: 11302301
    Abstract: A method, computer program, and computer system are provided for synthesizing speech at one or more speeds. A context associated with one or more phonemes corresponding to a speaking voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a voice sample corresponding to the speaking voice is synthesized using the generated mel-spectrogram features.
    Type: Grant
    Filed: March 3, 2020
    Date of Patent: April 12, 2022
    Assignee: TENCENT AMERICA LLC
    Inventors: Chengzhu Yu, Dong Yu
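The abstract for patent 11302301 covers synthesis at one or more speeds by aligning phonemes to target acoustic frames. A hedged sketch of duration-based alignment at a chosen speaking rate follows; the per-phoneme durations and the simple scaling rule are assumptions (a real system would predict durations with a trained model).

```python
# Toy duration-scaled phoneme-to-frame alignment (illustrative names and values).
import numpy as np

def align_phonemes(phoneme_ids, durations_frames, speed=1.0):
    """Repeat each encoded phoneme for its duration, scaled by the speed factor."""
    aligned = []
    for pid, dur in zip(phoneme_ids, durations_frames):
        aligned.extend([pid] * max(1, round(dur / speed)))
    return np.array(aligned)

frames = align_phonemes([3, 17, 9], [5, 8, 6], speed=1.5)  # faster speech -> fewer frames
print(len(frames))
```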
  • Patent number: 11243254
    Abstract: A method for operating a test apparatus including a plurality of shared resources is shown, wherein the plurality of shared resources can be used in different instruments. The method includes blocking a first set of resource blockers when a first instrument, which requires a first subset of the shared resources, is to be executed. Furthermore, the method tries to block a second set of resource blockers when a second instrument, which requires a second subset of the shared resources, is to be executed. The first set of resource blockers is different from the second set of resource blockers, and a plurality of resource blockers is assigned to a shared resource that is involved both in a conflicting combination of instruments and in a non-conflicting combination of instruments.
    Type: Grant
    Filed: September 27, 2017
    Date of Patent: February 8, 2022
    Assignee: ADVANTEST CORPORATION
    Inventor: Wolfgang Horn
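A toy sketch of the blocker idea in patent 11243254: a shared resource exposes several blockers, and each instrument acquires only the blocker set assigned to it, so a non-conflicting combination of instruments does not contend. The class and lock-based realization are assumptions.

```python
# Illustrative resource-blocker scheme using plain locks (not the patented design).
import threading

class SharedResource:
    def __init__(self, n_blockers: int):
        # Several blockers can be assigned to one shared resource.
        self.blockers = [threading.Lock() for _ in range(n_blockers)]

resource = SharedResource(n_blockers=2)
instrument_a_blockers = [resource.blockers[0]]   # blocker set for a conflicting combination
instrument_b_blockers = [resource.blockers[1]]   # different set, non-conflicting path

def run_instrument(blockers, name):
    for b in blockers:
        b.acquire()
    try:
        print(f"{name} using shared resource")
    finally:
        for b in blockers:
            b.release()

run_instrument(instrument_a_blockers, "instrument A")
run_instrument(instrument_b_blockers, "instrument B")
```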
  • Patent number: 10832031
    Abstract: A first set of signals corresponding to a first signal modality (such as the direction of a gaze) during a time interval is collected from an individual. A second set of signals corresponding to a different signal modality (such as hand-pointing gestures made by the individual) is also collected. In response to a command, where the command does not identify a particular object to which the command is directed, the first and second sets of signals are used to identify candidate objects of interest, and an operation associated with a selected object from the candidates is performed.
    Type: Grant
    Filed: August 14, 2017
    Date of Patent: November 10, 2020
    Assignee: Apple Inc.
    Inventors: Wolf Kienzle, Douglas A. Bowman
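A hedged sketch of the multimodal fusion described for patent 10832031: gaze direction and a pointing-gesture direction are combined to rank candidate objects for an ambiguous command. The dot-product scoring and the equal weighting are assumptions.

```python
# Illustrative fusion of two signal modalities to rank candidate objects.
import numpy as np

def rank_candidates(object_positions, gaze_dir, point_dir, weight=0.5):
    """Score each object by how well it lines up with both direction vectors."""
    scores = []
    for pos in object_positions:
        unit = pos / np.linalg.norm(pos)
        scores.append(weight * unit @ gaze_dir + (1 - weight) * unit @ point_dir)
    return int(np.argmax(scores))

objects = [np.array([1.0, 0.2, 0.0]), np.array([0.0, 1.0, 0.5])]
gaze = np.array([0.9, 0.1, 0.0])
gaze /= np.linalg.norm(gaze)
point = np.array([1.0, 0.0, 0.1])
point /= np.linalg.norm(point)
print(rank_candidates(objects, gaze, point))   # index of the most likely target object
```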
  • Patent number: 10810313
    Abstract: A system and method for preserving the privacy of data while processing the data in a cloud. The system comprises a computer program application and a client encryption key. The system is operable to encrypt the computer program application and data using the client encryption key; upload the encrypted computer program application and encrypted data to the cloud; enable the computer platform to undertake processing of the encrypted data in the cloud using the encrypted computer program application; output encrypted processing results; and enable decryption of the encrypted processing results using the client encryption key.
    Type: Grant
    Filed: October 3, 2016
    Date of Patent: October 20, 2020
    Inventors: Nigel Henry Cannings, Gerard Chollet, Cornelius Glackin, Muttukrishnan Rajarajan
  • Patent number: 10685644
    Abstract: There is disclosed a method of generating a text-to-speech (TTS) training set for training a Machine Learning Algorithm (MLA) for generating machine-spoken utterances. The method is executable by a server. The method includes generating a synthetic word by merging separate phonemes from each of two words of a corpus of pre-recorded utterances, the merging being done using a phoneme common to the two words as a merging anchor and resulting in at least two synthetic words. The synthetic words and assessor labels are used to train a classifier to predict a quality parameter associated with a new synthetic phonemes-based word, the quality parameter being representative of whether the new synthetic phonemes-based word sounds natural (based on acoustic features of the generated synthetic word utterances). The classifier is then used to generate training objects for the MLA, and the MLA is used to process the corpus of pre-recorded utterances into their respective vectors.
    Type: Grant
    Filed: July 4, 2018
    Date of Patent: June 16, 2020
    Assignee: YANDEX EUROPE AG
    Inventors: Vladimir Vladimirovich Kirichenko, Petr Vladislavovich Luferenko
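A sketch of the merge described for patent 10685644: two words that share a phoneme are spliced at that common phoneme (the merging anchor) to produce two synthetic words. Here string phoneme labels stand in for recorded phoneme audio; the helper name is illustrative.

```python
# Illustrative splice of two phoneme sequences at a shared anchor phoneme.
def merge_at_anchor(word_a, word_b, anchor):
    """Return the two synthetic words formed by swapping halves at the shared phoneme."""
    ia, ib = word_a.index(anchor), word_b.index(anchor)
    return word_a[:ia] + word_b[ib:], word_b[:ib] + word_a[ia:]

cat = ["k", "ae", "t"]
tap = ["t", "ae", "p"]
print(merge_at_anchor(cat, tap, "ae"))   # (['k', 'ae', 'p'], ['t', 'ae', 't'])
```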
  • Patent number: 10650810
    Abstract: Systems and methods of determining phonetic relationships are provided. For instance, data indicative of an input text phrase input by a user can be received. An audio output corresponding to a spoken rendering of the input text phrase can be determined. A text transcription of the audio output of the input text phrase can be determined. The text transcription can be a textual representation of the audio output. The text transcription can be compared against a plurality of test phrases to identify a match between the text transcription and at least one test phrase.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: May 12, 2020
    Assignee: GOOGLE LLC
    Inventors: Nikhil Chandru Rao, Saisuresh Krishnakumaran
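A minimal sketch of the round-trip check described for patent 10650810: synthesize the input phrase, transcribe the audio, and compare the transcription against a set of test phrases. The TTS and ASR engines are stubbed out, and the normalization and exact-match rule are assumptions.

```python
# Illustrative text -> speech -> text comparison against test phrases.
def normalize(text):
    return " ".join(text.lower().split())

def find_phonetic_match(transcription, test_phrases):
    """Return the first test phrase whose normalized form matches the transcription."""
    target = normalize(transcription)
    for phrase in test_phrases:
        if normalize(phrase) == target:
            return phrase
    return None

transcription = "their going to the store"        # pretend ASR output of the synthesized audio
print(find_phonetic_match(transcription, ["They're going to the store",
                                          "their going to the store"]))
```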
  • Patent number: 10504502
    Abstract: A sound control device includes: a detection unit that detects a first operation on an operator and a second operation on the operator, the second operation being performed after the first operation; and a control unit that causes output of a second sound to be started, in response to the second operation being detected. The control unit causes output of a first sound to be started before causing the output of the second sound to be started, in response to the first operation being detected.
    Type: Grant
    Filed: September 20, 2017
    Date of Patent: December 10, 2019
    Assignee: YAMAHA CORPORATION
    Inventors: Keizo Hamano, Yoshitomo Ota, Kazuki Kashiwase
  • Patent number: 10469623
    Abstract: A system and method for multi-language phrase identification within spoken interaction audio capable of adjusting for regional pronunciation (accents), cadence differences, and homologs. In this system, a spoken interaction audio data store supplies spoken audio data such as contact center call recordings to be analyzed for a specific phrase or set of phrases. Phrases are entered as natural language text and converted to the phonemes representative of the phrase audio using the invention's language packs and stored in a data store. Spoken interaction and phrase audio are converted to a digital format allowing comparison using multiple characteristics. Phrase matches are stored for subsequent post analysis display and analytics generation.
    Type: Grant
    Filed: November 27, 2018
    Date of Patent: November 5, 2019
    Assignee: ZOOM International a.s.
    Inventor: Vaclav Slovacek
  • Patent number: 10395638
    Abstract: An apparatus and a computer program product for merging incoming alerts for accessibility are described. Two input alerts intended for presentation by a screen reader are received. If the two input alerts have arrived with a specified time interval, the two input alerts are combined into an output alert. The output alert is sent to a screen reader for presentation.
    Type: Grant
    Filed: July 8, 2017
    Date of Patent: August 27, 2019
    Assignee: International Business Machines Corporation
    Inventors: Stephen A Boxwell, Kyle M Brake, Keith G Frost, Stanley J Vernier
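A sketch of the alert-merging rule described for patent 10395638: alerts that arrive within a specified time interval are combined into a single output alert for the screen reader. The 0.5-second window and the join format are assumptions.

```python
# Illustrative merging of screen-reader alerts that arrive close together in time.
MERGE_WINDOW_S = 0.5

def merge_alerts(alerts):
    """alerts: list of (timestamp_seconds, text) pairs, assumed sorted by time."""
    merged = []
    for ts, text in alerts:
        if merged and ts - merged[-1][0] <= MERGE_WINDOW_S:
            merged[-1] = (ts, merged[-1][1] + "; " + text)   # combine into one output alert
        else:
            merged.append((ts, text))
    return [text for _, text in merged]

print(merge_alerts([(0.0, "New message"), (0.3, "Battery low"), (2.0, "Download complete")]))
# ['New message; Battery low', 'Download complete']
```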
  • Patent number: 10373595
    Abstract: A musical sound generation device includes a first memory having a plurality of waveform data, a second memory which stores waveform data read out from the first memory, and a control processor. The control processor controls such that, when a sound emission instruction is provided and the specified waveform data is in the second memory, the waveform data is read out by the sound source processor, or such that, when a sound emission instruction is provided and the specified waveform data is not in the second memory, the specified waveform data is transferred from the first memory to the second memory and read out by the sound source processor. The control processor further controls such that waveform data satisfying a set condition is not subjected to a waveform data change by the transfer, while waveform data not satisfying the set condition is subjected to the waveform data change by the transfer.
    Type: Grant
    Filed: March 2, 2018
    Date of Patent: August 6, 2019
    Assignee: CASIO COMPUTER CO., LTD.
    Inventors: Hiroki Sato, Hajime Kawashima
  • Patent number: 10205587
    Abstract: Provided is a wireless communication system which enables wireless communication in which crosstalk due to multiple access is canceled while using a large number of inexpensive wireless terminals. In order to generate an interference component, an analysis data sequence is generated by applying a Hilbert transform to a subcarrier data sequence obtained by extracting a target subcarrier wave component from a finite-length data sequence, while a carrier phase difference is estimated using regression analysis by a carrier wave phase estimation unit. After a rotation calculation that rotates the analysis data sequence back by the carrier phase difference is performed, conversion into an angle is performed. Further, a multiplication by a desired odd number is performed, and then an inverse Hilbert transform is applied.
    Type: Grant
    Filed: April 24, 2017
    Date of Patent: February 12, 2019
    Assignee: KYOWA ELECTRONIC INSTRUMENTS CO., LTD.
    Inventors: Jin Mitsugi, Yuki Igarashi, Haruhisa Ichikawa, Yuusuke Kawakita, Kiyoshi Egawa
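A rough sketch of the processing chain described for patent 10205587, using SciPy's Hilbert transform: form the analytic signal for one subcarrier, estimate the carrier phase by linear regression on the unwrapped phase, de-rotate, and convert to an angle. The odd-number multiplication is only indicated, and the signal parameters are illustrative.

```python
# Illustrative analytic-signal processing with regression-based carrier phase estimation.
import numpy as np
from scipy.signal import hilbert

fs, f_carrier = 1024.0, 200.0
t = np.arange(1024) / fs
subcarrier = np.cos(2 * np.pi * f_carrier * t + 0.7)    # extracted subcarrier component

analytic = hilbert(subcarrier)                          # analysis data sequence
phase = np.unwrap(np.angle(analytic))
slope, intercept = np.polyfit(t, phase, 1)              # regression-based phase estimate
derotated = analytic * np.exp(-1j * (slope * t + intercept))
angle = np.angle(derotated)                             # conversion into an angle
tripled = 3 * angle                                     # odd-number multiplication step
print(round(slope / (2 * np.pi), 1))                    # ~200.0 Hz carrier recovered
```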
  • Patent number: 10134385
    Abstract: Systems and methods are provided for associating a phonetic pronunciation with a name by receiving the name, mapping the name to a plurality of monosyllabic components that are combinable to construct the phonetic pronunciation of the name, receiving a user input to select one or more of the plurality of monosyllabic components, and combining the selected one or more of the plurality of monosyllabic components to construct the phonetic pronunciation of the name.
    Type: Grant
    Filed: March 2, 2012
    Date of Patent: November 20, 2018
    Assignee: Apple Inc.
    Inventor: Devang K. Naik
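A sketch of the pronunciation-building flow described for patent 10134385: a name maps to per-syllable candidate components, the user's selections pick one option per syllable, and the picks are combined into the phonetic pronunciation. The candidate table and join format are invented for illustration.

```python
# Illustrative construction of a phonetic pronunciation from user-selected syllable options.
CANDIDATES = {
    "Siobhan": [["shi", "see"], ["vawn", "bon"]],   # per-syllable candidate components
}

def build_pronunciation(name, choices):
    """choices: index of the selected option for each syllable of the name."""
    syllables = CANDIDATES[name]
    return "-".join(options[i] for options, i in zip(syllables, choices))

print(build_pronunciation("Siobhan", [0, 0]))   # "shi-vawn"
```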
  • Patent number: 10019982
    Abstract: A speech simulation system adapted for a user to communicate with others. The system has at least one sensor to sense controlled and coordinated body movement. The system has a computer processor connected to the at least one sensor. The system has a database memory connected to the computer processor. The system has software programming to operate the computer processor. The system has a feedback device connected to the computer processor and directed to the user. The system has an outward audio output device connected to the computer processor to provide sound and a speaker connected to the outward audio output device.
    Type: Grant
    Filed: October 18, 2016
    Date of Patent: July 10, 2018
    Inventor: Mary Elizabeth McCulloch
  • Patent number: 10019069
    Abstract: A vehicular display input apparatus includes a gesture detection unit, a determiner, and a controller. The gesture detection unit detects a gesture made by a hand of the driver. The determiner determines whether a visual line of the driver is directed within a visual line detection area, which is preliminarily defined to include at least partial display region. The controller switches to one of operations listed in an operation menu, which is to be correlated with the gesture, according to a determination result of the determiner. The determination result indicates whether the visual line is directed within the visual line detection area.
    Type: Grant
    Filed: March 12, 2015
    Date of Patent: July 10, 2018
    Assignee: DENSO CORPORATION
    Inventor: Youichi Naruse
  • Patent number: 10008216
    Abstract: Method and apparatus for reducing a size of databases required for recorded speech data.
    Type: Grant
    Filed: April 15, 2014
    Date of Patent: June 26, 2018
    Assignee: SPEECH MORPHING SYSTEMS, INC.
    Inventors: Fathy Yassa, Benjamin Reaves, Steve Pearson
  • Patent number: 9924282
    Abstract: A system for improving synchronization of an acoustic signal to a video display includes a hearing aid comprising a hearing loss processor configured for signal processing in accordance with a hearing loss of a user of the hearing aid, the hearing aid being configured for receiving a first audio signal for synchronous presentation to the user viewing the video display, the hearing aid being configured for generating a first acoustic signal to be presented to the user of the hearing aid, the first acoustic signal comprising at least a first part being generated in response to the first audio signal. The system also includes a delay unit configured for applying a delay, such that synchronization of the at least first part of the first acoustic signal to the video display is improved.
    Type: Grant
    Filed: January 20, 2012
    Date of Patent: March 20, 2018
    Assignee: GN RESOUND A/S
    Inventors: Søren C. Ell, Jesper L. Nielsen
  • Patent number: 9900115
    Abstract: Systems and methods of voice annunciation of signal strength, quality of service, and sensor status for wireless devices are provided. Some methods can include determining a signal strength or range of a radio, determining quality of service events and statistics for a wireless device, or determining a status of a sensor and then verbally annunciating information or instructions relating to the determined signal strength or range, the determined quality of service events and statistics, or the determined sensor status.
    Type: Grant
    Filed: February 20, 2015
    Date of Patent: February 20, 2018
    Assignee: HONEYWELL INTERNATIONAL INC.
    Inventors: Timothy A. Rauworth, Douglas L. Hoeferle, Robert J. Selepa, Pardeep Verma
  • Patent number: 9640172
    Abstract: A sound synthesizing apparatus includes a waveform storing section which stores a plurality of unit waveforms extracted from different positions, on a time axis, of a sound waveform indicating a voiced sound, and a waveform generating section which generates, for each of a first processing period and a second processing period, a synthesized waveform by arranging the plurality of unit waveforms on the time axis, wherein the second processing period is an immediately succeeding processing period after the first processing period.
    Type: Grant
    Filed: August 30, 2012
    Date of Patent: May 2, 2017
    Assignee: Yamaha Corporation
    Inventor: Hiraku Kayama
  • Patent number: 9552806
    Abstract: A sound synthesizing apparatus includes a processor coupled to a memory. The processor is configured to execute computer-executable units comprising: an information acquirer adapted to acquire synthesis information which specifies a duration and an utterance content for each unit sound; a prolongation setter adapted to set whether prolongation is permitted or inhibited for each of a plurality of phonemes corresponding to the utterance content of the each unit sound; and a sound synthesizer adapted to generate a synthesized sound corresponding to the synthesis information by connecting a plurality of sound fragments corresponding to the utterance content of the each unit sound. The sound synthesizer prolongs a sound fragment corresponding to the phoneme the prolongation of which is permitted in accordance with the duration of the unit sound.
    Type: Grant
    Filed: February 26, 2013
    Date of Patent: January 24, 2017
    Assignee: Yamaha Corporation
    Inventors: Hiraku Kayama, Motoki Ogasawara
  • Patent number: 9484013
    Abstract: A speech simulation system adapted for a user to communicate with others. The system has at least one sensor to sense controlled and coordinated body movement. The system has a computer processor connected to the at least one sensor. The system has a database memory connected to the computer processor. The system has software programming to operate the computer processor. The system has a feedback device connected to the computer processor and directed to the user. The system has an outward audio output device connected to the computer processor to provide sound and a speaker connected to the outward audio output device.
    Type: Grant
    Filed: February 19, 2013
    Date of Patent: November 1, 2016
    Inventor: Mary Elizabeth McCulloch
  • Patent number: 9368126
    Abstract: A method, system and computer readable storage medium for assessing speech prosody. The method includes the steps of: receiving input speech data; acquiring a prosody constraint; assessing prosody of the input speech data according to the prosody constraint; and providing an assessment result, where at least one of the steps is carried out using a computer device.
    Type: Grant
    Filed: April 29, 2011
    Date of Patent: June 14, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Yong Qin, Qin Shi, Zhiwei Shuang, Shi Lei Zhang
  • Patent number: 9218798
    Abstract: Provided is a voice assist device in an electronic musical instrument in which tone selection or a sound setting corresponding to a key is performed in advance by pressing the operation button 1 while pressing one of the keys of the keyboard 2. The device includes a changed state recognizing unit 3 that recognizes, from a pressed key, a changed state of the tone selection or sound setting determined for that key in advance, a setting item name storing unit 4 that stores the setting item name of the tone selection or sound setting as voice data, and a sound emitting unit 5 that emits the setting item name corresponding to the changed state. The changed state recognizing unit 3 includes a voice assist recognizing unit 6 that detects a depression of the operation button 1 for a preset time or more prior to a depression of the key.
    Type: Grant
    Filed: August 5, 2015
    Date of Patent: December 22, 2015
    Assignee: KAWAI MUSICAL INSTRUMENTS MANUFACTURING CO., LTD.
    Inventors: Takuya Satoh, Kohtaro Ilimura, Sachie Ilimura
  • Patent number: 9135911
    Abstract: A system, method and program for acquiring from an input text a character string set and generating the pronunciation thereof which should be recognized as a word is disclosed.
    Type: Grant
    Filed: September 26, 2014
    Date of Patent: September 15, 2015
    Assignees: NEXGEN FLIGHT LLC, DOINITA DIANE SERBAN
    Inventors: Doinita Serban, Bhupat Raigaga
  • Patent number: 8983842
    Abstract: There is provided a speech processing apparatus including: a data obtaining unit which obtains music progression data defining a property of one or more time points or one or more time periods along progression of music; a determining unit which determines an output time point at which a speech is to be output during reproducing the music by utilizing the music progression data obtained by the data obtaining unit; and an audio output unit which outputs the speech at the output time point determined by the determining unit during reproducing the music.
    Type: Grant
    Filed: August 12, 2010
    Date of Patent: March 17, 2015
    Assignee: Sony Corporation
    Inventors: Tetsuo Ikeda, Ken Miyashita, Tatsushi Nashida
  • Patent number: 8977550
    Abstract: Part units of speech information are arranged in a predetermined order to generate a sentence unit of a speech information set. To each of a plurality of speech part units of the speech information, an attribute of “interrupt possible after reproduction” with which reproduction of priority interrupt information can be started after the speech part unit of the speech information is reproduced or another attribute of “interrupt impossible after reproduction” with which reproduction of the priority interrupt information cannot be started even after the speech part unit of the speech information is reproduced is set. When the priority interrupt information having a high priority rank than the speech information set being currently reproduced is inputted, if the attribute of the speech information being reproduced at the point in time is “interrupt impossible after reproduction,” then the priority interrupt information is reproduced after the speech information is reproduced.
    Type: Grant
    Filed: May 6, 2011
    Date of Patent: March 10, 2015
    Assignee: Honda Motor Co., Ltd.
    Inventor: Tokujiro Kizaki
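A toy sketch of the interrupt rule described for patent 8977550: each reproduced speech part carries an "interrupt possible/impossible after reproduction" attribute, and a pending priority message is inserted only after a part that permits it. The data layout and playback loop are assumptions.

```python
# Illustrative playback that honors per-part interrupt attributes.
parts = [("Turn left in", False),        # interrupt impossible after this part
         ("two hundred meters", True)]   # interrupt possible after this part

def play(parts, priority_message=None):
    played = []
    for text, interrupt_ok in parts:
        played.append(text)
        if priority_message and interrupt_ok:
            played.append(priority_message)   # priority interrupt starts here
            priority_message = None
    if priority_message:                      # no part permitted the interrupt
        played.append(priority_message)
    return played

print(play(parts, priority_message="Low fuel warning"))
# ['Turn left in', 'two hundred meters', 'Low fuel warning']
```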
  • Patent number: 8977551
    Abstract: The present invention provides a parametric speech synthesis method and a parametric speech synthesis system.
    Type: Grant
    Filed: October 27, 2011
    Date of Patent: March 10, 2015
    Assignee: Goertek Inc.
    Inventors: Fengliang Wu, Zhenhua Wu
  • Patent number: 8977552
    Abstract: A system, method and computer readable medium that enhances a speech database for speech synthesis is disclosed. The method may include labeling audio files in a primary speech database, identifying segments in the labeled audio files that have varying pronunciations based on language differences, identifying replacement segments in a secondary speech database, enhancing the primary speech database by substituting the identified secondary speech database segments for the corresponding identified segments in the primary speech database, and storing the enhanced primary speech database for use in speech synthesis.
    Type: Grant
    Filed: May 28, 2014
    Date of Patent: March 10, 2015
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Alistair D. Conkie, Ann K. Syrdal
  • Publication number: 20150025892
    Abstract: A system and method for speech-to-singing synthesis is provided. The method includes deriving characteristics of a singing voice for a first individual and modifying vocal characteristics of a voice for a second individual in response to the characteristics of the singing voice of the first individual to generate a synthesized singing voice for the second individual.
    Type: Application
    Filed: March 6, 2013
    Publication date: January 22, 2015
    Applicant: Agency for Science, Technology and Research
    Inventors: Siu Wa Lee, Ling Cen, Haizhou Li, Yaozhu Paul Chan, Minghui Dong
  • Patent number: 8909538
    Abstract: Improved methods of presenting speech prompts to a user as part of an automated system that employs speech recognition or other voice input are described. The invention improves the user interface by providing in combination with at least one user prompt seeking a voice response, an enhanced user keyword prompt intended to facilitate the user selecting a keyword to speak in response to the user prompt. The enhanced keyword prompts may be the same words as those a user can speak as a reply to the user prompt but presented using a different audio presentation method, e.g., speech rate, audio level, or speaker voice, than used for the user prompt. In some cases, the user keyword prompts are different words from the expected user response keywords, or portions of words, e.g., truncated versions of keywords.
    Type: Grant
    Filed: November 11, 2013
    Date of Patent: December 9, 2014
    Assignee: Verizon Patent and Licensing Inc.
    Inventor: James Mark Kondziela
  • Patent number: 8898057
    Abstract: Disclosed is an encoding apparatus that can efficiently encode a signal that is a broad or extra-broad band signal or the like, thereby improving the quality of a decoded signal. This encoding apparatus includes a band establishing unit (301) that generates, based on the characteristic of the input signal, band establishment information to be used for dividing the band of the input signal to establish a first band part on the lower frequency side and a second band part on the higher frequency side; a lower frequency encoding unit (302) for encoding, based on the band establishment information, the input signal of the first band part to generate encoded lower frequency part information; and a higher frequency encoding unit (303) for encoding, based on the band establishment information, the input signal of the second band part to generate encoded higher frequency part information.
    Type: Grant
    Filed: October 22, 2010
    Date of Patent: November 25, 2014
    Assignee: Panasonic Intellectual Property Corporation of America
    Inventor: Tomofumi Yamanashi
  • Patent number: 8888494
    Abstract: One or more embodiments present a script to a user in an interactive script environment. A digital representation of a manuscript is analyzed. This digital representation includes a set of roles and a set of information associated with each role in the set of roles. An active role in the set of roles that is associated with a given user is identified based on the analyzing. At least a portion of the manuscript is presented to the given user via a user interface. The portion includes at least a subset of information in the set of information. Information within the set of information that is associated with the active role is presented in a visually different manner than information within the set of information that is associated with a non-active role, which is a role that is associated with a user other than the given user.
    Type: Grant
    Filed: June 27, 2011
    Date of Patent: November 18, 2014
    Inventor: Randall Lee Threewits
  • Patent number: 8868431
    Abstract: A recognition dictionary creation device identifies the language of a reading of an inputted text which is a target to be registered, and adds a reading with phonemes in the identified language to the target text to be registered. It also converts the reading of the target text from the phonemes in the identified language to phonemes in the language to be recognized, which is handled in voice recognition, to create a recognition dictionary in which the converted reading of the target text is registered.
    Type: Grant
    Filed: February 5, 2010
    Date of Patent: October 21, 2014
    Assignee: Mitsubishi Electric Corporation
    Inventors: Michihiro Yamazaki, Jun Ishii, Yasushi Ishikawa
  • Patent number: 8862477
    Abstract: A method and a processing device for managing an interactive speech recognition system is provided. Whether a voice input relates, at least partially, to expected input of any one of a group of menus different from a current menu is determined. If the voice input relates, at least partially, to the expected input of any one of the group of menus different from the current menu, skipping to that menu is performed. The group of menus different from the current menu includes menus at multiple hierarchical levels.
    Type: Grant
    Filed: June 3, 2013
    Date of Patent: October 14, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Harry E. Blanchard
  • Patent number: 8856008
    Abstract: Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.
    Type: Grant
    Filed: September 18, 2013
    Date of Patent: October 7, 2014
    Assignee: Morphism LLC
    Inventor: James H. Stephens, Jr.
  • Patent number: 8838441
    Abstract: A representation of an audio signal having a first, a second and a third frame is derived by estimating first warp information for the first and second frames and second warp information for the second and third frames, the warp information describing pitch information of the audio signal. First or second spectral coefficients for first and second frames or second and third frames are derived using first or second warp information and a first or second weighted representation of the first and second frames or second and third frames, the first or second weighted representation derived by applying a first or second window function to the first and second frames or second and third frames, wherein the first or second window function depends on the first or second warp information. The representation of the audio signal is generated including the first and the second spectral coefficients.
    Type: Grant
    Filed: February 14, 2013
    Date of Patent: September 16, 2014
    Assignee: Dolby International AB
    Inventor: Lars Villemoes
  • Publication number: 20140236586
    Abstract: A method and apparatus that modify static media, such as music files being played to a user of the device, upon the generation or receipt of an alert, notification, or message, so that information in the alert, notification, or message can be incorporated into the media files and then communicated to the user. In a further embodiment, a user's response to the communicated information can be sensed using one or more sensors and transducers so as to provide feedback to the device, and then optionally to a node in a system.
    Type: Application
    Filed: February 18, 2013
    Publication date: August 21, 2014
    Applicant: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL)
  • Patent number: 8775185
    Abstract: A method for converting text into speech with a speech sample library is provided. The method comprises converting an input text to a sequence of triphones; determining musical parameters of each phoneme in the sequence of triphones; detecting, in the speech sample library, speech segments having at least the determined musical parameters; and concatenating the detected speech segments.
    Type: Grant
    Filed: November 27, 2012
    Date of Patent: July 8, 2014
    Inventors: Gershon Silbert, Andres Hakim
  • Publication number: 20140180696
    Abstract: High-quality speech is reproduced with a small data amount in speech coding and decoding that perform compression coding and decoding of a speech signal as a digital signal.
    Type: Application
    Filed: February 25, 2014
    Publication date: June 26, 2014
    Applicant: BlackBerry Limited
    Inventor: Tadashi YAMAURA
  • Patent number: 8751237
    Abstract: A sound control section (114) selects and outputs a text-to-speech item from items included in program information multiplexed with a broadcast signal, and starts or stops outputting the text-to-speech item based on a request from a remote controller control section (113). A sound generation section (115) converts the text-to-speech item to a sound signal. A speaker (109) reproduces the sound signal. The sound control section (114) compares each item of information about the program currently selected by the user's operation of the remote controller with each item of information about the program selected just before that operation. If an item of the currently selected program information is the same as the corresponding item of the previously selected program information, and text-to-speech processing has already been completed for the item after the last change in the item, the sound control section (114) stops outputting the item to the sound generation section (115).
    Type: Grant
    Filed: February 23, 2011
    Date of Patent: June 10, 2014
    Assignee: Panasonic Corporation
    Inventor: Koumei Kubota
  • Patent number: 8744841
    Abstract: An adaptive time/frequency-based encoding mode determination apparatus including a time domain feature extraction unit to generate a time domain feature by analysis of a time domain signal of an input audio signal, a frequency domain feature extraction unit to generate a frequency domain feature corresponding to each frequency band generated by division of a frequency domain corresponding to a frame of the input audio signal into a plurality of frequency domains, by analysis of a frequency domain signal of the input audio signal, and a mode determination unit to determine any one of a time-based encoding mode and a frequency-based encoding mode, with respect to the each frequency band, by use of the time domain feature and the frequency domain feature.
    Type: Grant
    Filed: September 21, 2006
    Date of Patent: June 3, 2014
    Assignee: SAMSUNG Electronics Co., Ltd.
    Inventors: Eun Mi Oh, Ki Hyun Choo, Jung-Hoe Kim, Chang Yong Son
  • Patent number: 8744851
    Abstract: A system, method and computer readable medium that enhances a speech database for speech synthesis is disclosed. The method may include labeling audio files in a primary speech database, identifying segments in the labeled audio files that have varying pronunciations based on language differences, identifying replacement segments in a secondary speech database, enhancing the primary speech database by substituting the identified secondary speech database segments for the corresponding identified segments in the primary speech database, and storing the enhanced primary speech database for use in speech synthesis.
    Type: Grant
    Filed: August 13, 2013
    Date of Patent: June 3, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Alistair Conkie, Ann K Syrdal
  • Patent number: 8731933
    Abstract: A speech synthesizing apparatus includes a selector configured to select a plurality of speech units for synthesizing a speech of a phoneme sequence by referring to speech unit information stored in an information memory. Speech unit waveforms corresponding to the speech units are acquired from a plurality of speech unit waveforms stored in a waveform memory, and the speech is synthesized by utilizing the speech unit waveforms acquired. When acquiring the speech unit waveforms, at least two speech unit waveforms from a continuous region of the waveform memory are copied onto a buffer by one access, wherein a data quantity of the at least two speech unit waveforms is less than or equal to a size of the buffer.
    Type: Grant
    Filed: April 10, 2013
    Date of Patent: May 20, 2014
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Takehiko Kagoshima
  • Patent number: 8719030
    Abstract: The present invention is a method and system to convert a speech signal into a parametric representation in terms of timbre vectors, and to recover the speech signal therefrom. The speech signal is first segmented into non-overlapping frames using glottal closure instant information; each frame is converted into an amplitude spectrum using a Fourier analyzer, and Laguerre functions are then used to generate a set of coefficients which constitute a timbre vector. A sequence of timbre vectors can be subjected to a variety of manipulations. The new timbre vectors are converted back into voice signals by first transforming them into amplitude spectra using Laguerre functions, then generating phase spectra from the amplitude spectra using Kramers-Kronig relations. A Fourier transformer converts the amplitude spectra and phase spectra into elementary acoustic waves, which are then superposed to become the output voice. The method and system can be used for voice transformation, speech synthesis, and automatic speech recognition.
    Type: Grant
    Filed: December 3, 2012
    Date of Patent: May 6, 2014
    Inventor: Chengjun Julian Chen
  • Patent number: 8706493
    Abstract: In one embodiment of a controllable prosody re-estimation system, a TTS/STS engine consists of a prosody prediction/estimation module, a prosody re-estimation module and a speech synthesis module. The prosody prediction/estimation module generates predicted or estimated prosody information. And then the prosody re-estimation module re-estimates the predicted or estimated prosody information and produces new prosody information, according to a set of controllable parameters provided by a controllable prosody parameter interface. The new prosody information is provided to the speech synthesis module to produce a synthesized speech.
    Type: Grant
    Filed: July 11, 2011
    Date of Patent: April 22, 2014
    Assignee: Industrial Technology Research Institute
    Inventors: Cheng-Yuan Lin, Chien-Hung Huang, Chih-Chung Kuo
  • Patent number: 8706497
    Abstract: A synthesis filter 106 synthesizes a plurality of wide-band speech signals by combining wide-band phoneme signals and sound source signals from a speech signal code book 105, and a distortion evaluation unit 107 selects one of the wide-band speech signals with a minimum waveform distortion with respect to an up-sampled narrow-band speech signal output from a sampling conversion unit 101. A first bandpass filter 103 extracts a frequency component outside a narrow-band of the wide-band speech signal and a band synthesis unit 104 combines it with the up-sampled narrow-band speech signal.
    Type: Grant
    Filed: October 22, 2010
    Date of Patent: April 22, 2014
    Assignee: Mitsubishi Electric Corporation
    Inventors: Satoru Furuta, Hirohisa Tasaki
  • Patent number: 8700388
    Abstract: A processed representation of an audio signal having a sequence of frames is generated by sampling the audio signal within first and second frames of the sequence of frames, the second frame following the first frame, the sampling using information on a pitch contour of the first and second frames to derive a first sampled representation. The audio signal is sampled within the second and third frames, the third frame following the second frame in the sequence of frames. The sampling uses the information on the pitch contour of the second frame and information on a pitch contour of the third frame to derive a second sampled representation. A first scaling window is derived for the first sampled representation, and a second scaling window is derived for the second sampled representation, the scaling windows depending on the samplings applied to derive the first sampled representation or the second sampled representation.
    Type: Grant
    Filed: March 23, 2009
    Date of Patent: April 15, 2014
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Bernd Edler, Sascha Disch, Ralf Geiger, Stefan Bayer, Ulrich Kraemer, Guillaume Fuchs, Max Neuendorf, Markus Multrus, Gerald Schuller, Harald Popp
  • Patent number: 8655156
    Abstract: In one method embodiment, providing a multiplex of compressed versions of a first video stream and a first audio stream, each corresponding to an audiovisual (A/V) program, the first video stream and the first audio stream each corresponding to a first playout rate and un-synchronized with each other for an initial playout portion; and providing a compressed version of a second audio stream, the second audio stream corresponding to a pitch-preserving, second playout rate different than the first playout rate, the second audio stream synchronized to the initial playout portion of the first video stream when the first video stream is played out at the second playout rate, the first audio stream replaceable by the second audio stream for the initial playout portion.
    Type: Grant
    Filed: March 2, 2010
    Date of Patent: February 18, 2014
    Assignee: Cisco Technology, Inc.
    Inventors: Ali C. Begen, Tankut Akgul, Michael A. Ramalho, David R. Oran, William C. Ver Steeg
  • Patent number: 8630857
    Abstract: Disclosed is a speech synthesizing apparatus including a segment selection unit that selects a segment suited to a target segment environment from candidate segments. The segment selection unit includes a prosody change amount calculation unit that calculates the prosody change amount of each candidate segment based on prosody information of the candidate segments and the target segment environment, a selection criterion calculation unit that calculates a selection criterion based on the prosody change amount, a candidate selection unit that narrows down selection candidates based on the prosody change amount and the selection criterion, and an optimum segment search unit that searches for an optimum segment from among the narrowed-down candidate segments.
    Type: Grant
    Filed: February 15, 2008
    Date of Patent: January 14, 2014
    Assignee: NEC Corporation
    Inventors: Masanori Kato, Reishi Kondo, Yasuyuki Mitsui