Synthesis Patents (Class 704/258)
  • Patent number: 9792917
    Abstract: The present invention relates to an audio processing device and method, and an electro-acoustic converting device and method. The audio processing device comprises: a first receiving module, configured to receive a first audio signal; a second receiving module, configured to receive a second audio signal; an audio synthesizing module, configured to synthesize the first audio signal and the second audio signal to obtain a third audio signal; and an audio outputting module, configured to output the third audio signal. According to the present invention, when enjoying songs or music or learning languages by using an audio processing device or an audio playing device, a user is capable of hearing his or her own voice while singing or reading, which greatly improves the effects of self-entertainment and language learning.
    Type: Grant
    Filed: May 8, 2014
    Date of Patent: October 17, 2017
    Assignee: KT MICRO, INC.
    Inventors: Dianyu Chen, Yihai Xiang, Haiqing Lin, Pan Mu, Yanqing Wu, Hekai Kang, Dongfeng Zhou, Yuqiang Yuan, Wenhui Yuan, Jinfeng Wu
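The core mixing step this abstract describes (summing a received playback signal with the user's microphone signal to produce a third output signal) can be sketched in a few lines. The gain values and hard clipping below are illustrative assumptions, not details from the patent:

```python
# Minimal sketch of the synthesis step: combine a playback (music) signal with
# a microphone (voice) signal into one output signal. Gains and the clipping
# strategy are assumptions for illustration.

def mix_signals(first, second, gain1=0.7, gain2=0.7):
    """Sum two equal-length sample sequences (floats in [-1, 1]),
    clamping the result to the valid range."""
    if len(first) != len(second):
        raise ValueError("signals must be the same length")
    return [max(-1.0, min(1.0, gain1 * a + gain2 * b))
            for a, b in zip(first, second)]

music = [0.5, -0.5, 0.9]
voice = [0.5, 0.5, 0.9]
mixed = mix_signals(music, voice)  # third sample clips at 1.0
```

A real implementation would operate on audio buffers at the sample rate of the converter, but the signal flow is the same.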
  • Patent number: 9785441
    Abstract: A computer processor that operates on distinct first and second instruction streams that have a predefined timed semantic relationship. At least one of the first and second instruction streams includes variable-length instructions having a header and associated bundle bounded by a head end and a tail end. An alignment hole within the bundle encodes information representing at least one nop operation. The computer processor includes first and second multi-stage instruction processing components configured to process in parallel the first and second instruction streams. At least one of the first and second multi-stage instruction processing components includes an instruction buffer operably coupled to a decode stage. The decode stage is configured to process a variable-length instruction by isolating and interpreting the alignment hole of the variable length instruction in order to initiate zero or more nop operations that follow the timed semantic relationship between the first and second instruction streams.
    Type: Grant
    Filed: May 29, 2014
    Date of Patent: October 10, 2017
    Assignee: Mill Computing, Inc.
    Inventors: Roger Rawson Godard, Arthur David Kahlich, David Arthur Yost
  • Patent number: 9747892
    Abstract: Speech is modeled as a cognitively-driven sensory-motor activity where the form of speech is the result of categorization processes that any given subject recreates by focusing on creating sound patterns that are represented by syllables. These syllables are then combined in characteristic patterns to form words, which are in turn, combined in characteristic patterns to form utterances. A speech recognition process first identifies syllables in an electronic waveform representing ongoing speech. The pattern of syllables is then deconstructed into a standard form that is used to identify words. The words are then concatenated to identify an utterance. Similarly, a speech synthesis process converts written words into patterns of syllables. The pattern of syllables is then processed to produce the characteristic rhythmic sound of naturally spoken words. The words are then assembled into an utterance which is also processed to produce a natural sounding speech.
    Type: Grant
    Filed: October 3, 2016
    Date of Patent: August 29, 2017
    Inventor: Boris Fridman-Mintz
  • Patent number: 9741012
    Abstract: Embodiments include a computer-implemented management platform for securely generating tracking codes, and for verifiably imprinting those tracking codes onto physical articles. In an embodiment, one or more hardware processors generate tracking codes and send them to an automated, computer-controlled production line, which physically imprints each tracking code onto a corresponding article and physically verifies the imprinting. If a tracking code was correctly imprinted on its corresponding article, one or more records are recorded in a durable storage medium indicating that the tracking code was imprinted on the article. If a tracking code was incorrectly imprinted on its corresponding article, the factory line physically rejects the corresponding article. Embodiments also include the computer-implemented management platform securely managing those physical articles throughout their lifecycle, based on the securely generated and verifiably imprinted tracking codes.
    Type: Grant
    Filed: November 4, 2015
    Date of Patent: August 22, 2017
    Assignee: HURU SYSTEMS LTD.
    Inventors: Ian A. Nazzari, Paul Eipper
  • Patent number: 9734817
    Abstract: To prioritize the processing of text-to-speech (TTS) tasks, a TTS system may determine, for each task, the amount of time before the task reaches underrun, that is, the time before the synthesized speech output to a user catches up with the speech generated since the TTS task originated. The TTS system may also prioritize tasks to reduce the amount of time between when a user submits a TTS request and when results are delivered to the user. When prioritizing tasks, such as when allocating resources to existing tasks or accepting new tasks, the TTS system may prioritize tasks with the lowest amount of time prior to underrun and/or tasks with the longest time prior to delivery of first results.
    Type: Grant
    Filed: March 21, 2014
    Date of Patent: August 15, 2017
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventor: Bartosz Putrycz
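The scheduling rule in this abstract reduces to a simple computation: for each task, the audio synthesized so far minus the audio already played back is the margin before underrun, and the task with the smallest margin gets resources first. The field names and scoring below are assumptions for illustration:

```python
# Rough illustration of underrun-based TTS task prioritization. A task records
# when its playback started and how many seconds of audio have been
# synthesized; the task closest to running dry is served first.

def time_to_underrun(task, now):
    """Seconds of synthesized audio remaining ahead of playback."""
    played = now - task["playback_start"]
    return task["audio_ready"] - played

def next_task(tasks, now):
    """Allocate resources to the task closest to underrun."""
    return min(tasks, key=lambda t: time_to_underrun(t, now))

tasks = [
    {"id": "a", "playback_start": 0.0, "audio_ready": 5.0},
    {"id": "b", "playback_start": 1.0, "audio_ready": 3.5},
]
urgent = next_task(tasks, now=3.0)  # "b" has 1.5 s of margin vs 2.0 s for "a"
```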
  • Patent number: 9721563
    Abstract: A speech recognition system uses, in one embodiment, an extended phonetic dictionary that is obtained by processing words in a user's set of databases, such as a user's contacts database, with a set of pronunciation guessers. The speech recognition system can use a conventional phonetic dictionary and the extended phonetic dictionary to recognize speech inputs that are user requests to use the contacts database, for example, to make a phone call, etc. The extended phonetic dictionary can be updated in response to changes in the contacts database, and the set of pronunciation guessers can include pronunciation guessers for a plurality of locales, each locale having its own pronunciation guesser.
    Type: Grant
    Filed: June 8, 2012
    Date of Patent: August 1, 2017
    Assignee: Apple Inc.
    Inventor: Devang K. Naik
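The extended-dictionary idea can be sketched as running contact names through per-locale pronunciation guessers and merging the results with a base phonetic dictionary. The guesser rules below are invented stand-ins (real guessers map letters to locale-specific phonemes), and the function and locale names are assumptions, not the patent's API:

```python
# Toy extended phonetic dictionary: each locale contributes a guessed
# pronunciation for every contact name. The letter-by-letter "phonemes" here
# are placeholders for real grapheme-to-phoneme rules.

def guess_en(word):
    return " ".join(word.lower())

def guess_fr(word):
    return " ".join(word.lower()) + " (fr)"

GUESSERS = {"en_US": guess_en, "fr_FR": guess_fr}

def extend_dictionary(base, contacts, locales):
    """Merge guessed pronunciations for contacts into a copy of the base."""
    extended = dict(base)
    for name in contacts:
        for loc in locales:
            extended.setdefault(name, []).append(GUESSERS[loc](name))
    return extended

ext = extend_dictionary({"call": ["k ao l"]}, ["Anna"], ["en_US", "fr_FR"])
```

Re-running `extend_dictionary` after the contacts database changes mirrors the update behavior the abstract describes.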
  • Patent number: 9697818
    Abstract: A method and apparatus that dynamically adjust operational parameters of a text-to-speech engine in a speech-based system are disclosed. A voice engine or other application of a device provides a mechanism to alter the adjustable operational parameters of the text-to-speech engine. In response to one or more environmental conditions, the adjustable operational parameters of the text-to-speech engine are modified to increase the intelligibility of synthesized speech.
    Type: Grant
    Filed: December 5, 2014
    Date of Patent: July 4, 2017
    Assignee: Vocollect, Inc.
    Inventors: James Hendrickson, Debra Drylie Stiffey, Duane Littleton, John Pecorari, Arkadiusz Slusarczyk
  • Patent number: 9697820
    Abstract: Systems and processes for performing unit-selection text-to-speech synthesis are provided. In one example process, a sequence of target units can represent a spoken pronunciation of text. A set of predicted acoustic model parameters of a second target unit can be determined using a set of acoustic features of a first candidate speech segment of a first target unit and a set of linguistic features of the second target unit. A likelihood score of the second candidate speech segment with respect to the first candidate speech segment can be determined using the set of predicted acoustic model parameters of the second target unit and a set of acoustic features of the second candidate speech segment of the second target unit. The second candidate speech segment can be selected for speech synthesis based on the determined likelihood score. Speech corresponding to the received text can be generated using the selected second candidate speech segment.
    Type: Grant
    Filed: December 7, 2015
    Date of Patent: July 4, 2017
    Assignee: Apple Inc.
    Inventor: Woojay Jeon
  • Patent number: 9697851
    Abstract: A note-taker terminal (200) and an information delivery device (100) are used. The information delivery device (100) includes a breathing detection unit (104) that specifies breathing sections from silent sections of uttered speech, a data processing unit (105) that determines, for every allocated time period of a note-taker, whether a breathing section exists in a range based on an end point of the allocated time period, and generates, if a breathing section exists, speech data of the utterance from a start point of the allocated time period until the breathing section, and, if a breathing section does not exist, speech data of the utterance from the start point until the end point of the allocated time period, and a data transmission unit (106) that transmits the speech data to the note-taker terminal (200). The note-taker terminal (200) receives the speech data, and transmits input text data to a user terminal (300) of a note-taking user.
    Type: Grant
    Filed: February 20, 2014
    Date of Patent: July 4, 2017
    Assignee: NEC Solution Innovators, Ltd.
    Inventor: Tomonari Nishimura
  • Patent number: 9639359
    Abstract: Embodiments are described for a method for compiling instruction code for execution in a processor having a number of functional units by determining a thermal constraint of the processor, and defining instruction words comprising both real instructions and one or more no operation (NOP) instructions to be executed by the functional units within a single clock cycle, wherein a number of NOP instructions executed over a number of consecutive clock cycles is configured to prevent exceeding the thermal constraint during execution of the instruction code.
    Type: Grant
    Filed: May 21, 2013
    Date of Patent: May 2, 2017
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Yuan Xie, Junli Gu
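The compile-time strategy in this abstract amounts to bounding the duty cycle of heat-producing instructions per window of clock cycles. A toy version pads each real instruction with enough NOPs to stay under a budget; the window model and parameters below are invented for illustration:

```python
# Toy thermal-aware NOP padding: keep the fraction of "active" slots in the
# instruction stream at or below max_duty by interleaving NOPs. The duty-cycle
# budget and the uniform padding scheme are illustrative assumptions.

def pad_with_nops(instructions, max_duty=0.5):
    """Emit (1/max_duty - 1) NOPs after each real instruction."""
    nops_per_instr = int(round(1.0 / max_duty)) - 1
    out = []
    for instr in instructions:
        out.append(instr)
        out.extend(["NOP"] * nops_per_instr)
    return out

schedule = pad_with_nops(["ADD", "MUL"], max_duty=0.5)
```

A real compiler would place the NOPs per functional unit and per cycle, but the constraint it enforces is the same ratio.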
  • Patent number: 9641481
    Abstract: The disclosure proposes a smart conversation method and an electronic device using the same method. According to one of the exemplary embodiments, an electronic device may receive via a receiver a first communication in a first communication type and determining a recipient status. The electronic device may determine a second communication type as an optimal communication type based on the recipient status. The electronic device may convert the first communication into a second communication that is suitable for the second communication type. The electronic device may transmit via a transmitter the second communication in the second communication type.
    Type: Grant
    Filed: January 30, 2015
    Date of Patent: May 2, 2017
    Assignee: HTC Corporation
    Inventor: Wen-Ping Ying
  • Patent number: 9628603
    Abstract: For voice mail transcription, a method is disclosed that includes detecting a communication device communicating an audio signal from the communication device to a voicemail system and transmitting data selected from the group consisting of text message data generated from the audio signal and voice training data to the voicemail system.
    Type: Grant
    Filed: July 23, 2014
    Date of Patent: April 18, 2017
    Assignee: Lenovo (Singapore) PTE. LTD.
    Inventors: Nathan J. Peterson, John Carl Mese, Russell Speight VanBlon, Rod D. Waltermann
  • Patent number: 9626955
    Abstract: Techniques for improved text-to-speech processing are disclosed. The improved text-to-speech processing can convert text from an electronic document into an audio output that includes speech associated with the text as well as audio contextual cues. One aspect provides audio contextual cues to the listener when outputting speech (spoken text) pertaining to a document. The audio contextual cues can be based on an analysis of a document prior to a text-to-speech conversion. Another aspect can produce an audio summary for a file. The audio summary for a document can thereafter be presented to a user so that the user can hear a summary of the document without having to process the document to produce its spoken text via text-to-speech conversion.
    Type: Grant
    Filed: April 4, 2016
    Date of Patent: April 18, 2017
    Assignee: Apple Inc.
    Inventors: Christopher Brian Fleizach, Reginald Dean Hudson
  • Patent number: 9570066
    Abstract: A method of speech synthesis including receiving a text input sent by a sender, processing the text input responsive to at least one distinguishing characteristic of the sender to produce synthesized speech that is representative of a voice of the sender, and communicating the synthesized speech to a recipient user of the system.
    Type: Grant
    Filed: July 16, 2012
    Date of Patent: February 14, 2017
    Assignee: General Motors LLC
    Inventors: Gaurav Talwar, Xufang Zhao, Ron M. Hecht
  • Patent number: 9570055
    Abstract: A method for converting textual messages to musical messages comprising receiving a text input and receiving a musical input selection. The method includes analyzing the text input to determine text characteristics and analyzing a musical input corresponding to the musical input selection to determine musical characteristics. Based on the text characteristics and the musical characteristics, the method includes correlating the text input with the musical input to generate a synthesizer input, and sending the synthesizer input to a voice synthesizer. The method includes receiving a vocal rendering of the text input from the voice synthesizer, generating a musical message from the vocal rendering and the musical input, and outputting the musical message.
    Type: Grant
    Filed: August 24, 2015
    Date of Patent: February 14, 2017
    Assignee: ZYA, INC.
    Inventors: Matthew Michael Serletic, II, Bo Bazylevsky, James Mitchell, Ricky Kovac, Patrick Woodward, Thomas Webb, Ryan Groves
  • Patent number: 9564121
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for unit selection synthesis. The method causes a computing device to add a supplemental phoneset to a speech synthesizer front end having an existing phoneset, modify a unit preselection process based on the supplemental phoneset, preselect units from the supplemental phoneset and the existing phoneset based on the modified unit preselection process, and generate speech based on the preselected units. The supplemental phoneset can be a variation of the existing phoneset, can include a word boundary feature, can include a cluster feature where initial consonant clusters and some word boundaries are marked with diacritics, can include a function word feature which marks units as originating from a function word or a content word, and/or can include a pre-vocalic or post-vocalic feature. The speech synthesizer front end can incorporate the supplemental phoneset as an extra feature.
    Type: Grant
    Filed: August 7, 2014
    Date of Patent: February 7, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Alistair D. Conkie, Mark Beutnagel, Yeon-Jun Kim, Ann K. Syrdal
  • Patent number: 9558201
    Abstract: Techniques are provided for de-duplication of data. In one embodiment, a system comprises de-duplication logic that is coupled to a de-duplication repository. The de-duplication logic is operable to receive, from a client device over a network, a request to store a file in the de-duplicated repository using a single storage encoding. The request includes a file identifier and a set of signatures that identify a set of chunks from the file. The de-duplication logic determines whether any chunks in the set are missing from the de-duplicated repository and requests the missing chunks from the client device. Then, for each missing chunk, the de-duplication logic stores in the de-duplicated repository that chunk and a signature representing that chunk. The de-duplication logic also stores, in the de-duplicated repository, a file entry that represents the file and that associates the set of signatures with the file identifier.
    Type: Grant
    Filed: January 7, 2014
    Date of Patent: January 31, 2017
    Assignee: VMware, Inc.
    Inventors: Israel Zvi Ben-Shaul, Leonid Vasetsky
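The client/server exchange this abstract describes can be sketched compactly: the client offers chunk signatures, the server replies with the ones it lacks, and only those chunks are uploaded along with a file entry mapping the identifier to its signature list. The hash choice and data layout below are assumptions:

```python
# Simplified de-duplicated store: chunks are addressed by SHA-256 signature,
# and a file entry records the ordered signatures that reconstruct the file.
import hashlib

repository = {}    # signature -> chunk bytes (the de-duplicated store)
file_entries = {}  # file identifier -> ordered list of signatures

def signature(chunk):
    return hashlib.sha256(chunk).hexdigest()

def missing_signatures(sigs):
    """Server side: report which offered signatures are not yet stored."""
    return {s for s in sigs if s not in repository}

def store_file(file_id, chunks):
    """Client side: send signatures first, then upload only missing chunks."""
    sigs = [signature(c) for c in chunks]
    missing = missing_signatures(sigs)
    uploaded = 0
    for s, c in zip(sigs, chunks):
        if s in missing:
            repository[s] = c
            missing.discard(s)  # each missing chunk is uploaded once
            uploaded += 1
    file_entries[file_id] = sigs
    return uploaded

first = store_file("a.txt", [b"hello", b"world"])
second = store_file("b.txt", [b"hello", b"again"])  # "hello" is deduplicated
```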
  • Patent number: 9542925
    Abstract: The invention relates to a method for generating sound for a rotating machine, including a step (E1) of determining the frequencies and amplitudes of n partials and/or harmonics (i) pertaining to the sound of a rotating machine, characterized in that the method includes a step (E2) of determining values (ai) and a step (E7-E8) of calculating a synthetic sound for the rotating machine, said synthetic sound being composed from the n partials and/or harmonics (i), while the frequency thereof is entirely or partially shifted by the values (ai).
    Type: Grant
    Filed: April 4, 2012
    Date of Patent: January 10, 2017
    Assignees: RENAULT SAS, GENESIS
    Inventors: Nathalie Le-Hir, Gaël Guyader, Patrick Boussard, Benoît Gauduin, Florent Jaillet
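The synthesis step this abstract describes is essentially additive synthesis: sum n sinusoidal partials, each shifted in frequency by its value ai. A minimal sketch follows; the sample rate, amplitudes, and shift values are illustrative assumptions:

```python
# Additive synthesis of a machine-like sound from n partials, each shifted
# in frequency by a per-partial offset a_i.
import math

def synthesize(partials, shifts, duration=0.01, rate=8000):
    """partials: list of (freq_hz, amplitude); shifts: per-partial offsets a_i.
    Returns duration*rate samples of the summed, shifted partials."""
    n = int(duration * rate)
    samples = []
    for k in range(n):
        t = k / rate
        s = sum(a * math.sin(2 * math.pi * (f + d) * t)
                for (f, a), d in zip(partials, shifts))
        samples.append(s)
    return samples

tone = synthesize([(440.0, 1.0), (880.0, 0.5)], shifts=[5.0, 10.0])
```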
  • Patent number: 9532156
    Abstract: A non-transitory computer readable storage medium contains instructions, executable by a processor, that identify a center component, a side component, and an ambient component within the right and left channels of a digital audio input signal. A spatial ratio is determined from the center component and side component. The digital audio input signal is adjusted based upon the spatial ratio to form a pre-processed signal. Recursive crosstalk cancellation processing is performed on the pre-processed signal to form a crosstalk-cancelled signal. The center component of the crosstalk-cancelled signal is realigned to create the final digital audio output.
    Type: Grant
    Filed: December 12, 2014
    Date of Patent: December 27, 2016
    Assignee: AMBIDIO, INC.
    Inventor: Tsai-Yi Wu
  • Patent number: 9483728
    Abstract: A method for training a deep neural network (DNN), comprises receiving and formatting speech data for the training, performing Hessian-free sequence training (HFST) on a first subset of a plurality of subsets of the speech data, and iteratively performing the HFST on successive subsets of the plurality of subsets of the speech data, wherein iteratively performing the HFST comprises reusing information from at least one previous iteration.
    Type: Grant
    Filed: October 30, 2014
    Date of Patent: November 1, 2016
    Assignee: International Business Machines Corporation
    Inventors: Pierre Dognin, Vaibhava Goel
  • Patent number: 9483953
    Abstract: A voice learning apparatus includes a learning-material voice storage unit that stores learning material voice data including example sentence voice data; a learning text storage unit that stores a learning material text including an example sentence text; a learning-material text display controller that displays the learning material text; a learning-material voice output controller that performs voice output based on the learning material voice data; an example sentence specifying unit that specifies the example sentence text during the voice output; an example-sentence voice output controller that performs voice output based on the example sentence voice data associated with the specified example sentence text; and a learning-material voice output restart unit that restarts the voice output from a position where the voice output is stopped last time, after the voice output is performed based on the example sentence voice data.
    Type: Grant
    Filed: August 7, 2012
    Date of Patent: November 1, 2016
    Assignee: CASIO COMPUTER CO., LTD.
    Inventor: Daisuke Nakajima
  • Patent number: 9471901
    Abstract: Mechanisms are provided for representing white space in a graphical representation of a data model. These mechanisms involve analyzing output data that is to be output to a user via an output device, to identify white spaces in the output data. White spaces comprise portions of a range of metrics of output data values where the output data does not have data objects representing those portions of the range of metrics of output data. For each identified white space, a white space data object is created. The white space data objects are provided to an application which performs an operation on the white space data objects to output the white space data objects in a manner that identifies the white space data objects differently from non-white space data objects in the output data.
    Type: Grant
    Filed: September 12, 2011
    Date of Patent: October 18, 2016
    Assignee: International Business Machines Corporation
    Inventors: Brian J. Cragun, Mary J. Mueller, James S. Taylor
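The white-space detection this abstract describes can be reduced to a small sketch: scan a range of a metric and emit a white-space object for every point no data object covers. Bucketing the range into integer points is an illustrative simplification, not the patent's method:

```python
# Toy white-space detection: given the metric values that data objects cover,
# emit a distinct object for each uncovered point in [lo, hi] so an
# application can render gaps differently from data.

def find_white_space(values, lo, hi):
    """Return white-space objects for the integer points with no data."""
    covered = set(values)
    return [{"white_space": v} for v in range(lo, hi + 1) if v not in covered]

gaps = find_white_space([1, 2, 5, 6], lo=1, hi=6)  # points 3 and 4 are empty
```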
  • Patent number: 9466285
    Abstract: A method of deriving speech synthesis parameters from an input speech audio signal, wherein the audio signal is segmented on the basis of estimated positions of glottal closure incidents and the resulting segments are processed to obtain the complex cepstrum used to derive a synthesis filter. A reconstructed speech signal is produced by passing a pulsed excitation signal derived from the position of the glottal closure incidents through the synthesis filter, and compared with the input speech audio signal. The pulse excitation signal and the complex cepstrum are then iteratively modified to minimize the difference between the reconstructed speech signal and the input speech audio signal, by optimizing the position of the pulses in the excitation signal to reduce the mean squared error between the reconstructed speech signal and the input speech audio signal, and recalculating the complex cepstrum using the optimized pulse positions.
    Type: Grant
    Filed: November 26, 2013
    Date of Patent: October 11, 2016
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Ranniery Maia
  • Patent number: 9460708
    Abstract: The described implementations relate to automated data cleanup. One system includes a language model generated from language model seed text and a dictionary of possible data substitutions. This system also includes a transducer configured to cleanse a corpus utilizing the language model and the dictionary. The transducer can process speech recognition data in some cases by substituting a second word for a first word which shares pronunciation with the first word but is spelled differently. In some cases, this can be accomplished by establishing corresponding probabilities of the first word and second word based on a third word that appears in sequence with the first word.
    Type: Grant
    Filed: September 17, 2009
    Date of Patent: October 4, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Geoffrey Zweig, Yun-Cheng Ju
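The substitution this abstract describes can be illustrated with a minimal language-model decision: when two words share a pronunciation, keep the spelling that the model scores higher given a neighboring word. The bigram counts and word lists below are made-up stand-ins for a trained model and dictionary:

```python
# Toy homophone cleanup: choose between same-pronunciation spellings by
# comparing bigram counts with the preceding word. Counts are invented.
bigram_counts = {("over", "there"): 40, ("over", "their"): 2}
homophones = {"their": ["their", "there"], "there": ["their", "there"]}

def cleanse(prev_word, word):
    """Substitute the homophone most likely to follow prev_word."""
    options = homophones.get(word, [word])
    return max(options, key=lambda w: bigram_counts.get((prev_word, w), 0))

fixed = cleanse("over", "their")  # "there" outscores "their" after "over"
```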
  • Patent number: 9454597
    Abstract: A document management and retrieval system is configured to: store, for each word in a set of words, the appearance positions of that word in a set of documents as a word index; store, for each tag in a set of tags attached to words, the set of words that appear to the right and left of that tag, and also store, as a tag LR index, the appearance positions of each tag in a set of documents, keyed by the combination of the tag and a word appearing to its right, or the combination of the tag and a word appearing to its left; and, in a tag search where a query phrase contains words and a tag next to each other, refer to the index keyed by the tag and the word to its right or left, thereby reducing the size of the document list to be read without needing a tag name as a secondary key. A tag is updated by updating just two places in the tag LR index.
    Type: Grant
    Filed: November 6, 2008
    Date of Patent: September 27, 2016
    Assignee: NEC Corporation
    Inventors: Yukitaka Kusumura, Toshiyuki Kamiya
  • Patent number: 9456273
    Abstract: An audio mixing method, apparatus, and system that can ensure sound quality after audio mixing and reduce consumption of computing resources. The method includes: receiving an audio stream of each site, and analyzing the audio stream of each site to obtain a sound characteristic value of a sound source object; selecting, according to a descending sequence of sound characteristic values of sound source objects, a predetermined number of sound source objects from the sound source objects to serve as main sound source objects; determining, according to a relationship between a target site and the sites where the main sound source objects are located, audio streams that require audio mixing for the target site; and performing audio mixing on the audio streams that require audio mixing for the target site and sending the audio streams after the audio mixing to the target site.
    Type: Grant
    Filed: March 26, 2014
    Date of Patent: September 27, 2016
    Assignee: Huawei Device Co., Ltd.
    Inventors: Dongqi Wang, Wuzhou Zhan
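The selection step in this abstract can be sketched as ranking sites by a sound characteristic value and mixing only the top N streams, never echoing a site's own audio back to it. The field names and the use of loudness as the characteristic value are assumptions:

```python
# Toy main-sound-source selection for conference audio mixing: for a given
# target site, pick the n sites with the highest characteristic value,
# excluding the target itself.

def streams_to_mix(sites, target, n=3):
    """Return up to n source site IDs to mix for the target site."""
    candidates = [s for s in sites if s["site"] != target]
    ranked = sorted(candidates, key=lambda s: s["level"], reverse=True)
    return [s["site"] for s in ranked[:n]]

sites = [
    {"site": "A", "level": 0.9},
    {"site": "B", "level": 0.2},
    {"site": "C", "level": 0.7},
    {"site": "D", "level": 0.5},
]
mix_for_b = streams_to_mix(sites, target="B", n=2)  # the two loudest others
```

Mixing only a few dominant sources is what keeps the computing cost bounded regardless of the number of sites.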
  • Patent number: 9454958
    Abstract: Technologies pertaining to training a deep neural network (DNN) for use in a recognition system are described herein. The DNN is trained using heterogeneous data, the heterogeneous data including narrowband signals and wideband signals. The DNN, subsequent to being trained, receives an input signal that can be either a wideband signal or narrowband signal. The DNN estimates the class posterior probability of the input signal regardless of whether the input signal is the wideband signal or the narrowband signal.
    Type: Grant
    Filed: March 7, 2013
    Date of Patent: September 27, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jinyu Li, Dong Yu, Yifan Gong
  • Patent number: 9449523
    Abstract: A narration session between a plurality of participants can be set up to allow participants to collaboratively narrate an electronic book. Information can be transmitted to each participant so that the views of the participants remain in sync. Visual cues can also be transmitted to notify a participant of text that is to be read aloud, and audio snippets of the read text are collected to form a narration file. Participants without access rights to the electronic book can be granted temporary rights.
    Type: Grant
    Filed: June 27, 2012
    Date of Patent: September 20, 2016
    Assignee: Apple Inc.
    Inventors: Casey Maureen Dougherty, Gregory Robbin, Melissa Breglio Hajj
  • Patent number: 9418642
    Abstract: Systems, including methods and apparatus, for generating audio effects based on accompaniment audio produced by live or pre-recorded accompaniment instruments, in combination with melody audio produced by a singer. Audible broadcast of the accompaniment audio may be delayed by a predetermined time, such as the time required to determine chord information contained in the accompaniment signal. As a result, audio effects that require the chord information may be substantially synchronized with the audible broadcast of the accompaniment audio. The present teachings may be especially suitable for use in karaoke systems, to correct and add sound effects to a singer's voice that sings along with a pre-recorded accompaniment track.
    Type: Grant
    Filed: July 31, 2015
    Date of Patent: August 16, 2016
    Assignee: Sing Trix LLC
    Inventors: David Kenneth Hilderman, John Devecka
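The timing trick in this abstract is a fixed delay line: hold the accompaniment long enough to finish chord detection, so effects derived from the chords land in sync with what the listener hears. A schematic delay buffer is below; the delay length stands in for an assumed analysis latency:

```python
# Schematic delay line: each pushed sample emerges delay_samples pushes later,
# giving the chord analyzer that much lead time over the audible broadcast.
from collections import deque

class DelayLine:
    def __init__(self, delay_samples):
        self.buf = deque([0.0] * delay_samples)

    def push(self, sample):
        """Feed one input sample; return the sample delayed by the buffer."""
        self.buf.append(sample)
        return self.buf.popleft()

line = DelayLine(delay_samples=3)
out = [line.push(s) for s in [1.0, 2.0, 3.0, 4.0]]
# the first three outputs are the buffer's initial zeros
```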
  • Patent number: 9412358
    Abstract: Systems, methods, and computer-readable storage devices for generating speech using a presentation style specific to a user, and in particular the user's social group. Systems configured according to this disclosure can then use the resulting, personalized, text and/or speech in a spoken dialogue or presentation system to communicate with the user. For example, a system practicing the disclosed method can receive speech from a user, identify the user, and respond to the received speech by applying a personalized natural language generation model. The personalized natural language generation model provides communications which can be specific to the identified user.
    Type: Grant
    Filed: May 13, 2014
    Date of Patent: August 9, 2016
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Taniya Mishra, Alistair D. Conkie, Svetlana Stoyanchev
  • Patent number: 9405742
    Abstract: A method for phonetizing a data list having text-containing list entries, each list entry in the data list being subdivided into at least two data fields for provision to a voice-controlled user interface, includes: converting a list entry from a text representation into phonetics; storing the phonetics as phonemes in a phonetized data list; inserting a separating character into the text of a list entry between the respective data fields of the list entry, concomitantly converting the inserted separating character into phonetics and concomitantly storing the converted separating character as a phoneme symbol; and storing the phonemes in a phonetic database, the phonetized data list being produced from the phonemes stored in the phonetic database.
    Type: Grant
    Filed: February 11, 2013
    Date of Patent: August 2, 2016
    Assignee: Continental Automotive GmbH
    Inventor: Jens Walther
  • Patent number: 9392390
    Abstract: A method of applying a combined control strategy for the reproduction of multichannel audio signals in two or more sound zones, the method comprising deriving a first cost function for controlling the acoustic potential energy, such as on the basis of the Acoustic Contrast Control method and/or the Energy Difference Maximation method, in the zones to obtain acoustic separation between the zones in terms of sound pressure, deriving a second cost function, such as the Pressure Matching method, controlling the phase of the sound provided in the zones, and where a weight is obtained for determining a combination of the first and second cost functions in a combined optimization.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: July 12, 2016
    Assignee: Bang & Olufsen A/S
    Inventors: Martin Olsen, Martin Bo Møller
  • Patent number: 9390728
    Abstract: A speech analysis apparatus is provided. An F0 extraction part extracts a pitch value from speech information. A spectrum extraction part extracts spectrum information from the speech information. An MVF extraction part extracts a maximum voiced frequency, allowing boundary information for respectively filtering a harmonic component and a non-harmonic component to be obtained. According to the speech analysis apparatus, speech synthesis apparatus, and speech analysis synthesis system of the present invention, speech that is closer to the original voice and is more natural may be synthesized. Also, speech may be represented with less data capacity.
    Type: Grant
    Filed: March 27, 2013
    Date of Patent: July 12, 2016
    Assignee: GWANGJU INSTITUTE OF SCIENCE AND TECHNOLOGY
    Inventors: Hong-Kook Kim, Kwang-Myung Jeon
  • Patent number: 9342506
    Abstract: Methods, systems and program product are disclosed for determining matching level of a text lookup segment with a plurality of source texts in a translation memory in terms of context. The invention determines exact matches for the lookup segment in the plurality of source texts, and determines, in the case that at least one exact match is determined, that a respective exact match is an in-context exact match for the lookup segment in the case that a context of the lookup segment matches that of the respective exact match. Degree of context matching required can be predetermined, and results prioritized. The invention also includes methods, systems and program products for storing a translation pair of source text and target text in a translation memory including context, and the translation memory so formed. The invention ensures that content is translated the same as previously translated content and reduces translator intervention.
    Type: Grant
    Filed: October 20, 2014
    Date of Patent: May 17, 2016
    Assignee: SDL Inc.
    Inventors: Russell G. Ross, Kevin Gillespie
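    The in-context matching idea can be sketched with a toy translation memory. The entry fields (`prev`, `next`) and the matching rule below are assumptions made for illustration, not SDL's actual data model.

    ```python
    # Toy translation memory: each entry stores its source segment plus the
    # segments before and after it as "context".
    tm = [
        {"source": "Press OK.", "target": "Appuyez sur OK.",
         "prev": "Open the menu.", "next": "The dialog closes."},
        {"source": "Press OK.", "target": "Cliquez sur OK.",
         "prev": "A warning appears.", "next": "Saving starts."},
    ]

    def lookup(segment, prev_segment, next_segment):
        # First find plain exact matches on the source text.
        exact = [e for e in tm if e["source"] == segment]
        # Then keep only those whose surrounding segments also match.
        in_context = [e for e in exact
                      if e["prev"] == prev_segment and e["next"] == next_segment]
        # Prioritise in-context exact matches over plain exact matches.
        return in_context or exact

    hits = lookup("Press OK.", "A warning appears.", "Saving starts.")
    ```

    With context supplied, the ambiguous segment resolves to the single entry whose neighbours also match.
    
    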
  • Patent number: 9336782
    Abstract: Voice data may be collected by a plurality of voice donors and stored in a voice bank. A voice donor may authenticate to a voice collection system to start a session to provide voice data. During the voice collection session, the voice donor may be presented with a sequence of prompts to speak and voice data may be transferred to a server. The received voice data may be processed to determine the speech units spoken by the voice donor and a count of speech units received from the voice donor may be updated. Feedback may be provided to the voice donor indicating, for example, a progress of the voice collection, a quality level of the voice data, or information about speech unit counts. The voice bank may be used to create TTS voices for voice recipients, create a model of voice aging, or for other applications.
    Type: Grant
    Filed: June 29, 2015
    Date of Patent: May 10, 2016
    Assignee: VOCALID, INC.
    Inventor: Rupal Patel
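    The speech-unit bookkeeping and progress feedback described above can be sketched as follows; the unit inventory and target counts are invented for the example, and a real voice bank would likely track richer units (e.g. diphones) and per-unit quality.

    ```python
    from collections import Counter

    # Hypothetical target inventory: how many instances of each unit we want.
    TARGET = {"AA": 20, "IY": 20, "S": 20, "T": 20}

    def update_counts(counts, units_in_recording):
        """Add the units recognized in one recording and return overall
        collection progress, capped per unit at its target count."""
        counts.update(units_in_recording)
        covered = sum(min(counts[u], need) for u, need in TARGET.items())
        total = sum(TARGET.values())
        return covered / total  # progress feedback shown to the donor

    counts = Counter()
    progress = update_counts(counts, ["AA", "AA", "S", "T", "IY"] * 4)
    ```
    
    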
  • Patent number: 9306687
    Abstract: Methods and systems according to the disclosure obtain information for a music track on a radio broadcast and may include storing track information received from an information service in a data storage module. When an intermediary server receives a request from an end-user device for information for a particular music track playing on a particular radio broadcast signal, the server first checks the data storage module to determine if it has music track information for the particular music track. If so, the server provides that music track information to the end-user device. The intermediary server may be configured to automatically and preemptively request information from the music information service each time a new music track is played on a radio broadcast signal. The intermediary server may be configured to request information from the music information service the first time it receives a request for that particular music track.
    Type: Grant
    Filed: September 27, 2013
    Date of Patent: April 5, 2016
    Assignee: Imagination Technologies Limited
    Inventors: Nicholas H. Jurascheck, Ian Robert Knowles
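    The cache-first flow of the intermediary server might look like this in miniature; `fetch_from_service` is a stand-in for the external music information service and is invented for this example.

    ```python
    class TrackInfoServer:
        """Intermediary that answers from local storage when it can and
        falls through to the information service only on a cache miss."""

        def __init__(self, fetch_from_service):
            self.cache = {}
            self.fetch = fetch_from_service

        def get_info(self, station, track_id):
            key = (station, track_id)
            if key not in self.cache:          # first request: go to the service
                self.cache[key] = self.fetch(station, track_id)
            return self.cache[key]             # later requests: served locally

    calls = []
    def fake_service(station, track_id):
        calls.append(track_id)
        return {"title": f"Track {track_id}"}

    server = TrackInfoServer(fake_service)
    first = server.get_info("FM1", 42)
    second = server.get_info("FM1", 42)
    ```

    The second request for the same track never reaches the service, which is the point of the intermediary.
    
    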
  • Patent number: 9305543
    Abstract: Techniques for improved text-to-speech processing are disclosed. The improved text-to-speech processing can convert text from an electronic document into an audio output that includes speech associated with the text as well as audio contextual cues. One aspect provides audio contextual cues to the listener when outputting speech (spoken text) pertaining to a document. The audio contextual cues can be based on an analysis of a document prior to a text-to-speech conversion. Another aspect can produce an audio summary for a file. The audio summary for a document can thereafter be presented to a user so that the user can hear a summary of the document without having to process the document to produce its spoken text via text-to-speech conversion.
    Type: Grant
    Filed: February 25, 2015
    Date of Patent: April 5, 2016
    Assignee: Apple Inc.
    Inventors: Christopher Brian Fleizach, Reginald Dean Hudson
  • Patent number: 9293150
    Abstract: A portion of an audio signal is identified corresponding to a spoken word and its phonemes. A set of alternate spoken words satisfying phonetic similarity criteria to the spoken word is generated. A subset of the set of alternate spoken words is also identified; each member of the subset shares the same phoneme in a similar temporal position as the spoken word. A significance factor is then calculated for the phoneme based on the number of alternates in the subset and on the total number of alternates. The calculated significance factor may then be used to lengthen or shorten the temporal duration of the phoneme in the audio signal according to its significance in the spoken word.
    Type: Grant
    Filed: September 12, 2013
    Date of Patent: March 22, 2016
    Assignee: International Business Machines Corporation
    Inventors: Flemming Boegelund, Lav R. Varshney
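    The abstract does not give the significance formula. As one plausible reading, a phoneme shared by fewer of the phonetically similar alternates is more discriminative, and more significant phonemes are lengthened; both the formula and the stretch rule below are illustrative assumptions.

    ```python
    def significance(n_sharing, n_alternates):
        """Assumed rule: significance falls as more alternate words share
        the phoneme in a similar temporal position."""
        if n_alternates == 0:
            return 1.0  # no confusable alternates: fully significant
        return 1.0 - n_sharing / n_alternates

    def adjust_duration(duration_ms, sig, max_stretch=0.3):
        # Lengthen significant phonemes, shorten insignificant ones.
        return duration_ms * (1.0 + max_stretch * (2.0 * sig - 1.0))

    # "cat" vs. alternates {"bat", "hat", "cap"}: /k/ is shared by 0 of 3,
    # so it is maximally significant and gets stretched.
    sig_k = significance(n_sharing=0, n_alternates=3)
    dur_k = adjust_duration(80.0, sig_k)
    ```
    
    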
  • Patent number: 9286913
    Abstract: Disclosed is an information display system provided with: a signal analyzing unit which analyzes the audio signals obtained from a predetermined location and which generates ambient sound information regarding the sound generated at the predetermined location; and an ambient expression selection unit which, on the basis of the ambient sound information, selects an ambient expression conveying what a person would feel from the sound generated at the predetermined location.
    Type: Grant
    Filed: March 28, 2011
    Date of Patent: March 15, 2016
    Assignee: NEC CORPORATION
    Inventors: Toshiyuki Nomura, Yuzo Senda, Kyota Higa, Takayuki Arakawa, Yasuyuki Mitsui
  • Patent number: 9269347
    Abstract: A text-to-speech method configured to output speech having a selected speaker voice and a selected speaker attribute, including: inputting text; dividing the inputted text into a sequence of acoustic units; selecting a speaker for the inputted text; selecting a speaker attribute for the inputted text; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model; and outputting the sequence of speech vectors as audio with the selected speaker voice and a selected speaker attribute. The acoustic model includes a first set of parameters relating to speaker voice and a second set of parameters relating to speaker attributes, which parameters do not overlap. The selecting a speaker voice includes selecting parameters from the first set of parameters and the selecting the speaker attribute includes selecting the parameters from the second set of parameters.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: February 23, 2016
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Kean Kheong Chin, Mark John Francis Gales, Katherine Mary Knill, Masami Akamine
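    The disjoint parameter sets can be illustrated with a toy model in which speaker parameters and attribute parameters live in separate tables and are combined at synthesis time, so voice and attribute can be chosen independently. The parameter names and values below are made up.

    ```python
    # Non-overlapping parameter sets: one keyed by speaker, one by attribute.
    SPEAKER_PARAMS = {"alice": {"f0_mean": 210.0}, "bob": {"f0_mean": 120.0}}
    ATTRIBUTE_PARAMS = {"neutral": {"f0_scale": 1.0}, "excited": {"f0_scale": 1.2}}

    def select(speaker, attribute):
        """Combine a speaker's parameters with an attribute's parameters;
        the two dictionaries share no keys, so neither overwrites the other."""
        params = dict(SPEAKER_PARAMS[speaker])
        params.update(ATTRIBUTE_PARAMS[attribute])
        return params

    def frame_f0(params):
        return params["f0_mean"] * params["f0_scale"]

    # The same "excited" attribute applied to two different voices.
    excited_alice = frame_f0(select("alice", "excited"))
    excited_bob = frame_f0(select("bob", "excited"))
    ```
    
    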
  • Patent number: 9261952
    Abstract: The shifting and recharging of an emotional state with word sequencing is disclosed. A selection of a first word sequence set is received from the user. The word sequence set is defined by a mood recharging characteristic value, and includes a plurality of words each with at least one corresponding definition. A first one of the plurality of words in the selected first word sequence set is displayed. Then, a first one of the at least one corresponding definition of the first one of the plurality of words in the first word sequence set is displayed while the first one of the plurality of words remains displayed. The definition remains displayed for a time duration corresponding to a predefined cadence rate value. The user is prompted with a question related to the mood recharging characteristic value and associated with the first word sequence set.
    Type: Grant
    Filed: February 5, 2013
    Date of Patent: February 16, 2016
    Assignee: Spectrum Alliance, LLC
    Inventors: Pamela Gail Greene, David L. Greene, Mary Anne Thomas
  • Patent number: 9240178
    Abstract: A text-to-speech (TTS) system is configured with multiple voice corpuses used to synthesize speech. An incoming TTS request may be processed with a first, smaller voice corpus to quickly return results to the user. The text of the request may be stored by the TTS system and then processed in the background using a second, larger voice corpus. The second corpus takes longer to process but returns higher quality results. Future incoming TTS requests may be compared against the text of the first TTS request. If the text, or portions thereof, match, the system may return the stored results from the processing by the second corpus, thus returning high quality speech results in a shorter time.
    Type: Grant
    Filed: June 26, 2014
    Date of Patent: January 19, 2016
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Adam Franciszek Nadolski, Michal Krzysztof Kiedrowicz
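    A minimal version of the two-corpus flow, with stand-in synthesizers for the small and large corpuses; the background processing the patent describes is done inline here for simplicity.

    ```python
    class TwoCorpusTTS:
        def __init__(self, synth_fast, synth_quality):
            self.synth_fast = synth_fast        # small corpus: quick, lower quality
            self.synth_quality = synth_quality  # large corpus: slow, higher quality
            self.high_quality = {}              # text -> stored second-pass result

        def request(self, text):
            if text in self.high_quality:       # repeat request: reuse stored result
                return self.high_quality[text]
            audio = self.synth_fast(text)       # first answer comes back quickly
            # The patent runs this in the background; inline here for brevity.
            self.high_quality[text] = self.synth_quality(text)
            return audio

    tts = TwoCorpusTTS(lambda t: ("fast", t), lambda t: ("hq", t))
    first = tts.request("hello")
    second = tts.request("hello")
    ```

    The first request returns the fast result; the repeated request gets the stored high-quality result.
    
    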
  • Patent number: 9230536
    Abstract: A candidate voice segment sequence generator 1 generates candidate voice segment sequences 102 for an input language information sequence 101 by using DB voice segments 105 in a voice segment database 4. An output voice segment sequence determinator 2 calculates a degree of match between the input language information sequence 101 and each of the candidate voice segment sequences 102 by using a parameter 107 showing a value according to a cooccurrence criterion 106 for cooccurrence between the input language information sequence 101 and a sound parameter showing the attribute of each of a plurality of candidate voice segments in each of the candidate voice segment sequences 102, and determines an output voice segment sequence 103 on the basis of the degree of match.
    Type: Grant
    Filed: February 21, 2014
    Date of Patent: January 5, 2016
    Assignee: Mitsubishi Electric Corporation
    Inventors: Takahiro Otsuka, Keigo Kawashima, Satoru Furuta, Tadashi Yamaura
  • Patent number: 9195656
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for multilingual prosody generation. In some implementations, data indicating a set of linguistic features corresponding to a text is obtained. Data indicating the linguistic features and data indicating the language of the text are provided as input to a neural network that has been trained to provide output indicating prosody information for multiple languages. The neural network can be a neural network having been trained using speech in multiple languages. Output indicating prosody information for the linguistic features is received from the neural network. Audio data representing the text is generated using the output of the neural network.
    Type: Grant
    Filed: December 30, 2013
    Date of Patent: November 24, 2015
    Assignee: Google Inc.
    Inventors: Javier Gonzalvo Fructuoso, Andrew W. Senior, Byungha Chun
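    Feeding linguistic features together with a language code into one network can be sketched with an untrained toy model; the feature vector, layer sizes, language list, and two-value output are placeholders, not the patent's architecture.

    ```python
    import numpy as np

    LANGUAGES = ["en", "es", "ja"]

    def prosody_net(features, language, w_hidden, w_out):
        """One shared network: linguistic features concatenated with a
        one-hot language code, as the abstract describes feeding both."""
        lang_onehot = np.eye(len(LANGUAGES))[LANGUAGES.index(language)]
        x = np.concatenate([features, lang_onehot])
        hidden = np.tanh(w_hidden @ x)
        return w_out @ hidden  # e.g. a (duration, f0) prediction per unit

    rng = np.random.default_rng(0)
    w_hidden = rng.normal(size=(8, 4 + len(LANGUAGES)))  # untrained weights
    w_out = rng.normal(size=(2, 8))
    features = np.array([1.0, 0.0, 0.5, 0.2])  # made-up linguistic features
    out_en = prosody_net(features, "en", w_hidden, w_out)
    out_es = prosody_net(features, "es", w_hidden, w_out)
    ```

    The same features produce different prosody predictions per language, because the language code is part of the network input.
    
    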
  • Patent number: 9190051
    Abstract: A Chinese speech recognition system and method are disclosed. Firstly, a speech signal is received and recognized to output a word lattice. Next, the word lattice is received, and word arcs of the word lattice are rescored and reranked with a prosodic break model, a prosodic state model, a syllable prosodic-acoustic model, a syllable-juncture prosodic-acoustic model and a factored language model, so as to output a language tag, a prosodic tag and a phonetic segmentation tag, which correspond to the speech signal. The present invention performs rescoring in a two-stage way to promote the recognition rate of basic speech information and labels the language tag, prosodic tag and phonetic segmentation tag to provide the prosodic structure and language information for the rear-stage voice conversion and voice synthesis.
    Type: Grant
    Filed: April 13, 2012
    Date of Patent: November 17, 2015
    Assignee: NATIONAL CHIAO TUNG UNIVERSITY
    Inventors: Jyh-Her Yang, Chen-Yu Chiang, Ming-Chieh Liu, Yih-Ru Wang, Yuan-Fu Liao, Sin-Horng Chen
  • Patent number: 9177560
    Abstract: A system and method may be configured to reconstruct an audio signal from transformed audio information. The audio signal may be resynthesized based on individual harmonics and corresponding pitches determined from the transformed audio information. Noise may be subtracted from the transformed audio information by interpolating across peak points and across trough points of harmonic pitch paths through the transformed audio information, and subtracting values associated with the trough point interpolations from values associated with the peak point interpolations. Noise between harmonics of the sound may be suppressed in the transformed audio information by centering functions at individual harmonics in the transformed audio information, the functions serving to suppress noise between the harmonics.
    Type: Grant
    Filed: December 22, 2014
    Date of Patent: November 3, 2015
    Assignee: The Intellisis Corporation
    Inventors: David C. Bradley, Daniel S. Goldin, Robert N. Hilton, Nicholas K. Fisher, Rodney Gateau
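    The peak/trough interpolation step can be sketched numerically: interpolate across the harmonic peaks and across the troughs along a pitch path, then subtract the trough (noise-floor) curve from the peak curve. The toy magnitude track and index positions below are invented for the example.

    ```python
    import numpy as np

    def denoised_envelope(magnitudes, peak_idx, trough_idx):
        """Interpolate across peak points and across trough points of one
        pitch path, then subtract the trough interpolation from the peak
        interpolation, as the abstract describes."""
        frames = np.arange(len(magnitudes))
        peaks = np.interp(frames, peak_idx, magnitudes[peak_idx])
        troughs = np.interp(frames, trough_idx, magnitudes[trough_idx])
        return np.maximum(peaks - troughs, 0.0)

    # Toy magnitude track: harmonic peaks at frames 0, 4, 8 on a noise floor.
    mags = np.array([5.0, 1.0, 1.2, 0.9, 6.0, 1.1, 1.0, 0.8, 5.5])
    env = denoised_envelope(mags, peak_idx=np.array([0, 4, 8]),
                            trough_idx=np.array([1, 3, 5, 7]))
    ```
    
    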
  • Patent number: 9135909
    Abstract: A speech synthesis information editing apparatus is provided. The speech synthesis information editing apparatus includes a phoneme storage unit that stores phoneme information, which designates a duration of each phoneme of speech to be synthesized. The speech synthesis information editing apparatus also includes a feature storage unit that stores feature information, which designates a time variation in a feature of the speech. In addition, the speech synthesis information editing apparatus includes an edition processing unit that changes a duration of each phoneme designated by the phoneme information with an expansion/compression degree, based on a feature designated by the feature information in correspondence to the phoneme.
    Type: Grant
    Filed: December 1, 2011
    Date of Patent: September 15, 2015
    Assignee: Yamaha Corporation
    Inventor: Tatsuya Iriyama
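    One possible reading of feature-weighted expansion: when the user stretches a region, phonemes carrying a higher feature value stretch more. The rule, field names, and values below are invented for illustration; the patent does not specify them.

    ```python
    # Each phoneme carries a duration (ms) and a feature value in [0, 1].
    phonemes = [{"ph": "s", "dur": 90, "feat": 0.2},
                {"ph": "a", "dur": 120, "feat": 0.9}]

    def stretch(phonemes, degree):
        """Apply an overall expansion degree, weighted per phoneme by its
        feature value, so high-feature phonemes absorb more of the stretch."""
        out = []
        for p in phonemes:
            factor = 1.0 + (degree - 1.0) * p["feat"]
            out.append({**p, "dur": round(p["dur"] * factor)})
        return out

    stretched = stretch(phonemes, degree=1.5)
    ```
    
    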
  • Patent number: 9123319
    Abstract: Systems, including methods and apparatus, for generating audio effects based on accompaniment audio produced by live or pre-recorded accompaniment instruments, in combination with melody audio produced by a singer. Audible broadcast of the accompaniment audio may be delayed by a predetermined time, such as the time required to determine chord information contained in the accompaniment signal. As a result, audio effects that require the chord information may be substantially synchronized with the audible broadcast of the accompaniment audio. The present teachings may be especially suitable for use in karaoke systems, to correct and add sound effects to a singer's voice that sings along with a pre-recorded accompaniment track.
    Type: Grant
    Filed: August 25, 2014
    Date of Patent: September 1, 2015
    Assignee: Sing Trix LLC
    Inventors: David Kenneth Hilderman, John Devecka
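    The fixed broadcast delay can be modeled as a simple delay line: accompaniment samples come out a fixed number of frames late, giving the chord detector time to catch up before the audio is heard. The frame counts here are arbitrary.

    ```python
    from collections import deque

    class DelayLine:
        """Fixed-delay buffer: each sample emerges `delay_frames` calls later."""

        def __init__(self, delay_frames):
            self.buf = deque([0.0] * delay_frames)  # primed with silence

        def process(self, sample):
            self.buf.append(sample)
            return self.buf.popleft()

    line = DelayLine(delay_frames=3)
    out = [line.process(s) for s in [1.0, 2.0, 3.0, 4.0, 5.0]]
    ```

    The first three outputs are the priming silence; the input then reappears three frames late.
    
    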
  • Patent number: 9058061
    Abstract: A touch panel 1 is arranged on the front face of a key display unit 2 and accepts input to a numeric keypad displayed on the key display unit 2. When a key of the numeric keypad is pressed and held, a control unit 4 measures the hold time with a timer 5 and cycles through the characters assigned to the key, treating the key as pressed once more each time a predetermined period elapses while it remains held, and displays each character on a display unit 3. The control unit 4 also notifies a vibration unit 6 of the timing of each character update, and the vibration unit 6 vibrates the touch panel 1 at that timing. A mobile terminal is thus provided that enables a user to input a desired character without watching the terminal.
    Type: Grant
    Filed: August 8, 2008
    Date of Patent: June 16, 2015
    Assignee: Kyocera Corporation
    Inventors: Tomotake Aono, Tetsuya Takenaka
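    The long-press cycling can be sketched as a modular index into the key's character set: each elapsed step advances one character and wraps around. The key layout and step time are assumptions for the example.

    ```python
    # Assumed multi-tap layout and step interval.
    KEYPAD = {"2": "abc", "3": "def"}

    def char_for_hold(key, hold_ms, step_ms=500):
        """Return the character currently shown for a key held `hold_ms`
        milliseconds: one character at press time, then one advance per
        elapsed step, wrapping around the key's character set."""
        chars = KEYPAD[key]
        return chars[(hold_ms // step_ms) % len(chars)]

    c0 = char_for_hold("2", 0)      # at the moment of the press
    c1 = char_for_hold("2", 600)    # after one step has elapsed
    c2 = char_for_hold("2", 1700)   # three steps elapsed: wraps around
    ```
    
    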
  • Patent number: 9037466
    Abstract: Methods, systems, and computer program products are provided for email administration for rendering email on a digital audio player. Embodiments include retrieving an email message; extracting text from the email message; creating a media file; and storing the extracted text of the email message as metadata associated with the media file. Embodiments may also include storing the media file on a digital audio player and displaying the metadata describing the media file, the metadata containing the extracted text of the email message.
    Type: Grant
    Filed: March 9, 2006
    Date of Patent: May 19, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: William K. Bodin, David Jaramillo, Jerry W. Redman, Derral C. Thorson
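    A toy version of the extract-and-store flow, using the standard library's `email` module to pull the text out of a message and attach it as metadata for a media file; the media file name and metadata fields are placeholders.

    ```python
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["Subject"] = "Meeting moved"
    msg.set_content("The 3pm meeting is now at 4pm.")

    def email_to_media_entry(message):
        """Extract the message text and store it as metadata associated
        with a (placeholder) media file, as in the abstract's flow."""
        text = message.get_content().strip()
        return {"media_file": "message.mp3",   # placeholder name
                "metadata": {"subject": message["Subject"], "body": text}}

    entry = email_to_media_entry(msg)
    ```

    A digital audio player could then display the stored metadata alongside the media file.
    
    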