Patents by Inventor Michal Tadeusz Kaszczuk

Michal Tadeusz Kaszczuk has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9978359
    Abstract: A text-to-speech (TTS) processing system may be configured for iterative processing. Speech units for unit selection may be tagged according to extra segmental features, such as emotional features, dramatic features, etc. Preliminary TTS results based on input text may be provided to a user through a user interface. The user may offer corrections to the preliminary results. Those corrections may correspond to the extra segmental features. The user corrections may then be input into the TTS system along with the input text to provide refined TTS results. This process may be repeated iteratively to obtain desired TTS results.
    Type: Grant
    Filed: December 6, 2013
    Date of Patent: May 22, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Michal Tadeusz Kaszczuk, Jeffrey Penrod Adams, Adam Franciszek Nadolski
  • Patent number: 9704476
    Abstract: In a distributed text-to-speech (TTS) system, a remote TTS device, such as a TTS server, may experience increased loads of TTS requests, which may result in delayed processing of TTS requests. To avoid such delays, upon indication or prediction of an increased load, a TTS server may adjust unit selection TTS processing by altering unit selection techniques to speed processing, at the expense of potential result quality. Such techniques may include use of a reduced size unit database, a narrow Viterbi beam search, and/or a reduced size candidate unit graph.
    Type: Grant
    Filed: June 27, 2013
    Date of Patent: July 11, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Krzysztof Franciszek Swietlinski, Michal Tadeusz Kaszczuk
  • Patent number: 9646601
    Abstract: In delivering text-to-speech (TTS) results to a user, the time between the user request and delivery of initial TTS results is reduced using one or more of various techniques. Caching of TTS results may be reconfigured to cache unit indices rather than full speech synthesis results. More powerful computing resources may be dedicated to early TTS processing. A user may be notified of TTS results prior to complete processing of a TTS request. Early TTS processing may be performed by a local device and then passed to a remote device.
    Type: Grant
    Filed: July 26, 2013
    Date of Patent: May 9, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Jacek Jerzy Jedrzejczak, Krzysztof Franciszek Swietlinski, Michal Tadeusz Kaszczuk, Lukasz Maciej Osowski
  • Patent number: 9508338
    Abstract: A text-to-speech (TTS) system may be configured to incorporate breath sounds in the output speech. By incorporating breath sounds into speech output from text, a TTS system may be able to mimic more natural-sounding human speech, particularly for long-form narration of text longer than short phrases. The breath sounds may be stored as units for unit selection or may be generated during parametric synthesis. The acoustic features of the breath sounds and duration between breaths may depend upon the punctuation of text, the linguistic distance between breaths, the breaks between intonational phrases, the linguistic context of the breaths, and other factors.
    Type: Grant
    Filed: November 15, 2013
    Date of Patent: November 29, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Michal Tadeusz Kaszczuk, Maciej Tegi, Michal Czuczman, Remus Razvan Mois
  • Patent number: 9484014
    Abstract: In a text-to-speech (TTS) system, a database including sample speech units for unit selection may include both units represented by sample audio segments and parametric representations of units created by Hidden Markov Models (HMMs). Inclusion of parametric representations in the database may reduce the storage necessary to maintain the database. The parametric representations may be configured to match a voice of the audio segments. The parametric representations may correspond to phonetic units that are less frequently encountered in TTS processing, such as rare diphones or phonemes corresponding to foreign languages. Multiple foreign language HMM models may be used to enable polyglot synthesis with a reduction in storage capacity requirements. Parametrically stored speech units may be combined with speech segments generated during processing time by a parametric model.
    Type: Grant
    Filed: February 20, 2013
    Date of Patent: November 1, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Michal Tadeusz Kaszczuk, Lukasz Maciej Osowski
  • Patent number: 9311912
    Abstract: Text-to-speech (TTS) processing systems may be divided among remote TTS servers which are accessible through a network connection to local user devices. The costs for performing processing on these servers may vary according to time. To improve efficiency of TTS processing, certain requests may be scheduled during low cost server times. A user may indicate a preference for such low cost delivery. A user may also indicate a preference for quick turnaround time, permitting scheduling of TTS processing during higher cost server times. A TTS processing system may also consider quality of TTS results when scheduling server processing time for a particular TTS request and may allocate more server time when higher quality results are desired.
    Type: Grant
    Filed: July 22, 2013
    Date of Patent: April 12, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Krzysztof Franciszek Swietlinski, Michal Tadeusz Kaszczuk
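
Illustrative note on patent 9311912: the abstract describes scheduling TTS requests around time-varying server costs and a user's stated preference for either low cost or quick turnaround. The sketch below is only one possible reading of that idea, not the method claimed in the patent. The names Slot, pick_slot, start_hour, cost, deadline_hour and prefer_low_cost are all hypothetical, as are the example prices.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Slot:
    """A hypothetical server time slot with a per-request cost."""
    start_hour: int
    cost: float


def pick_slot(
    slots: Sequence[Slot], deadline_hour: int, prefer_low_cost: bool
) -> Optional[Slot]:
    """Pick a slot that starts before the deadline (assumed policy).

    prefer_low_cost=True  -> cheapest eligible slot (ties go to the earliest)
    prefer_low_cost=False -> earliest eligible slot (quick turnaround)
    Returns None when no slot starts before the deadline.
    """
    eligible = [s for s in slots if s.start_hour < deadline_hour]
    if not eligible:
        return None
    if prefer_low_cost:
        return min(eligible, key=lambda s: (s.cost, s.start_hour))
    return min(eligible, key=lambda s: s.start_hour)


# Made-up example prices: daytime slots cost more than the late-night slot.
slots = [Slot(9, 1.00), Slot(14, 0.80), Slot(23, 0.25)]
cheap = pick_slot(slots, deadline_hour=24, prefer_low_cost=True)
fast = pick_slot(slots, deadline_hour=24, prefer_low_cost=False)
```

With these made-up prices, a low-cost preference selects the late-night slot, while a quick-turnaround preference selects the first available slot at a higher cost. The abstract's further idea of allocating more server time for higher quality results is not modeled here.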