Synthesis Patents (Class 704/258)
  • Patent number: 10803851
    Abstract: The present disclosure provides a method and apparatus for speech splicing and synthesis processing, a computer device, and a readable medium. The method comprises: expanding a speech library according to a pre-trained speech synthesis model and an obtained synthesized text, the speech library before expansion comprising manually collected original language materials; and using the expanded speech library to perform speech splicing and synthesis processing. According to this technical solution, the speech library is expanded so that it includes sufficient language materials. When speech splicing is performed according to the expanded speech library, more speech segments are available for selection, thereby improving the coherence and naturalness of the synthesized speech and better satisfying the user's normal use.
    Type: Grant
    Filed: December 19, 2018
    Date of Patent: October 13, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Xiaohui Sun, Yu Gu
  • Patent number: 10803850
    Abstract: Techniques for generating voice with a predetermined emotion type. In an aspect, semantic content and emotion type are separately specified for a speech segment to be generated. A candidate generation module generates a plurality of emotionally diverse candidate speech segments, wherein each candidate has the specified semantic content. A candidate selection module identifies an optimal candidate from amongst the plurality of candidate speech segments, wherein the optimal candidate most closely corresponds to the predetermined emotion type. In further aspects, crowd-sourcing techniques may be applied to generate the plurality of speech output candidates associated with a given semantic content, and machine-learning techniques may be applied to derive parameters for a real-time algorithm for the candidate selection module.
    Type: Grant
    Filed: September 8, 2014
    Date of Patent: October 13, 2020
    Inventors: Chi-Ho Li, Baoxun Wang, Max Leung
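    The candidate-selection idea in the abstract above can be illustrated as a simple argmax over per-emotion classifier scores. This is a hypothetical sketch, not the patented algorithm: the candidate ids, emotion labels, and score dictionary are all invented for the example.

```python
def select_candidate(candidates, target_emotion):
    """Return the candidate whose score for `target_emotion` is highest.

    `candidates` maps a segment id to a dict of emotion -> confidence,
    as an emotion classifier might produce for each candidate segment.
    """
    return max(candidates, key=lambda seg: candidates[seg].get(target_emotion, 0.0))

# Invented scores for three candidate recordings of the same sentence:
candidates = {
    "take_1": {"happy": 0.2, "sad": 0.7},
    "take_2": {"happy": 0.9, "sad": 0.1},
    "take_3": {"happy": 0.5, "sad": 0.4},
}
print(select_candidate(candidates, "happy"))  # take_2
```

    A real candidate selector would score full acoustic features rather than a pre-computed dict, but the "closest to the requested emotion" decision reduces to this kind of comparison.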
  • Patent number: 10783895
    Abstract: A method and device are provided for determining an optimized scale factor to be applied to an excitation signal or a filter during a process for frequency band extension of an audio frequency signal. The band extension process includes decoding or extracting, in a first frequency band, an excitation signal and parameters of the first frequency band including coefficients of a linear prediction filter, generating an excitation signal extending over at least one second frequency band, and filtering using a linear prediction filter for the second frequency band. The determination method includes determining an additional linear prediction filter, of a lower order than that of the linear prediction filter of the first frequency band, the coefficients of the additional filter being obtained from the parameters decoded or extracted from the first frequency band, and calculating the optimized scale factor as a function of at least the coefficients of the additional filter.
    Type: Grant
    Filed: August 30, 2019
    Date of Patent: September 22, 2020
    Assignee: Koninklijke Philips N.V.
    Inventors: Magdalena Kaniewska, Stephane Ragot
  • Patent number: 10776586
    Abstract: A system, computer program product, and method are provided to automate a framework for knowledge graph (KG) based persistence of data, and to resolve temporal changes and uncertainties in the knowledge graph. Natural language understanding, together with one or more machine learning models (MLMs), is used to extract data from unstructured information, including entities and entity relationships. The extracted data is populated into the knowledge graph. As the KG is subject to change, it is used to create new MLMs and to retrain existing ones. Weighting is applied to the populated data in the form of a veracity value. Blockchain technology is applied to the populated data to ensure reliability of the data and to provide auditability for assessing changes to the data.
    Type: Grant
    Filed: January 10, 2018
    Date of Patent: September 15, 2020
    Assignee: International Business Machines Corporation
    Inventors: David Bacarella, James H. Barnebee, IV, Nicholas Lawrence, Sumit Patel
  • Patent number: 10755694
    Abstract: An electronic device includes an audio synthesizer. The audio synthesizer can generate a voice-synthesized audio output stream as a function of one or more audible characteristics extracted from voice input received from an authorized user of the electronic device. The audio synthesizer can also apply an acoustic watermark to the voice-synthesized audio output stream, the acoustic watermark indicating that the voice-synthesized audio output stream is machine made.
    Type: Grant
    Filed: March 15, 2018
    Date of Patent: August 25, 2020
    Assignee: Motorola Mobility LLC
    Inventors: Rachid Alameh, James P Ashley, Jarrett Simerson, Thomas Merrell
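    Purely as an illustration of the acoustic-watermark idea, the sketch below mixes a very low-amplitude sinusoid into a stream of audio samples so the output carries a machine-made marker. Real acoustic watermarks are far more robust than a plain tone; the sample rate, marker frequency, and amplitude here are invented.

```python
import math

def add_watermark(samples, rate_hz=16000, mark_hz=50.0, amplitude=0.002):
    """Mix a low-level sinusoid into synthesized audio as a crude
    acoustic watermark indicating the stream is machine made."""
    return [s + amplitude * math.sin(2 * math.pi * mark_hz * i / rate_hz)
            for i, s in enumerate(samples)]

# One second of silence picks up only the (near-inaudible) marker tone:
marked = add_watermark([0.0] * 16000)
print(max(marked) <= 0.002 + 1e-9)  # True; the marker stays far below speech level
```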
  • Patent number: 10755597
    Abstract: Disclosed are a method and an apparatus for calculating a meal period, the method including: calculating, by a wrist acceleration calculating unit, a wrist acceleration variation value which is a variation value of acceleration in respect to a motion of a user's wrist which is measured based on gravitational acceleration; calculating, by a wrist angle calculating unit, a wrist angle variation value which is a variation value of an angle to the user's wrist based on a gravitational direction by using the wrist acceleration variation value; detecting, by an eating behavior candidate pattern detecting unit, an eating behavior candidate pattern based on a predetermined reference by applying one or more threshold values to the wrist angle variation value; and calculating, by a meal period calculating unit, a meal period based on the number of times the eating behavior candidate pattern occurs.
    Type: Grant
    Filed: April 27, 2017
    Date of Patent: August 25, 2020
    Assignee: AJOU UNIVERSITY INDUSTRY-ACADEMIC COOPERATION FOUNDATION
    Inventors: We Duke Cho, Kyeong Chan Park, Sun Taag Choe
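    The eating-behavior detection above can be caricatured as counting lift-then-lower patterns in the wrist-angle variation signal. This is a hedged stand-in for the patented method: the thresholds and the two-state pattern logic are invented, and a real detector would work on continuous accelerometer data.

```python
def count_eating_candidates(angle_deltas, lift_threshold=30.0, lower_threshold=-25.0):
    """Count lift-then-lower wrist-angle patterns as eating-behavior
    candidates. `angle_deltas` are per-window wrist angle variation values."""
    count = 0
    lifted = False
    for delta in angle_deltas:
        if not lifted and delta >= lift_threshold:
            lifted = True          # wrist raised toward the mouth
        elif lifted and delta <= lower_threshold:
            count += 1             # wrist lowered again: one candidate pattern
            lifted = False
    return count

# Three lift/lower cycles with some small movements in between:
deltas = [35, -30, 5, 40, -2, -28, 33, -26]
print(count_eating_candidates(deltas))  # 3
```

    A meal-period calculator would then declare a meal when the candidate count within a time window exceeds some minimum, per the abstract's "number of times the eating behavior candidate pattern occurs".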
  • Patent number: 10726838
    Abstract: One aspect of this disclosure relates to presentation of a first effect on one or more presentation devices during an oral recitation of a first story. The first effect is associated with a first trigger point, first content, and/or the first story. The first trigger point is one or more specific syllables from a word and/or phrase in the first story. A first transmission point associated with the first effect can be determined based on the latency of a presentation device and a user speaking profile. The first transmission point is one or more specific syllables from a word and/or phrase preceding the first trigger point in the first story. Control signals with instructions to present the first content at the first trigger point are transmitted to the presentation device when a user recites the first transmission point, such that the first content is presented at the first trigger point.
    Type: Grant
    Filed: June 14, 2018
    Date of Patent: July 28, 2020
    Assignee: Disney Enterprises, Inc.
    Inventors: Taylor Hellam, Malcolm E. Murdock, Mohammad Poswal, Nicolas Peck
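    The transmission-point calculation above amounts to sending the control signal enough syllables early to absorb device latency at the reader's speaking rate. The sketch below assumes a trigger point indexed by syllable position and a speaking profile reduced to a single syllables-per-second rate; both simplifications are mine, not the patent's.

```python
import math

def transmission_offset_syllables(latency_s, syllable_rate_sps):
    """Syllables of head start needed so content lands on the trigger point,
    given device latency and the user's speaking rate."""
    return math.ceil(latency_s * syllable_rate_sps)

def transmission_point(trigger_index, latency_s, syllable_rate_sps):
    """Syllable index at which to transmit, clamped to the story start."""
    return max(0, trigger_index - transmission_offset_syllables(latency_s, syllable_rate_sps))

# 0.5 s device latency, user speaks ~4 syllables/s -> transmit 2 syllables early:
print(transmission_point(10, 0.5, 4.0))  # 8
```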
  • Patent number: 10672412
    Abstract: A method and device are provided for determining an optimized scale factor to be applied to an excitation signal or a filter during a process for frequency band extension of an audio frequency signal. The band extension process includes decoding or extracting, in a first frequency band, an excitation signal and parameters of the first frequency band including coefficients of a linear prediction filter, generating an excitation signal extending over at least one second frequency band, and filtering using a linear prediction filter for the second frequency band. The determination method includes determining an additional linear prediction filter, of a lower order than that of the linear prediction filter of the first frequency band, the coefficients of the additional filter being obtained from the parameters decoded or extracted from the first frequency band, and calculating the optimized scale factor as a function of at least the coefficients of the additional filter.
    Type: Grant
    Filed: August 28, 2019
    Date of Patent: June 2, 2020
    Assignee: Koninklijke Philips N.V.
    Inventors: Magdalena Kaniewska, Stephane Ragot
  • Patent number: 10665226
    Abstract: Systems, methods, and computer-readable storage devices for generating speech using a presentation style specific to a user, and in particular the user's social group. Systems configured according to this disclosure can then use the resulting, personalized, text and/or speech in a spoken dialogue or presentation system to communicate with the user. For example, a system practicing the disclosed method can receive speech from a user, identify the user, and respond to the received speech by applying a personalized natural language generation model. The personalized natural language generation model provides communications which can be specific to the identified user.
    Type: Grant
    Filed: June 4, 2019
    Date of Patent: May 26, 2020
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Taniya Mishra, Alistair D. Conkie, Svetlana Stoyanchev
  • Patent number: 10650803
    Abstract: A method, a computer program product, and a computer system for mapping between a speech signal and a transcript of the speech signal. The computer system segments the speech signal to obtain one or more segmented speech signals and the transcript of the speech signal to obtain one or more segmented transcripts of the speech signal. The computer system generates estimated phone sequences and reference phone sequences, calculates costs of correspondences between the estimated phone sequences and the reference phone sequences, determines a series of the estimated phone sequences with a smallest cost, selects a partial series of the estimated phone sequences from the series of the estimated phone sequences, and generates mapping data which includes the partial series of the estimated phone sequences and a corresponding series of the reference phone sequences.
    Type: Grant
    Filed: October 10, 2017
    Date of Patent: May 12, 2020
    Assignee: International Business Machines Corporation
    Inventors: Takashi Fukuda, Nobuyasu Itoh
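    The "smallest cost" selection between estimated and reference phone sequences can be illustrated with plain Levenshtein edit distance as the cost function. The patent does not specify its cost model, so edit distance is a stand-in, and the phone labels below are invented.

```python
def phone_edit_cost(estimated, reference):
    """Levenshtein cost between two phone sequences (lists of phone labels)."""
    m, n = len(estimated), len(reference)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if estimated[i - 1] == reference[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + sub)  # substitution/match
    return d[m][n]

# Pick the estimated sequence with the smallest cost against the reference:
reference = ["k", "ae", "t"]
series = [["k", "ae", "t"], ["b", "ae", "d"], ["k", "ah", "t"]]
best = min(series, key=lambda est: phone_edit_cost(est, reference))
print(best)  # ['k', 'ae', 't']
```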
  • Patent number: 10636429
    Abstract: In some embodiments, a system may process a user interface to identify textual or graphical items in the interface, and may prepare a plurality of audio files containing spoken representations of the items. As the user navigates through the interface, different ones of the audio files may be selected and played, to announce text associated with items selected by the user. A computing device may periodically determine whether a cache offering the interface to users stores audio files for all of the interface's textual items, and if the cache is missing any audio files for any of the textual items, the computing device may take steps to have a corresponding audio file created.
    Type: Grant
    Filed: March 3, 2017
    Date of Patent: April 28, 2020
    Assignee: Comcast Cable Communications, LLC
    Inventors: Thomas Wlodkowski, Michael J. Cook
  • Patent number: 10628985
    Abstract: Techniques are described for image generation for avatar image animation using translation vectors. An avatar image is obtained for representation on a first computing device. An autoencoder is trained, on a second computing device comprising an artificial neural network, to generate synthetic emotive faces. A plurality of translation vectors is identified corresponding to a plurality of emotion metrics, based on the training. A bottleneck layer within the autoencoder is used to identify the plurality of translation vectors. A subset of the plurality of translation vectors is applied to the avatar image, wherein the subset represents an emotion metric input. The emotion metric input is obtained from facial analysis of an individual. An animated avatar image is generated for the first computing device, based on the applying, wherein the animated avatar image is reflective of the emotion metric input and the avatar image includes vocalizations.
    Type: Grant
    Filed: November 30, 2018
    Date of Patent: April 21, 2020
    Assignee: Affectiva, Inc.
    Inventors: Taniya Mishra, George Alexander Reichenbach, Rana el Kaliouby
  • Patent number: 10614826
    Abstract: A method of building a speech conversion system uses target information from a target voice and source speech data. The method receives the source speech data and the target timbre data, which is within a timbre space. A generator produces first candidate data as a function of the source speech data and the target timbre data. A discriminator compares the first candidate data to the target timbre data with reference to timbre data of a plurality of different voices. The discriminator determines inconsistencies between the first candidate data and the target timbre data. The discriminator produces an inconsistency message containing information relating to the inconsistencies. The inconsistency message is fed back to the generator, and the generator produces a second candidate data. The target timbre data in the timbre space is refined using information produced by the generator and/or discriminator as a result of the feeding back.
    Type: Grant
    Filed: May 24, 2018
    Date of Patent: April 7, 2020
    Assignee: Modulate, Inc.
    Inventors: William Carter Huffman, Michael Pappas
  • Patent number: 10584386
    Abstract: The present invention relates to coding of audio signals, and in particular to high frequency reconstruction methods including a frequency domain harmonic transposer. A system and method for generating a high frequency component of a signal from a low frequency component of the signal is described.
    Type: Grant
    Filed: December 18, 2018
    Date of Patent: March 10, 2020
    Assignee: Dolby International AB
    Inventors: Lars Villemoes, Per Ekstrand
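    As a toy illustration of frequency-domain harmonic transposition, the sketch below maps each low-band spectral bin k to bin T·k, regenerating energy at multiples of the original frequencies. Real high-frequency reconstruction (as in the patent) operates on complex subband signals with phase handling and envelope shaping; none of that is modeled here.

```python
def harmonic_transpose(spectrum, factor=2):
    """Map magnitude at low-band bin k to bin factor*k, a crude
    frequency-domain transposition producing a high band."""
    out = [0.0] * (len(spectrum) * factor)
    for k, mag in enumerate(spectrum):
        out[k * factor] += mag
    return out

low = [0.0, 1.0, 0.5, 0.25]
print(harmonic_transpose(low))  # [0.0, 0.0, 1.0, 0.0, 0.5, 0.0, 0.25, 0.0]
```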
  • Patent number: 10586079
    Abstract: Software-based systems perform parametric speech synthesis. TTS voice parameters determine the generated speech audio. Voice parameters include gender, age, dialect, donor, arousal, authoritativeness, pitch, range, speech rate, volume, flutter, roughness, breath, frequencies, bandwidths, and relative amplitudes of formants and nasal sounds. The system chooses TTS parameters based on one or more of: user profile attributes including gender, age, and dialect; situational attributes such as location, noise level, and mood; natural language semantic attributes such as domain of conversation, expression type, dimensions of affect, word emphasis and sentence structure; and analysis of target speaker voices. The system chooses TTS parameters to improve listener satisfaction or other desired listener behavior. Choices may be made by specified algorithms defined by code developers, or by machine learning algorithms trained on labeled samples of system performance.
    Type: Grant
    Filed: January 13, 2017
    Date of Patent: March 10, 2020
    Assignee: SOUNDHOUND, INC.
    Inventors: Monika Almudafar-Depeyrot, Bernard Mont-Reynaud
  • Patent number: 10579742
    Abstract: Techniques are described for data transformation performed based on a current emotional state of the user who provided input data, the emotional state determined based on biometric data for the user. Sensor(s) may generate biometric data that indicates physiological characteristic(s) of the user, and an emotional state of the user is determined based on the biometric data. Different dictionaries and/or dictionary entries may be used in translation, depending on the emotional state of the sender when the data was input. In some implementations, the emotional state of the sending user may be used to infer or otherwise determine that a translation was incorrect. The input data may be transformed to include information indicating the current emotional state of the sending user when they provided the input data. For example, the output text may be presented in a user interface with an icon and/or other indication of the sender's emotional state.
    Type: Grant
    Filed: August 8, 2017
    Date of Patent: March 3, 2020
    Assignee: United Services Automobile Association (USAA)
    Inventor: Amanda S. Fernandez
  • Patent number: 10573307
    Abstract: A syntactic analysis unit 104 performs a syntactic analysis of linguistic information in acquired user's speech (hereinafter simply referred to as “user speech”). A non-linguistic information analysis unit 106 analyzes non-linguistic information, distinct from the linguistic information, in the acquired user speech. A topic continuation determination unit 110 determines whether the topic of the current conversation should be continued or changed to a different topic according to the non-linguistic information analysis result. A response generation unit 120 generates a response according to the result of the determination by the topic continuation determination unit 110.
    Type: Grant
    Filed: October 30, 2017
    Date of Patent: February 25, 2020
    Assignees: Furhat Robotics AB, TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Gabriel Skantze, Martin Johansson, Tatsuro Hori, Narimasa Watanabe
  • Patent number: 10546573
    Abstract: To prioritize the processing of text-to-speech (TTS) tasks, a TTS system may determine, for each task, the amount of time before the task reaches underrun, that is, the time before the synthesized speech output to a user catches up with the time since the TTS task originated. The TTS system may also prioritize tasks to reduce the amount of time between when a user submits a TTS request and when results are delivered to the user. When prioritizing tasks, such as when allocating resources to existing tasks or accepting new tasks, the TTS system may prioritize tasks with the least time before underrun and/or tasks with the longest time before delivery of first results.
    Type: Grant
    Filed: August 10, 2017
    Date of Patent: January 28, 2020
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventor: Bartosz Putrycz
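    The time-to-underrun heuristic described above can be sketched in a few lines: a task's margin is how much synthesized audio it has banked minus how long playback has been running, and the scheduler serves the task with the smallest margin first. The task record fields here are invented for the example.

```python
def time_to_underrun(task, now):
    """Seconds before playback catches up with synthesis for one task.
    Positive means there is still buffered audio; lower means more urgent."""
    elapsed = now - task["started_at"]
    return task["audio_synthesized_s"] - elapsed

def pick_next_task(tasks, now):
    """Allocate the next synthesis slot to the task closest to underrun."""
    return min(tasks, key=lambda t: time_to_underrun(t, now))

tasks = [
    {"id": "a", "started_at": 0.0, "audio_synthesized_s": 5.0},
    {"id": "b", "started_at": 1.0, "audio_synthesized_s": 3.5},
]
# At t=3.0, task a has 2.0 s of margin and task b has 1.5 s, so b is served first:
print(pick_next_task(tasks, 3.0)["id"])  # b
```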
  • Patent number: 10529318
    Abstract: A method, system, and computer program product for learning a recognition model for recognition processing. The method includes preparing one or more examples for learning, each of which includes an input segment, an additional segment adjacent to the input segment and an assigned label. The input segment and the additional segment are extracted from an original training data. A classification model is trained, using the input segment and the additional segment in the examples, to initialize parameters of the classification model so that extended segments including the input segment and the additional segment are reconstructed from the input segment. Then, the classification model is tuned to predict a target label, using the input segment and the assigned label in the examples, based on the initialized parameters. At least a portion of the obtained classification model is included in the recognition model.
    Type: Grant
    Filed: July 31, 2015
    Date of Patent: January 7, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Gakuto Kurata
  • Patent number: 10529314
    Abstract: A speech synthesizer includes a statistical-model sequence generator, a multiple-acoustic feature parameter sequence generator, and a waveform generator. The statistical-model sequence generator generates, based on context information corresponding to an input text, a statistical model sequence that comprises a first sequence of a statistical model comprising a plurality of states. The multiple-acoustic feature parameter sequence generator, for each speech section corresponding to each state of the statistical model sequence, selects a first plurality of acoustic feature parameters from a first set of acoustic feature parameters extracted from a first speech waveform stored in a speech database and generates a multiple-acoustic feature parameter sequence that comprises a sequence of the first plurality of acoustic feature parameters.
    Type: Grant
    Filed: February 16, 2017
    Date of Patent: January 7, 2020
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masatsune Tamura, Masahiro Morita
  • Patent number: 10497362
    Abstract: A system and method are presented for outlier identification to remove poor alignments in speech synthesis. The quality of the output of a text-to-speech system directly depends on the accuracy of alignments of a speech utterance. The identification of mis-alignments and mis-pronunciations from automated alignments may be made based on fundamental frequency methods and group delay based outlier methods. The identification of these outliers allows for their removal, which improves the synthesis quality of the text-to-speech system.
    Type: Grant
    Filed: February 26, 2018
    Date of Patent: December 3, 2019
    Inventors: E. Veera Raghavendra, Aravind Ganapathiraju
  • Patent number: 10481860
    Abstract: A voice-operable solar tablet with nanoscale layers and a lithium battery that functions as a solar MP3 player, e-book reader, e-newspaper reader, and e-magazine reader. All units are operable by verbal command and can also be operated manually from an ultra-high-definition touch screen. The solar technology utilizes the photoelectric effect with nanoscale layers to boost solar cell efficiency. The tablet includes encryption software.
    Type: Grant
    Filed: May 29, 2015
    Date of Patent: November 19, 2019
    Inventor: Gregory Walker Johnson
  • Patent number: 10468022
    Abstract: A voice assistant (VA) can switch between a voice input mode, in which the VA produces audible responses to voice queries, and a gesture input mode that can be triggered by a predetermined gesture, in which the VA produces visual responses to gesture-based queries.
    Type: Grant
    Filed: April 3, 2017
    Date of Patent: November 5, 2019
    Assignee: Motorola Mobility LLC
    Inventors: Jun-ki Min, Mir Farooq Ali, Navin Tulsibhai Dabhi
  • Patent number: 10453478
    Abstract: A sound quality determination device includes an acquisition unit acquiring an input sound, a frequency distribution calculation unit calculating a frequency distribution of the input sound acquired by the acquisition unit, a tilt calculation unit calculating a tilt indicating a change in intensity of an overtone with respect to a frequency based on the frequency distribution calculated by the frequency distribution calculation unit, a tilt comparison unit comparing the tilt calculated by the tilt calculation unit and a threshold related to the tilt, and a determination unit determining based on a result of comparison by the tilt comparison unit whether the input sound has a predetermined sound quality.
    Type: Grant
    Filed: March 14, 2018
    Date of Patent: October 22, 2019
    Assignee: YAMAHA CORPORATION
    Inventor: Ryuichi Nariyama
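    The tilt calculation described above (change in overtone intensity with frequency) can be sketched as a least-squares slope over (frequency, level) points, with the determination step reduced to a threshold comparison. The comparison direction and the harmonic levels below are assumptions for illustration, not the patent's specifics.

```python
def spectral_tilt(points):
    """Least-squares slope of overtone intensity (dB) versus frequency (Hz):
    the 'tilt' that is compared against a threshold."""
    n = len(points)
    mean_f = sum(f for f, _ in points) / n
    mean_db = sum(db for _, db in points) / n
    num = sum((f - mean_f) * (db - mean_db) for f, db in points)
    den = sum((f - mean_f) ** 2 for f, _ in points)
    return num / den

def has_quality(points, threshold):
    """Assumed direction: a tilt at or above the threshold passes."""
    return spectral_tilt(points) >= threshold

# Invented harmonic levels falling 6 dB per 100 Hz:
harmonics = [(100, -10.0), (200, -16.0), (300, -22.0), (400, -28.0)]
print(spectral_tilt(harmonics))  # -0.06 dB/Hz
```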
  • Patent number: 10448185
    Abstract: A multi-channel decorrelator for providing a plurality of decorrelated signals on the basis of a plurality of decorrelator input signals is configured to premix a first set of N decorrelator input signals into a second set of K decorrelator input signals, wherein K < N. The multi-channel decorrelator is configured to provide a first set of K′ decorrelator output signals on the basis of the second set of K decorrelator input signals. The multi-channel decorrelator is further configured to upmix the first set of K′ decorrelator output signals into a second set of N′ decorrelator output signals, wherein N′ > K′. The multi-channel decorrelator can be used in a multi-channel audio decoder. A multi-channel audio encoder provides complexity control information for the multi-channel decorrelator.
    Type: Grant
    Filed: April 25, 2016
    Date of Patent: October 15, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Sascha Disch, Harald Fuchs, Oliver Hellmuth, Juergen Herre, Adrian Murtaza, Jouni Paulus, Falko Ridderbusch, Leon Terentiv
  • Patent number: 10394885
    Abstract: A personalized financial podcast generation system, the system includes a user data module configured to acquire user data associated with a user and analyze the user data to identify a keyword associated with a financial topic of interest to the user. The system also includes a keyword analyzer configured to calculate a weight of the keyword. The system further includes a content analyzer configured to identify financial media content based on the keyword and the weight. Moreover, the system includes a summarizer configured to identify a relevant sentence in the financial media content. In addition, the system includes a speech processor configured to synthesize speech based on the relevant sentence.
    Type: Grant
    Filed: March 15, 2016
    Date of Patent: August 27, 2019
    Assignee: INTUIT INC.
    Inventors: Wolfgang Paulus, Cynthia Joann Osmon, Diane L. Weiss, Jacob N. Huffman
  • Patent number: 10381016
    Abstract: Methods, systems and computer readable media for altering an audio output are provided. In some embodiments, the system may change the original frequency content of an audio data file to a second frequency content so that a recorded audio track will sound as if a different person had recorded it when it is played back. In other embodiments, the system may receive an audio data file and a voice signature, and it may apply the voice signature to the audio data file to alter the audio output of the audio data file. In that instance, the audio data file may be a textual representation of a recorded audio data file.
    Type: Grant
    Filed: March 29, 2016
    Date of Patent: August 13, 2019
    Assignee: Apple Inc.
    Inventor: Michael M. Lee
  • Patent number: 10375465
    Abstract: A computer-program product embodied in a non-transitory computer readable medium that is programmed to communicate with a listener of headphones is provided. The computer-program product includes instructions to receive ambient noise indicative of external noise to a listener's headphone and to extract a speech component from the ambient noise. The computer-program product further includes instructions to derive an intent from the speech component of the ambient noise and compare the intent to at least one user defined preference. The computer-program product further including instructions to transmit an alert to notify a listener that the intent of the speech component matches the at least one user defined preference.
    Type: Grant
    Filed: September 14, 2016
    Date of Patent: August 6, 2019
    Assignee: Harman International Industries, Inc.
    Inventor: Pratyush Sahay
  • Patent number: 10366707
    Abstract: Mechanisms in a natural language processing (NLP) system are provided. The NLP system receives a plurality of communications associated with a communication system, over a predetermined time period, from a plurality of end user devices. The NLP system identifies, for each communication in the plurality of communications, the user submitting the communication, to thereby generate a set of users comprising a plurality of users associated with the plurality of communications. The NLP system retrieves a user model for each user in the set of users, which specifies at least one attribute of the corresponding user. The NLP system generates an aggregate user model that aggregates the at least one attribute of each user in the set of users together to generate an aggregate representation of the attributes of the plurality of users in the set of users. The NLP system performs a cognitive operation based on the aggregate user model.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: July 30, 2019
    Assignee: International Business Machines Corporation
    Inventors: Corville O. Allen, Laura J. Rodriguez
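    One minimal way to picture the user-model aggregation above: average numeric attributes across users and take the most common value for categorical ones. The attribute names and the numeric/categorical split are assumptions of this sketch; the patent leaves the aggregation method open.

```python
from collections import Counter

def aggregate_user_models(models):
    """Combine per-user attribute dicts into one aggregate representation:
    numeric attributes are averaged, categorical ones take the mode."""
    numeric, categorical = {}, {}
    for model in models:
        for attr, value in model.items():
            if isinstance(value, (int, float)):
                numeric.setdefault(attr, []).append(value)
            else:
                categorical.setdefault(attr, Counter())[value] += 1
    agg = {attr: sum(vs) / len(vs) for attr, vs in numeric.items()}
    agg.update({attr: c.most_common(1)[0][0] for attr, c in categorical.items()})
    return agg

users = [
    {"age": 30, "sentiment": "positive"},
    {"age": 40, "sentiment": "positive"},
    {"age": 50, "sentiment": "negative"},
]
print(aggregate_user_models(users))  # {'age': 40.0, 'sentiment': 'positive'}
```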
  • Patent number: 10359939
    Abstract: Embodiments of the present invention provide a data object processing method and apparatus, which can divide a data object into one or more blocks; calculate a sample compression ratio of each block, aggregate neighboring consecutive blocks with a same sample compression ratio characteristic into one data segment, and obtain the sample compression ratio of each of the data segments; and select, according to a length range to which a length of each of the data segments belongs and a compression ratio range to which the sample compression ratio of each of the data segments belongs, an expected length to divide the data segment into data chunks, where the sample compression ratio of each of the data segments uniquely belongs to one of the compression ratio ranges, and the length of each of the data segments uniquely belongs to one of the length ranges.
    Type: Grant
    Filed: July 16, 2015
    Date of Patent: July 23, 2019
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Jiansheng Wei, Junhua Zhu
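    The segment-aggregation step above (merging neighboring blocks with a similar sample compression ratio) can be sketched as a single pass over (length, ratio) pairs. The tolerance value and the choice to label a segment by its first block's ratio are assumptions for illustration.

```python
def aggregate_segments(blocks, tolerance=0.05):
    """Merge neighboring blocks whose sample compression ratios fall within
    `tolerance` of the running segment's initial ratio.

    `blocks` is a list of (length, sample_compression_ratio) pairs."""
    segments = []
    for length, ratio in blocks:
        if segments and abs(ratio - segments[-1]["ratio"]) <= tolerance:
            segments[-1]["length"] += length   # same characteristic: extend segment
        else:
            segments.append({"length": length, "ratio": ratio})
    return segments

blocks = [(4, 0.50), (4, 0.52), (4, 0.90), (4, 0.91)]
print(aggregate_segments(blocks))
# [{'length': 8, 'ratio': 0.5}, {'length': 8, 'ratio': 0.9}]
```

    The patent then picks an expected chunk length per segment from the segment's length range and compression-ratio range; that lookup is omitted here.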
  • Patent number: 10318586
    Abstract: Systems and methods are disclosed herein for processing a natural language query. A receiver circuitry receives the natural language query from a user. A natural language interpreter circuitry parses the natural language query to convert the natural language query into a plurality of categories and a plurality of variables, each variable in the plurality of variables corresponding to one category in the plurality of categories. A user interface displays to the user the plurality of categories and the plurality of variables, and allows the user to modify at least one variable in the plurality of variables by providing a natural language utterance.
    Type: Grant
    Filed: August 19, 2014
    Date of Patent: June 11, 2019
    Assignee: Google LLC
    Inventors: Robert Brett Rose, Gregory Brandon Owen, Keith Charles Bottner
  • Patent number: 10319370
    Abstract: Systems, methods, and computer-readable storage devices for generating speech using a presentation style specific to a user, and in particular the user's social group. Systems configured according to this disclosure can then use the resulting, personalized, text and/or speech in a spoken dialogue or presentation system to communicate with the user. For example, a system practicing the disclosed method can receive speech from a user, identify the user, and respond to the received speech by applying a personalized natural language generation model. The personalized natural language generation model provides communications which can be specific to the identified user.
    Type: Grant
    Filed: May 14, 2018
    Date of Patent: June 11, 2019
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Taniya Mishra, Alistair D. Conkie, Svetlana Stoyanchev
  • Patent number: 10272307
    Abstract: Disclosed are a system and method for measuring the tension of a tennis net and, alternatively or in addition, for determining whether a service let occurs via the measurement of net tension. The disclosed embodiments measure the force exerted on the center strap or the singles stick by the net. In these embodiments, the measured force provides an accurate reflection of the tension of the net.
    Type: Grant
    Filed: September 7, 2018
    Date of Patent: April 30, 2019
    Assignee: GROUP ONE LIMITED
    Inventor: Fredric Goldstein
  • Patent number: 10262651
    Abstract: Multi-voice font interpolation is provided. A multi-voice font interpolation engine allows the production of computer-generated speech with a wide variety of speaker characteristics and/or prosody by interpolating speaker characteristics and prosody from existing fonts. Using prediction models from multiple voice fonts, the multi-voice font interpolation engine predicts values for the parameters that influence speaker characteristics and/or prosody for the phoneme sequence obtained from the text to be spoken. For each parameter, additional parameter values are generated by weighted interpolation from the predicted values. Modifying an existing voice font with the interpolated parameters changes the style and/or emotion of the speech while retaining the base sound qualities of the original voice.
    Type: Grant
    Filed: September 9, 2016
    Date of Patent: April 16, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jian Luan, Lei He, Max Leung
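    The weighted interpolation described in the abstract can be sketched in a few lines. This is a toy illustration only: the normalization and the example pitch values are assumptions, not details from the patent.

    ```python
    # Hypothetical sketch of weighted interpolation across voice-font predictions.
    # Each "font" predicts a value for a prosody parameter (e.g. pitch in Hz) per
    # phoneme; the interpolated value is a weighted average of those predictions.

    def interpolate_parameter(predictions, weights):
        """Blend per-font predictions for one parameter with normalized weights."""
        if len(predictions) != len(weights) or not predictions:
            raise ValueError("need one weight per font prediction")
        total = sum(weights)
        return sum(p * w for p, w in zip(predictions, weights)) / total

    # Blend pitch predictions from three fonts, leaning toward the first voice.
    pitch = interpolate_parameter([220.0, 180.0, 200.0], [0.6, 0.3, 0.1])
    ```

    Applying such interpolated values per parameter is what lets the engine shift style while keeping the base voice's sound.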
  • Patent number: 10235125
    Abstract: An audio playback control method and a terminal device are described. The method includes starting an application and playing background audio of the application, acquiring foreground audio, and determining the duration and volume of the foreground audio. If the duration of the foreground audio is greater than a first threshold and the volume of the foreground audio is greater than a second threshold, the method can reduce the volume of the background audio.
    Type: Grant
    Filed: June 9, 2016
    Date of Patent: March 19, 2019
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Xiaorong Chen, Kai Xu, Longfeng Wei
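    The two-threshold control rule in the abstract can be expressed as a short predicate. The specific threshold values and the reduction factor below are illustrative assumptions, not values from the patent.

    ```python
    # Minimal sketch of the described control rule: reduce background volume only
    # when the foreground audio is both long enough and loud enough.

    DURATION_THRESHOLD_S = 1.0   # "first threshold" (assumed value)
    VOLUME_THRESHOLD_DB = -30.0  # "second threshold" (assumed value)

    def background_volume(bg_volume, fg_duration_s, fg_volume_db, reduction=0.3):
        """Return the background volume to use while foreground audio plays."""
        if fg_duration_s > DURATION_THRESHOLD_S and fg_volume_db > VOLUME_THRESHOLD_DB:
            return bg_volume * reduction
        return bg_volume

    # A 2 s, -20 dB voice prompt ducks the background; a short blip does not.
    ducked = background_volume(1.0, fg_duration_s=2.0, fg_volume_db=-20.0)
    kept = background_volume(1.0, fg_duration_s=0.5, fg_volume_db=-20.0)
    ```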
  • Patent number: 10229668
    Abstract: Methods and systems are described in which spoken voice prompts can be produced in a manner such that they will most likely have the desired effect, for example to indicate empathy, or to produce a desired follow-up action from a call recipient. The prompts can be produced with specific optimized speech parameters, including duration, gender of speaker, and pitch, so as to encourage participation and promote comprehension among a wide range of patients or listeners. Upon hearing such voice prompts, patients/listeners can know immediately when they are being asked questions that they are expected to answer and when they are being given information, as well as which information is considered sensitive.
    Type: Grant
    Filed: October 30, 2017
    Date of Patent: March 12, 2019
    Assignee: Eliza Corporation
    Inventors: Lisa Lavoie, Lucas Merrow, Alexandra Drane, Frank Rizzo, Ivy Krull
  • Patent number: 10224894
    Abstract: An audio encoding device and an audio decoding device are described herein. The audio encoding device may examine a set of audio channels/channel groups representing a piece of sound program content and produce a set of ducking values to associate with one of the channels/channel groups. During playback of the piece of sound program content, the ducking values may be applied to all other channels/channel groups. Application of these ducking values may cause (1) the reduction in dynamic range of ducked channels/channel groups and/or (2) movement of channels/channel groups in the sound field. This ducking may improve intelligibility of audio in the non-ducked channel/channel group. For instance, a narration channel/channel group may be more clearly heard by listeners through the use of selective ducking of other channels/channel groups during playback.
    Type: Grant
    Filed: May 15, 2017
    Date of Patent: March 5, 2019
    Assignee: Apple Inc.
    Inventors: Tomlinson M. Holman, Frank M. Baumgarte, Eric A. Allamanche
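    The playback-side behavior (ducking values from one channel group applied as gains to all other groups) can be sketched as follows. The group names and gain value are invented for illustration; the patent associates ducking values with a channel group but does not prescribe this data layout.

    ```python
    # Illustrative sketch: a ducking gain associated with a "narration" group is
    # applied to every other channel group during playback, leaving the
    # narration channel at full level so it remains intelligible.

    def apply_ducking(channel_groups, ducked_by, duck_gain):
        """Scale every channel group except the one carrying the ducking values."""
        return {
            name: [s * (1.0 if name == ducked_by else duck_gain) for s in samples]
            for name, samples in channel_groups.items()
        }

    groups = {"narration": [0.5, 0.5], "music": [0.8, -0.8], "effects": [0.2, 0.4]}
    mixed = apply_ducking(groups, ducked_by="narration", duck_gain=0.25)
    # Narration is untouched; music and effects are attenuated to a quarter.
    ```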
  • Patent number: 10224021
    Abstract: A voice synthesizing apparatus includes: a voice inputter (102) configured to input a voice; an obtainer (22) configured to obtain a primary response to the voice inputted by the voice inputter (102); an analyzer (112) configured to analyze whether the primary response includes a repetition target; and a voice synthesizer (24) configured to, in a case where the analyzed primary response is determined to include the repetition target, synthesize a voice from a secondary response that includes the repetition target repeated at least twice to output the voice.
    Type: Grant
    Filed: July 2, 2015
    Date of Patent: March 5, 2019
    Assignee: Yamaha Corporation
    Inventor: Hiroaki Matsubara
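    The analyze-then-repeat flow in this abstract reduces to a small text transformation before synthesis. The comma-joined repetition format below is an assumption; the patent only requires that the target appear at least twice in the secondary response.

    ```python
    # Rough sketch of the described flow: if the primary response contains a
    # repetition target, build a secondary response that repeats the target
    # before handing the text to a speech synthesizer.

    def build_secondary_response(primary, target):
        """Repeat the target phrase at least twice if it appears in the response."""
        if target in primary:
            return primary.replace(target, f"{target}, {target}", 1)
        return primary

    secondary = build_secondary_response("It will rain today", "rain")
    ```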
  • Patent number: 10217457
    Abstract: In one embodiment, a semantic classifier input and a corresponding label attributed to the semantic classifier input may be obtained. A determination may be made whether the corresponding label is correct based on logged interaction data. An entry of an adaptation corpus may be generated based on a result of the determination. Operation of the semantic classifier may be adapted based on the adaptation corpus.
    Type: Grant
    Filed: April 10, 2017
    Date of Patent: February 26, 2019
    Assignees: AT&T INTELLECTUAL PROPERTY II, L.P., RUTGERS, THE STATE UNIVERSITY OF NEW JERSEY
    Inventors: Mazin Gilbert, Esther Levin, Michael Lederman Littman, Robert E. Schapire
  • Patent number: 10216732
    Abstract: An information presentation method, a non-transitory recording medium storing thereon a computer program, and an information presentation system relate to speech recognition. A speech recognition unit performs speech recognition on speech pertaining to a dialogue and thereby generates dialogue text, a translation unit translates the dialogue text and thereby generates translated dialogue text, and a speech waveform synthesis unit performs speech synthesis on the translated dialogue text and thereby generates translated dialogue speech. An intention understanding unit then determines whether supplementary information exists, based on the dialogue text. If supplementary information exists, a communication unit transmits the supplementary information and the translated dialogue speech to a terminal to present the existence of the supplementary information to at least one person from among a plurality of people, according to the usage situation of the information presentation system of the at least one person.
    Type: Grant
    Filed: July 5, 2017
    Date of Patent: February 26, 2019
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Koji Miura, Masakatsu Hoshimi
  • Patent number: 10176811
    Abstract: A method and an apparatus for extracting voiceprint information based on a neural network are disclosed. The method includes: extracting a phonetic acoustic feature from an input voice segment; inputting the phonetic acoustic feature into a voiceprint model trained based on a neural network, and extracting a bottleneck feature of the neural network in the voiceprint model; and mapping frame vectors of the bottleneck feature of the neural network into a single-frame voiceprint expression vector, which serves as voiceprint information corresponding to the input voice segment. The neural network-based voiceprint information extraction method and apparatus extract voiceprint information of a voice segment using a voiceprint model trained based on a neural network, so the extraction process is relatively simple and short voice segments can be handled well.
    Type: Grant
    Filed: June 13, 2017
    Date of Patent: January 8, 2019
    Assignee: Alibaba Group Holding Limited
    Inventor: Shaofei Xue
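    The final step of the abstract, mapping frame vectors into a single voiceprint vector, can be illustrated with simple mean pooling. Mean pooling is one common choice for this mapping and is an assumption here; the abstract does not name the pooling method.

    ```python
    # Sketch of the mapping step: frame-level bottleneck vectors from the
    # neural network are pooled into one fixed-size voiceprint vector that
    # represents the whole voice segment.

    def pool_voiceprint(frame_vectors):
        """Average frame-level bottleneck features into one utterance vector."""
        if not frame_vectors:
            raise ValueError("no frames to pool")
        dim = len(frame_vectors[0])
        return [sum(frame[i] for frame in frame_vectors) / len(frame_vectors)
                for i in range(dim)]

    frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 frames, 2-dim features
    voiceprint = pool_voiceprint(frames)           # one vector for the segment
    ```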
  • Patent number: 10170127
    Abstract: A method and apparatus for transmitting multimedia data are provided. A first device, which provides an audio signal to a second device, includes: a control unit that divides an audio signal input to the first device into a plurality of audio frames, compares a current audio frame among the plurality of audio frames with at least one previous audio frame prestored in the memory of the first device, and selects one of the prestored previous audio frames based on its similarity to the current audio frame; and a communication unit that transmits an identification value of the selected previous audio frame to the second device.
    Type: Grant
    Filed: July 7, 2015
    Date of Patent: January 1, 2019
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Kyung-hun Jung, Eun-mi Oh, Jong-hoon Jeong, Seon-ho Hwang
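    The sender-side selection step can be sketched as a nearest-neighbor search over prestored frames, after which only the winning frame's identification value is transmitted. Squared Euclidean distance is used here as a stand-in similarity measure; the abstract does not specify the metric.

    ```python
    # Toy sketch of the sender side: compare the current frame with prestored
    # previous frames and transmit only the index of the most similar one,
    # rather than the frame's samples.

    def most_similar_frame_id(current, prestored):
        """Return the index of the prestored frame closest to the current frame."""
        def dist(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b))
        return min(range(len(prestored)), key=lambda i: dist(current, prestored[i]))

    cache = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.1]]
    frame_id = most_similar_frame_id([1.0, 1.05], cache)  # transmit this id only
    ```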
  • Patent number: 10146767
    Abstract: Automatic text skimming using lexical chains may be provided. First, at least one lexical chain may be created from an electronic document. Next, a list of positions within the electronic document may be created. The positions may include where at least one concept represented by one of the at least one lexical chain is mentioned. In addition, a list of the positions where the at least one concept is mentioned may be assembled. A selection of at least one concept may be received from the list.
    Type: Grant
    Filed: January 28, 2014
    Date of Patent: December 4, 2018
    Assignee: Skimcast Holdings, LLC
    Inventor: William A. Hollingsworth
  • Patent number: 10134383
    Abstract: Systems, methods, and computer-readable storage media for intelligent caching of concatenative speech units for use in speech synthesis. A system configured to practice the method can identify, in a local cache of text-to-speech units for a text-to-speech voice, an absent text-to-speech unit which is not in the local cache. The system can request the absent text-to-speech unit from a server. The system can then synthesize speech using the text-to-speech units and a received text-to-speech unit from the server.
    Type: Grant
    Filed: September 8, 2017
    Date of Patent: November 20, 2018
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Benjamin J. Stern, Mark Charles Beutnagel, Alistair D. Conkie, Horst J. Schroeter, Amanda Joy Stent
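    The identify-miss, request-from-server, then-synthesize flow above amounts to a fetch-on-miss cache. The sketch below is hypothetical: `fetch_unit` stands in for the server round trip, and unit waveforms are represented as plain lists.

    ```python
    # Minimal sketch of the caching flow: concatenate speech units from the
    # local cache, first requesting any absent unit from the server.

    def synthesize(unit_ids, local_cache, fetch_unit):
        """Concatenate units, requesting absent ones from the server on a miss."""
        waveform = []
        for uid in unit_ids:
            if uid not in local_cache:
                local_cache[uid] = fetch_unit(uid)   # request the absent unit
            waveform.extend(local_cache[uid])
        return waveform

    cache = {"he": [1, 2], "llo": [3]}
    audio = synthesize(["he", "llo", "!"], cache, fetch_unit=lambda uid: [9])
    # The cache now also holds "!", fetched once from the "server".
    ```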
  • Patent number: 10127231
    Abstract: Disclosed herein are systems, methods, and computer-readable media for rich media annotation, the method comprising receiving a first recorded media content, receiving at least one audio annotation about the first recorded media, extracting metadata from the at least one audio annotation, and associating all or part of the metadata with the first recorded media content. Additional data elements may also be associated with the first recorded media content. Where the audio annotation is a telephone conversation, the recorded media content may be captured via the telephone. The recorded media content, audio annotations, and/or metadata may be stored in a central repository which may be modifiable. Speech characteristics such as prosody may be analyzed to extract additional metadata. In one aspect, a specially trained grammar identifies and recognizes metadata.
    Type: Grant
    Filed: July 22, 2008
    Date of Patent: November 13, 2018
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Paul Gausman, David C. Gibbon
  • Patent number: 10127901
    Abstract: The technology relates to converting text to speech utilizing recurrent neural networks (RNNs). The recurrent neural networks may be implemented as multiple modules for determining properties of the text. In embodiments, a part-of-speech RNN module, a letter-to-sound RNN module, a linguistic prosody tagger RNN module, and a context awareness and semantic mining RNN module may all be utilized. The properties from the RNN modules are processed by a hyper-structure RNN module that determines the phonetic properties of the input text based on the outputs of the other RNN modules. The hyper-structure RNN module may generate a generation sequence that is capable of being converted to audible speech by a speech synthesizer. The generation sequence may also be optimized by a global optimization module prior to being synthesized into audible speech.
    Type: Grant
    Filed: June 13, 2014
    Date of Patent: November 13, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Pei Zhao, Max Leung, Kaisheng Yao, Bo Yan, Sheng Zhao, Fileno A. Alleva
  • Patent number: 10104452
    Abstract: Methods and systems for providing information to a user are described. Multiple mobile devices can individually collect data and feed the data to beacons in a location. The collected data can include sound data, light data, motion data, health and wellness related data, humidity data, and/or temperature data. The beacons receive this data from multiple users and transmit it to a service provider. The service provider collects the data from the beacons and provides the data to one or more users.
    Type: Grant
    Filed: May 8, 2014
    Date of Patent: October 16, 2018
    Assignee: PAYPAL, INC.
    Inventor: Michael Charles Todasco
  • Patent number: 10095686
    Abstract: Real-time topic analysis for social listening is performed to help users and organizations in discovering and understanding trending topics in varying degrees of granularity. A density-based sampling method is employed to reduce data input. A lightweight NLP method is utilized for topic extraction which provides an efficient mechanism for handling dynamically-changing content. In embodiments, the social analytics system further helps users understand the topics by ranking topics by relevance, labeling topic categories, and grouping semantically-similar topics.
    Type: Grant
    Filed: April 6, 2015
    Date of Patent: October 9, 2018
    Assignee: ADOBE SYSTEMS INCORPORATED
    Inventors: Lei Zhang, Paul Jones, Kent Aaron Otis, Jonathan Gale, Evelyn Chan
  • Patent number: 10090002
    Abstract: Mechanisms in a natural language processing (NLP) system are provided. The NLP system receives a plurality of communications associated with a communication system, over a predetermined time period, from a plurality of end user devices. The NLP system identifies, for each communication in the plurality of communications, a user submitting the communication to thereby generate a set of users comprising a plurality of users associated with the plurality of communications. The NLP system retrieves a user model for each user in the set of users, which specifies at least one attribute of a corresponding user. The NLP system generates an aggregate user model that aggregates the at least one attribute of each user in the set of users together to generate an aggregate representation of the attributes of the plurality of users in the set of users. The NLP system performs a cognitive operation based on the aggregate user model.
    Type: Grant
    Filed: December 11, 2014
    Date of Patent: October 2, 2018
    Assignee: International Business Machines Corporation
    Inventors: Corville O. Allen, Laura J. Rodriguez
  • Patent number: 10089974
    Abstract: An example text-to-speech learning system performs a method for generating a pronunciation sequence conversion model. The method includes generating a first pronunciation sequence from a speech input of a training pair and generating a second pronunciation sequence from a text input of the training pair. The method also includes determining a pronunciation sequence difference between the first pronunciation sequence and the second pronunciation sequence; and generating a pronunciation sequence conversion model based on the pronunciation sequence difference. An example speech recognition learning system performs a method for generating a pronunciation sequence conversion model. The method includes extracting an audio signal vector from a speech input and applying an audio signal conversion model to the audio signal vector to generate a converted audio signal vector. The method also includes adapting an acoustic model based on the converted audio signal vector to generate an adapted acoustic model.
    Type: Grant
    Filed: March 31, 2016
    Date of Patent: October 2, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Pei Zhao, Kaisheng Yao, Max Leung, Bo Yan
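    The "pronunciation sequence difference" step in this abstract can be illustrated with a standard sequence alignment. The sketch below uses Python's `difflib` to align the phone sequence recognized from speech with the one predicted from text; the ARPAbet-style phone labels are illustrative, and the patent does not specify this alignment method.

    ```python
    # Sketch of the comparison step: align the pronunciation sequence generated
    # from speech with the one generated from text, and collect the differences
    # that a pronunciation sequence conversion model could be trained on.

    from difflib import SequenceMatcher

    def pronunciation_diff(from_speech, from_text):
        """List (op, speech_phones, text_phones) tuples where the sequences differ."""
        matcher = SequenceMatcher(a=from_speech, b=from_text, autojunk=False)
        return [(op, from_speech[i1:i2], from_text[j1:j2])
                for op, i1, i2, j1, j2 in matcher.get_opcodes() if op != "equal"]

    # Example: the speaker said AH where the text rules predicted EY.
    diff = pronunciation_diff(["T", "AH", "M", "AA", "T", "OW"],
                              ["T", "EY", "M", "AA", "T", "OW"])
    ```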