Patents Examined by Paras D Shah
  • Patent number: 10321174
    Abstract: In an approach, a server computer receives a request from a client computer specifying particular content for a particular user, wherein the particular content is associated with an original audio language. In response to receiving the request, the server computer selects a preferred audio language and a preferred subtitle language for the particular content based on a particular record of a preference database. The server computer returns asset identifying data that the client computer uses to obtain a stream of the particular content using the preferred audio language and the preferred subtitle language from a content delivery network (CDN) or other asset location. The server computer receives a message from the client computer that identifies a presented audio language and a presented subtitle language that were presented to the particular user while the particular content streamed.
    Type: Grant
    Filed: February 14, 2017
    Date of Patent: June 11, 2019
    Assignee: NETFLIX, INC.
    Inventors: Haley Wilson, Eddy Wu, Ritesh Makharia, Taskin Rahman, Philip Fisher-Ogden
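The selection flow in this abstract can be sketched in a few lines; the preference-record shape, field names (`audio`, `subtitles`), and fallback rule below are illustrative assumptions, not details from the patent:

```python
def select_languages(prefs, user_id, content_id, original_audio):
    """Toy preference lookup: prefer a per-title record, then a per-user
    record, else fall back to the content's original audio language with
    no subtitles. All field names here are assumptions for illustration."""
    record = prefs.get((user_id, content_id)) or prefs.get(user_id)
    if record:
        return record["audio"], record["subtitles"]
    return original_audio, None
```

A client would then request the stream with the returned pair and report back which languages were actually presented.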
  • Patent number: 10319394
    Abstract: An apparatus is provided for generating a modified speech signal, having a plurality of modified subband signals, from a speech input signal having a plurality of speech subband signals. The apparatus has: a weighting information generator for generating weighting information for each speech subband signal depending on a signal power of said speech subband signal, and a signal modifier for modifying each speech subband signal by applying the weighting information to said speech subband signal to obtain a modified subband signal.
    Type: Grant
    Filed: July 8, 2015
    Date of Patent: June 11, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Jan Rennies, Henning Schepker, Simon Doclo, Jens E. Appell
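As a rough illustration of power-dependent subband weighting, the sketch below derives a gain from each band's own power; the specific rule (pulling every band toward the mean band power) is an assumption for demonstration, not the patented weighting:

```python
import math

def enhance_subbands(subbands, floor=1e-12):
    """Illustrative power-dependent weighting: compute each band's mean
    power, then scale each band so its power moves to the mean across
    bands (quieter bands get gain > 1, louder bands get gain < 1)."""
    powers = [sum(s * s for s in band) / max(len(band), 1) for band in subbands]
    mean_power = sum(powers) / len(powers)
    weighted = []
    for band, p in zip(subbands, powers):
        # Weight derived from the band's own signal power.
        w = math.sqrt(mean_power / max(p, floor))
        weighted.append([w * s for s in band])
    return weighted
```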
  • Patent number: 10311886
    Abstract: An apparatus for processing an audio signal includes a configurable first audio signal processor for processing the audio signal in accordance with different configuration settings to obtain a processed audio signal, wherein the apparatus is adapted so that different configuration settings result in different sampling rates of the processed audio signal. The apparatus furthermore includes an analysis filter bank having a first number of analysis filter bank channels, a synthesis filter bank having a second number of synthesis filter bank channels, a second audio processor being adapted to receive and process an audio signal having a predetermined sampling rate, and a controller for controlling the first number of analysis filter bank channels or the second number of synthesis filter bank channels in accordance with a configuration setting.
    Type: Grant
    Filed: February 3, 2017
    Date of Patent: June 4, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Markus Lohwasser, Manuel Jander, Max Neuendorf, Ralf Geiger, Markus Schnell, Matthias Hildenbrand, Tobias Chalupka
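One way to read the controller's job is that the ratio of synthesis to analysis channel counts sets the resampling factor between the configured rate and the rate the second processor expects. The reduced-fraction rule below is a hedged guess at such a mapping, not the patent's actual method:

```python
from math import gcd

def filterbank_channels(input_rate, target_rate):
    """Pick analysis/synthesis channel counts whose ratio equals
    target_rate / input_rate (illustrative rule of thumb), so the
    filter-bank pair resamples to the rate the second processor needs."""
    g = gcd(input_rate, target_rate)
    return input_rate // g, target_rate // g
```

For example, resampling 48 kHz material to a 32 kHz processor would use channel counts in a 3:2 ratio under this toy rule.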
  • Patent number: 10304447
    Abstract: A method for controlling a plurality of environmental factors that trigger a negative emotional state is provided. The method may include analyzing a plurality of user data when a user experiences a plurality of various environmental factors. The method may also include determining an emotional state experienced by the user when each of the plurality of various environmental factors is present based on the plurality of user data. The method may include receiving a plurality of calendar information associated with a user account. The method may also include identifying an upcoming event based on the plurality of calendar information. The method may include identifying that an environmental factor within the plurality of various environmental factors is present at the upcoming event. The method may also include, in response to determining the environmental factor causes the user to experience a negative emotional state, executing an accommodation method based on the environmental factor.
    Type: Grant
    Filed: January 25, 2017
    Date of Patent: May 28, 2019
    Assignee: International Business Machines Corporation
    Inventors: Paul R. Bastide, Matthew E. Broomhall, Robert E. Loredo, Fang Lu
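The final decision step of the method reads as a simple lookup over upcoming events; the event shape, factor-to-emotion mapping, and accommodation callable below are hypothetical stand-ins:

```python
def plan_accommodations(events, factor_emotion, accommodate):
    """Sketch of the decision step: for each upcoming event, trigger an
    accommodation for every environmental factor that the user's data
    has linked to a negative emotional state."""
    actions = []
    for event in events:
        for factor in event["factors"]:
            if factor_emotion.get(factor) == "negative":
                actions.append(accommodate(event["name"], factor))
    return actions
```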
  • Patent number: 10299058
    Abstract: In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel of the original channels, channel side information is calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data to be transmitted to a decoder which, in the case of a low-level decoder, only decodes the first and second downmix channels or, in the case of a high-level decoder, provides a full multi-channel audio signal based on the downmix channels and the channel side information.
    Type: Grant
    Filed: December 4, 2018
    Date of Patent: May 21, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Juergen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hoelzer, Claus Spenger
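The side-information idea can be illustrated with a least-squares gain: choose a weight w so that w times the downmix best approximates the selected original channel. The closed-form projection below is a standard technique offered as an illustration, not the patented computation:

```python
def channel_side_info(downmix, original, eps=1e-12):
    """Least-squares gain w minimizing ||original - w * downmix||^2:
    w = <downmix, original> / <downmix, downmix>."""
    num = sum(d * c for d, c in zip(downmix, original))
    den = sum(d * d for d in downmix) + eps
    return num / den

def reconstruct(downmix, w):
    # High-level decoder step: weight the downmix to approximate the channel.
    return [w * d for d in downmix]
```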
  • Patent number: 10275447
    Abstract: Embodiments relate to identifying a type of expression based on a particular theme. An aspect includes acquiring, by an electronic apparatus, from text data for learning, a subset of the text data associated with the particular theme and with particular time period information. Another aspect includes extracting text data containing negative information from the acquired subset of the text data. Another aspect includes extracting a word or phrase having a high correlation with the extracted text data or a word or phrase having a high appearance frequency in the extracted text data from the extracted text data. Yet another aspect includes determining that the extracted word or phrase is the type of expression based on the particular theme.
    Type: Grant
    Filed: December 12, 2016
    Date of Patent: April 30, 2019
    Assignee: International Business Machines Corporation
    Inventors: Emiko Takeuchi, Daisuke Takuma, Hirobumi Toyoshima
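A minimal sketch of the extraction pipeline, assuming a toy negative-word lexicon and pre-labeled theme/period fields (every name below is hypothetical):

```python
from collections import Counter

NEGATIVE = {"bad", "fail", "worst"}  # toy negative-word lexicon (assumption)

def theme_expressions(docs, theme, period, top_n=3):
    """Sketch of the pipeline: keep docs matching the theme and time
    period, then those containing negative words, and return the most
    frequent remaining words as candidate expressions."""
    subset = [d["text"] for d in docs
              if d["theme"] == theme and d["period"] == period]
    negative_docs = [t for t in subset if NEGATIVE & set(t.split())]
    counts = Counter(w for t in negative_docs for w in t.split()
                     if w not in NEGATIVE)
    return [w for w, _ in counts.most_common(top_n)]
```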
  • Patent number: 10255916
    Abstract: Methods, systems, and media for presenting interactive audio content are provided.
    Type: Grant
    Filed: February 27, 2017
    Date of Patent: April 9, 2019
    Assignee: PBJ Synthetics Corporation
    Inventor: Peter Zetterberg
  • Patent number: 10249314
    Abstract: A voice conversion system for generating realistic, natural-sounding target speech is disclosed. The voice conversion system preferably comprises a neural network for converting the source speech data to estimated target speech data; a global variance correction module; a modulation spectrum correction module; and a waveform generator. The global variance correction module is configured to scale and shift (or normalize and de-normalize) the estimated target speech based on (i) a mean and standard deviation of the source speech data, and further based on (ii) a mean and standard deviation of the estimated target speech data. The modulation spectrum correction module is configured to apply a plurality of filters to the estimated target speech data after it has been scaled and shifted by the global variance correction module. Each filter is designed to correct the trajectory representing the curve of one MCEP coefficient over time.
    Type: Grant
    Filed: July 21, 2017
    Date of Patent: April 2, 2019
    Assignee: OBEN, INC.
    Inventor: Sandesh Aryal
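The scale-and-shift step of the global variance correction can be sketched as normalizing the estimated trajectory by its own mean and standard deviation, then de-normalizing with the source statistics; the exact variant below is an assumption about how the two statistic pairs combine:

```python
import statistics

def global_variance_correct(estimated, source_mean, source_std):
    """Normalize the estimated trajectory by its own mean/std, then
    de-normalize using the source mean/std (illustrative variant)."""
    est_mean = statistics.fmean(estimated)
    est_std = statistics.pstdev(estimated) or 1.0  # guard a flat trajectory
    return [(x - est_mean) / est_std * source_std + source_mean
            for x in estimated]
```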
  • Patent number: 10249324
    Abstract: A method for processing sound includes generating one or more noise component estimates relating to an electrical representation of the sound and generating an associated confidence measure for the one or more noise component estimates. The method further comprises processing the sound based on the confidence measure.
    Type: Grant
    Filed: February 7, 2017
    Date of Patent: April 2, 2019
    Assignee: Cochlear Limited
    Inventors: Adam A. Hersbach, Stefan J. Mauger, John M. Heasman, Pam W. Dawson
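One hedged reading of "processing based on the confidence measure" is confidence-gated noise subtraction, where a high-confidence estimate is subtracted more aggressively than an uncertain one; this is an illustration, not the patented processing:

```python
def apply_noise_reduction(signal, noise_est, confidence):
    """Subtract each noise estimate scaled by its confidence in [0, 1]:
    confident estimates are removed fully, uncertain ones only partly."""
    return [s - c * n for s, n, c in zip(signal, noise_est, confidence)]
```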
  • Patent number: 10237674
    Abstract: In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel of the original channels, channel side information is calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data to be transmitted to a decoder which, in the case of a low-level decoder, only decodes the first and second downmix channels or, in the case of a high-level decoder, provides a full multi-channel audio signal based on the downmix channels and the channel side information.
    Type: Grant
    Filed: August 14, 2018
    Date of Patent: March 19, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Juergen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hoelzer, Claus Spenger
  • Patent number: 10229686
    Abstract: Methods and apparatus to process microphone signals by a speech enhancement module to generate an audio stream signal including first and second metadata for use by a speech recognition module. In an embodiment, speech recognition is performed using endpointing information including transitioning from a silence state to a maybe speech state, in which data is buffered, based on the first metadata and transitioning to a speech state, in which speech recognition is performed, based upon the second metadata.
    Type: Grant
    Filed: August 18, 2014
    Date of Patent: March 12, 2019
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Markus Buck, Tobias Herbig, Simon Graf, Christophe Ris
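The endpointing transitions described here map naturally onto a small state machine; in the sketch below, two per-frame boolean flags stand in for the first and second metadata:

```python
def endpoint(frames):
    """Toy endpointing state machine driven by two per-frame metadata
    flags, following silence -> maybe-speech (buffering) -> speech."""
    state, buffered, recognized = "silence", [], []
    for audio, maybe_speech, speech in frames:
        if state == "silence":
            if maybe_speech:
                state = "maybe"
                buffered.append(audio)           # start buffering on first hint
        elif state == "maybe":
            if speech:
                buffered.append(audio)
                recognized.extend(buffered)      # flush buffer to the recognizer
                buffered, state = [], "speech"
            elif maybe_speech:
                buffered.append(audio)           # keep buffering while uncertain
            else:
                buffered, state = [], "silence"  # false alarm: drop buffer
        else:                                    # "speech"
            recognized.append(audio)
    return state, recognized
```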
  • Patent number: 10229679
    Abstract: A natural language user interface for computer-aided design systems (CAD) comprises a natural language command module including a parser, language database and a CAD model analyzer, and a natural language server module including a second, increased capability parser, a second, preferably larger language database and a CAD context database. The CAD model analyzer analyzes and retrieves associated CAD model information related to a parsed voice command and the CAD context database provides specific CAD related contextual information to facilitate parsing and interpreting CAD specific commands. The natural language server program module may also include an artificial intelligence based query generator and communicate through a network or cloud with resource providers such as third party market places or suppliers to generate queries for retrieval of third party supplied information necessary to respond to or execute CAD specific voice commands.
    Type: Grant
    Filed: December 22, 2016
    Date of Patent: March 12, 2019
    Inventors: James L. Jacobs, II, John E. Cronin, Christopher M. Huffines
  • Patent number: 10224021
    Abstract: A voice synthesizing apparatus includes: a voice inputter (102) configured to input a voice; an obtainer (22) configured to obtain a primary response to the voice inputted by the voice inputter (102); an analyzer (112) configured to analyze whether the primary response includes a repetition target; and a voice synthesizer (24) configured to, in a case where the analyzed primary response is determined to include the repetition target, synthesize a voice from a secondary response that includes the repetition target repeated at least twice to output the voice.
    Type: Grant
    Filed: July 2, 2015
    Date of Patent: March 5, 2019
    Assignee: Yamaha Corporation
    Inventor: Hiroaki Matsubara
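A toy version of the repetition behavior, assuming the analyzer has already produced a list of candidate repetition targets (the simple substring rule is an assumption, not the patent's analysis):

```python
def secondary_response(primary, targets):
    """If the primary response contains a repetition target, build a
    secondary response with that target repeated twice; otherwise
    return the primary response unchanged."""
    for target in targets:
        if target in primary:
            return primary.replace(target, target + " " + target, 1)
    return primary
```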
  • Patent number: 10218954
    Abstract: A method and system can generate video content from a video. The method and system can include generating audio files and image files from the video, distributing the audio files and the image files across a plurality of processors, and processing the audio files and the image files in parallel. The audio files associated with the video can be converted to text, and the image files associated with the video can be converted to video content. The text and the video content can be cross-referenced with the video.
    Type: Grant
    Filed: February 7, 2015
    Date of Patent: February 26, 2019
    Assignee: CELLULAR SOUTH, INC.
    Inventors: Naeem Lakhani, Bartlett Wade Smith, IV, Allison A. Talley
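The fan-out step can be sketched with a thread pool standing in for the plurality of processors; the two converter callables are placeholders for the audio-to-text and image-to-content stages:

```python
from concurrent.futures import ThreadPoolExecutor

def process_video_parts(audio_files, image_files, to_text, to_content, workers=4):
    """Distribute the audio and image files across a worker pool and
    convert both sets in parallel, returning (texts, contents)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        texts = list(pool.map(to_text, audio_files))
        contents = list(pool.map(to_content, image_files))
    return texts, contents
```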
  • Patent number: 10204621
    Abstract: A computer-implemented method according to one embodiment includes estimating a speaker dependent acoustic model utilizing test speech data and a hybrid estimation technique, transforming labeled speech data to create transformed speech data, utilizing the speaker dependent acoustic model and a nonlinear transformation, and adjusting a deep neural network (DNN) acoustic model, utilizing the transformed speech data.
    Type: Grant
    Filed: September 7, 2016
    Date of Patent: February 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Xiaodong Cui, Vaibhava Goel
  • Patent number: 10206054
    Abstract: In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel of the original channels, channel side information is calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data to be transmitted to a decoder which, in the case of a low-level decoder, only decodes the first and second downmix channels or, in the case of a high-level decoder, provides a full multi-channel audio signal based on the downmix channels and the channel side information.
    Type: Grant
    Filed: August 14, 2018
    Date of Patent: February 12, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Juergen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hoelzer, Claus Spenger
  • Patent number: 10204620
    Abstract: A computer-implemented method according to one embodiment includes estimating a speaker dependent acoustic model utilizing test speech data and maximum likelihood linear regression (MLLR), transforming labeled speech data to create transformed speech data, utilizing the speaker dependent acoustic model and a linear transformation, and adjusting a deep neural network (DNN) acoustic model, utilizing the transformed speech data.
    Type: Grant
    Filed: September 7, 2016
    Date of Patent: February 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Xiaodong Cui, Vaibhava Goel
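The linear feature transformation at the heart of MLLR-style adaptation is y = A·x + b applied to each feature vector; a dependency-free sketch (the matrices here are illustrative, not estimated from data):

```python
def apply_linear_transform(features, A, b):
    """Apply y = A @ x + b to each feature vector x, written without
    numpy so the sketch is self-contained."""
    out = []
    for x in features:
        y = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
             for row, b_i in zip(A, b)]
        out.append(y)
    return out
```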
  • Patent number: 10199046
    Abstract: An encoder includes a periodic-combined-envelope generating part and a variable-length coding part. The periodic-combined-envelope generating part generates a periodic combined envelope sequence which is a frequency-domain sequence based on a spectral envelope sequence which is a frequency-domain sequence corresponding to a linear predictive coefficient code obtained from an input audio signal and on a frequency-domain period. The variable-length coding part encodes a frequency-domain sequence derived from the input audio signal. A decoder includes a periodic-combined-envelope generating part and a variable-length decoding part. The periodic-combined-envelope generating part generates a periodic combined envelope sequence which is a frequency-domain sequence based on a spectral envelope sequence which is a frequency-domain sequence corresponding to a linear predictive coefficient code and on a frequency-domain period.
    Type: Grant
    Filed: February 20, 2015
    Date of Patent: February 5, 2019
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Takehiro Moriya, Yutaka Kamamoto, Noboru Harada
  • Patent number: 10200053
    Abstract: Droop caused by a filter may be compensated by applying a pre-filter to the audio signal that cancels out, at least in part, the droop caused by the filter. The pre-filter may implement magnitude compensation that causes an approximately flat passband response when the pre-filtered signal is passed through the filter. The pre-filter may be applied to one-bit wide data streams, such as high-fidelity direct stream digital (DSD) audio data or other one-bit wide data such as pulse-density modulation (PDM) encoded data. The pre-filtering and filtering may be implemented in components of an audio processor, such as in a digital-to-analog converter (DAC). The pre-filtering may include upsampling the one-bit wide data to form symbols and substituting an eighth bit of the symbol with an inverted version of an earlier-received bit.
    Type: Grant
    Filed: April 14, 2016
    Date of Patent: February 5, 2019
    Assignee: Cirrus Logic, Inc.
    Inventor: Dylan Alexander Hester
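The bit-substitution detail can be illustrated directly; the lookback distance and the repeat-by-eight upsampling below are assumptions about parameters the abstract leaves unstated:

```python
def prefilter_bits(bits, lookback=8):
    """Repeat each one-bit input eight times to form a symbol, then
    replace the symbol's eighth bit with the inverted bit received
    `lookback` positions earlier (illustrative parameters)."""
    out = []
    for i, b in enumerate(bits):
        symbol = [b] * 8
        earlier = bits[i - lookback] if i >= lookback else b
        symbol[7] = 1 - earlier  # inverted earlier-received bit
        out.extend(symbol)
    return out
```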
  • Patent number: 10199034
    Abstract: A system, method and computer-readable storage devices are provided for using a single set of normalization protocols and a single language lexicon (or dictionary) for both TTS and ASR. The system receives input (which is either text to be converted to speech or ASR training text), then normalizes the input. The system produces, using the normalized input and a dictionary configured for both automatic speech recognition and text-to-speech processing, output which is either phonemes corresponding to the input or text corresponding to the input for training the ASR system. When the output is phonemes corresponding to the input, the system generates speech by performing prosody generation and unit selection synthesis using the phonemes. When the output is text corresponding to the input, the system trains both an acoustic model and a language model for use in future speech recognition.
    Type: Grant
    Filed: August 18, 2014
    Date of Patent: February 5, 2019
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Alistair D. Conkie, Ladan Golipour
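A minimal sketch of one lexicon and one normalization path serving both directions (the digit map and lexicon entries are toy assumptions):

```python
LEXICON = {"two": ["T", "UW"], "cats": ["K", "AE", "T", "S"]}  # toy entries

def normalize(text):
    """One normalization path shared by TTS and ASR training
    (the digit map is a toy assumption)."""
    digits = {"2": "two"}
    return " ".join(digits.get(w, w.lower()) for w in text.split())

def to_phonemes(text):
    """TTS direction: look up phonemes for the normalized input."""
    return [p for w in normalize(text).split() for p in LEXICON.get(w, ["<unk>"])]

def to_training_text(text):
    """ASR direction: the same normalized text feeds language-model training."""
    return normalize(text)
```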