Patents Examined by Paras D Shah
  • Patent number: 10321174
    Abstract: In an approach, a server computer receives a request from a client computer specifying particular content for a particular user, wherein the particular content is associated with an original audio language. In response to receiving the request, the server computer selects a preferred audio language and a preferred subtitle language for the particular content based on a particular record of a preference database. The server computer returns asset identifying data that the client computer uses to obtain a stream of the particular content using the preferred audio language and the preferred subtitle language from a content delivery network (CDN) or other asset location. The server computer receives a message from the client computer that identifies a presented audio language and a presented subtitle language that were presented to the particular user while the particular content streamed.
    Type: Grant
    Filed: February 14, 2017
    Date of Patent: June 11, 2019
    Assignee: NETFLIX, INC.
    Inventors: Haley Wilson, Eddy Wu, Ritesh Makharia, Taskin Rahman, Philip Fisher-Ogden
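The selection flow in this abstract can be sketched in a few lines; the preference-record shape, field names (`audio`, `subtitles`), and fallback rule below are illustrative assumptions, not details from the patent:

```python
def select_languages(prefs, user_id, content_id, original_audio):
    """Toy preference lookup: prefer a per-title record, then a per-user
    record, else fall back to the content's original audio language with
    no subtitles. All field names here are assumptions for illustration."""
    record = prefs.get((user_id, content_id)) or prefs.get(user_id)
    if record:
        return record["audio"], record["subtitles"]
    return original_audio, None
```

A client would then request the stream with the returned pair and report back which languages were actually presented.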
  • Patent number: 10319394
    Abstract: An apparatus is provided for generating a modified speech signal, having a plurality of modified subband signals, from a speech input signal having a plurality of speech subband signals. The apparatus has: a weighting information generator for generating weighting information for each speech subband signal depending on a signal power of said speech subband signal, and a signal modifier for modifying each speech subband signal by applying the weighting information to said speech subband signal to obtain a modified subband signal.
    Type: Grant
    Filed: July 8, 2015
    Date of Patent: June 11, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Jan Rennies, Henning Schepker, Simon Doclo, Jens E. Appell
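As a rough illustration of power-dependent subband weighting, the sketch below derives a gain from each band's own power; the specific rule (pulling every band toward the mean band power) is an assumption for demonstration, not the patented weighting:

```python
import math

def enhance_subbands(subbands, floor=1e-12):
    """Illustrative power-dependent weighting: compute each band's mean
    power, then scale each band so its power moves to the mean across
    bands (quieter bands get gain > 1, louder bands get gain < 1)."""
    powers = [sum(s * s for s in band) / max(len(band), 1) for band in subbands]
    mean_power = sum(powers) / len(powers)
    weighted = []
    for band, p in zip(subbands, powers):
        # Weight derived from the band's own signal power.
        w = math.sqrt(mean_power / max(p, floor))
        weighted.append([w * s for s in band])
    return weighted
```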
  • Patent number: 10311886
    Abstract: An apparatus for processing an audio signal includes a configurable first audio signal processor for processing the audio signal in accordance with different configuration settings to obtain a processed audio signal, wherein the apparatus is adapted so that different configuration settings result in different sampling rates of the processed audio signal. The apparatus furthermore includes an analysis filter bank having a first number of analysis filter bank channels, a synthesis filter bank having a second number of synthesis filter bank channels, a second audio processor being adapted to receive and process an audio signal having a predetermined sampling rate, and a controller for controlling the first number of analysis filter bank channels or the second number of synthesis filter bank channels in accordance with a configuration setting.
    Type: Grant
    Filed: February 3, 2017
    Date of Patent: June 4, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Markus Lohwasser, Manuel Jander, Max Neuendorf, Ralf Geiger, Markus Schnell, Matthias Hildenbrand, Tobias Chalupka
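One way to read the controller's job is that the ratio of synthesis to analysis channel counts sets the resampling factor between the configured rate and the rate the second processor expects. The reduced-fraction rule below is a hedged guess at such a mapping, not the patent's actual method:

```python
from math import gcd

def filterbank_channels(input_rate, target_rate):
    """Pick analysis/synthesis channel counts whose ratio equals
    target_rate / input_rate (illustrative rule of thumb), so the
    filter-bank pair resamples to the rate the second processor needs."""
    g = gcd(input_rate, target_rate)
    return input_rate // g, target_rate // g
```

For example, resampling 48 kHz material to a 32 kHz processor would use channel counts in a 3:2 ratio under this toy rule.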
  • Patent number: 10304447
    Abstract: A method for controlling a plurality of environmental factors that trigger a negative emotional state is provided. The method may include analyzing a plurality of user data when a user experiences a plurality of various environmental factors. The method may also include determining an emotional state experienced by the user when each of the plurality of various environmental factors is present based on the plurality of user data. The method may include receiving a plurality of calendar information associated with a user account. The method may also include identifying an upcoming event based on the plurality of calendar information. The method may include identifying that an environmental factor within the plurality of various environmental factors is present at the upcoming event. The method may also include, in response to determining the environmental factor causes the user to experience a negative emotional state, executing an accommodation method based on the environmental factor.
    Type: Grant
    Filed: January 25, 2017
    Date of Patent: May 28, 2019
    Assignee: International Business Machines Corporation
    Inventors: Paul R. Bastide, Matthew E. Broomhall, Robert E. Loredo, Fang Lu
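The final decision step of the method reads as a simple lookup over upcoming events; the event shape, factor-to-emotion mapping, and accommodation callable below are hypothetical stand-ins:

```python
def plan_accommodations(events, factor_emotion, accommodate):
    """Sketch of the decision step: for each upcoming event, trigger an
    accommodation for every environmental factor that the user's data
    has linked to a negative emotional state."""
    actions = []
    for event in events:
        for factor in event["factors"]:
            if factor_emotion.get(factor) == "negative":
                actions.append(accommodate(event["name"], factor))
    return actions
```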
  • Patent number: 10299058
    Abstract: In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel of the original channels, channel side information is calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data to be transmitted to a decoder which, in the case of a low-level decoder, only decodes the first and second downmix channels or, in the case of a high-level decoder, provides a full multi-channel audio signal based on the downmix channels and the channel side information.
    Type: Grant
    Filed: December 4, 2018
    Date of Patent: May 21, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Juergen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hoelzer, Claus Spenger
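The side-information idea can be illustrated with a least-squares gain: choose a weight w so that w times the downmix best approximates the selected original channel. The closed-form projection below is a standard technique offered as an illustration, not the patented computation:

```python
def channel_side_info(downmix, original, eps=1e-12):
    """Least-squares gain w minimizing ||original - w * downmix||^2:
    w = <downmix, original> / <downmix, downmix>."""
    num = sum(d * c for d, c in zip(downmix, original))
    den = sum(d * d for d in downmix) + eps
    return num / den

def reconstruct(downmix, w):
    # High-level decoder step: weight the downmix to approximate the channel.
    return [w * d for d in downmix]
```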
  • Patent number: 10275447
    Abstract: Embodiments relate to identifying a type of expression based on a particular theme. An aspect includes acquiring, by an electronic apparatus, from text data for learning, a subset of the text data associated with the particular theme and with particular time period information. Another aspect includes extracting text data containing negative information from the acquired subset of the text data. Another aspect includes extracting a word or phrase having a high correlation with the extracted text data or a word or phrase having a high appearance frequency in the extracted text data from the extracted text data. Yet another aspect includes determining that the extracted word or phrase is the type of expression based on the particular theme.
    Type: Grant
    Filed: December 12, 2016
    Date of Patent: April 30, 2019
    Assignee: International Business Machines Corporation
    Inventors: Emiko Takeuchi, Daisuke Takuma, Hirobumi Toyoshima
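A minimal sketch of the extraction pipeline, assuming a toy negative-word lexicon and pre-labeled theme/period fields (every name below is hypothetical):

```python
from collections import Counter

NEGATIVE = {"bad", "fail", "worst"}  # toy negative-word lexicon (assumption)

def theme_expressions(docs, theme, period, top_n=3):
    """Sketch of the pipeline: keep docs matching the theme and time
    period, then those containing negative words, and return the most
    frequent remaining words as candidate expressions."""
    subset = [d["text"] for d in docs
              if d["theme"] == theme and d["period"] == period]
    negative_docs = [t for t in subset if NEGATIVE & set(t.split())]
    counts = Counter(w for t in negative_docs for w in t.split()
                     if w not in NEGATIVE)
    return [w for w, _ in counts.most_common(top_n)]
```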
  • Patent number: 10255916
    Abstract: Methods, systems, and media for presenting interactive audio content are provided.
    Type: Grant
    Filed: February 27, 2017
    Date of Patent: April 9, 2019
    Assignee: PBJ Synthetics Corporation
    Inventor: Peter Zetterberg
  • Patent number: 10249314
    Abstract: A voice conversion system for generating realistic, natural-sounding target speech is disclosed. The voice conversion system preferably comprises a neural network for converting the source speech data to estimated target speech data; a global variance correction module; a modulation spectrum correction module; and a waveform generator. The global variance correction module is configured to scale and shift (or normalize and de-normalize) the estimated target speech based on (i) a mean and standard deviation of the source speech data, and further based on (ii) a mean and standard deviation of the estimated target speech data. The modulation spectrum correction module is configured to apply a plurality of filters to the estimated target speech data after it has been scaled and shifted by the global variance correction module. Each filter is designed to correct the trajectory representing the curve of one MCEP coefficient over time.
    Type: Grant
    Filed: July 21, 2017
    Date of Patent: April 2, 2019
    Assignee: OBEN, INC.
    Inventor: Sandesh Aryal
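The scale-and-shift step of the global variance correction can be sketched as normalizing the estimated trajectory by its own mean and standard deviation, then de-normalizing with the source statistics; the exact variant below is an assumption about how the two statistic pairs combine:

```python
import statistics

def global_variance_correct(estimated, source_mean, source_std):
    """Normalize the estimated trajectory by its own mean/std, then
    de-normalize using the source mean/std (illustrative variant)."""
    est_mean = statistics.fmean(estimated)
    est_std = statistics.pstdev(estimated) or 1.0  # guard a flat trajectory
    return [(x - est_mean) / est_std * source_std + source_mean
            for x in estimated]
```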
  • Patent number: 10249324
    Abstract: A method for processing sound includes generating one or more noise component estimates relating to an electrical representation of the sound and generating an associated confidence measure for the one or more noise component estimates. The method further comprises processing the sound based on the confidence measure.
    Type: Grant
    Filed: February 7, 2017
    Date of Patent: April 2, 2019
    Assignee: Cochlear Limited
    Inventors: Adam A. Hersbach, Stefan J. Mauger, John M. Heasman, Pam W. Dawson
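One hedged reading of "processing based on the confidence measure" is confidence-gated noise subtraction, where a high-confidence estimate is subtracted more aggressively than an uncertain one; this is an illustration, not the patented processing:

```python
def apply_noise_reduction(signal, noise_est, confidence):
    """Subtract each noise estimate scaled by its confidence in [0, 1]:
    confident estimates are removed fully, uncertain ones only partly."""
    return [s - c * n for s, n, c in zip(signal, noise_est, confidence)]
```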
  • Patent number: 10237674
    Abstract: In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel of the original channels, channel side information is calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data to be transmitted to a decoder which, in the case of a low-level decoder, only decodes the first and second downmix channels or, in the case of a high-level decoder, provides a full multi-channel audio signal based on the downmix channels and the channel side information.
    Type: Grant
    Filed: August 14, 2018
    Date of Patent: March 19, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Juergen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hoelzer, Claus Spenger
  • Patent number: 10229686
    Abstract: Methods and apparatus to process microphone signals by a speech enhancement module to generate an audio stream signal including first and second metadata for use by a speech recognition module. In an embodiment, speech recognition is performed using endpointing information including transitioning from a silence state to a maybe speech state, in which data is buffered, based on the first metadata and transitioning to a speech state, in which speech recognition is performed, based upon the second metadata.
    Type: Grant
    Filed: August 18, 2014
    Date of Patent: March 12, 2019
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Markus Buck, Tobias Herbig, Simon Graf, Christophe Ris
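The endpointing transitions described here map naturally onto a small state machine; in the sketch below, two per-frame boolean flags stand in for the first and second metadata:

```python
def endpoint(frames):
    """Toy endpointing state machine driven by two per-frame metadata
    flags, following silence -> maybe-speech (buffering) -> speech."""
    state, buffered, recognized = "silence", [], []
    for audio, maybe_speech, speech in frames:
        if state == "silence":
            if maybe_speech:
                state = "maybe"
                buffered.append(audio)           # start buffering on first hint
        elif state == "maybe":
            if speech:
                buffered.append(audio)
                recognized.extend(buffered)      # flush buffer to the recognizer
                buffered, state = [], "speech"
            elif maybe_speech:
                buffered.append(audio)           # keep buffering while uncertain
            else:
                buffered, state = [], "silence"  # false alarm: drop buffer
        else:                                    # "speech"
            recognized.append(audio)
    return state, recognized
```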
  • Patent number: 10229679
    Abstract: A natural language user interface for computer-aided design systems (CAD) comprises a natural language command module including a parser, language database and a CAD model analyzer, and a natural language server module including a second, increased capability parser, a second, preferably larger language database and a CAD context database. The CAD model analyzer analyzes and retrieves associated CAD model information related to a parsed voice command and the CAD context database provides specific CAD related contextual information to facilitate parsing and interpreting CAD specific commands. The natural language server program module may also include an artificial intelligence based query generator and communicate through a network or cloud with resource providers such as third party market places or suppliers to generate queries for retrieval of third party supplied information necessary to respond to or execute CAD specific voice commands.
    Type: Grant
    Filed: December 22, 2016
    Date of Patent: March 12, 2019
    Inventors: James L. Jacobs, II, John E. Cronin, Christopher M. Huffines
  • Patent number: 10224021
    Abstract: A voice synthesizing apparatus includes: a voice inputter (102) configured to input a voice; an obtainer (22) configured to obtain a primary response to the voice inputted by the voice inputter (102); an analyzer (112) configured to analyze whether the primary response includes a repetition target; and a voice synthesizer (24) configured to, in a case where the analyzed primary response is determined to include the repetition target, synthesize a voice from a secondary response that includes the repetition target repeated at least twice to output the voice.
    Type: Grant
    Filed: July 2, 2015
    Date of Patent: March 5, 2019
    Assignee: Yamaha Corporation
    Inventor: Hiroaki Matsubara
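A toy version of the repetition behavior, assuming the analyzer has already produced a list of candidate repetition targets (the simple substring rule is an assumption, not the patent's analysis):

```python
def secondary_response(primary, targets):
    """If the primary response contains a repetition target, build a
    secondary response with that target repeated twice; otherwise
    return the primary response unchanged."""
    for target in targets:
        if target in primary:
            return primary.replace(target, target + " " + target, 1)
    return primary
```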
  • Patent number: 10218954
    Abstract: A method and system can generate video content from a video. The method and system can include generating audio files and image files from the video, distributing the audio files and the image files across a plurality of processors, and processing the audio files and the image files in parallel. The audio files associated with the video can be converted to text, and the image files associated with the video can be converted to video content. The text and the video content can be cross-referenced with the video.
    Type: Grant
    Filed: February 7, 2015
    Date of Patent: February 26, 2019
    Assignee: CELLULAR SOUTH, INC.
    Inventors: Naeem Lakhani, Bartlett Wade Smith, IV, Allison A. Talley
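The fan-out step can be sketched with a thread pool standing in for the plurality of processors; the two converter callables are placeholders for the audio-to-text and image-to-content stages:

```python
from concurrent.futures import ThreadPoolExecutor

def process_video_parts(audio_files, image_files, to_text, to_content, workers=4):
    """Distribute the audio and image files across a worker pool and
    convert both sets in parallel, returning (texts, contents)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        texts = list(pool.map(to_text, audio_files))
        contents = list(pool.map(to_content, image_files))
    return texts, contents
```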
  • Patent number: 10204621
    Abstract: A computer-implemented method according to one embodiment includes estimating a speaker dependent acoustic model utilizing test speech data and a hybrid estimation technique, transforming labeled speech data to create transformed speech data, utilizing the speaker dependent acoustic model and a nonlinear transformation, and adjusting a deep neural network (DNN) acoustic model, utilizing the transformed speech data.
    Type: Grant
    Filed: September 7, 2016
    Date of Patent: February 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Xiaodong Cui, Vaibhava Goel
  • Patent number: 10206054
    Abstract: In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel of the original channels, channel side information is calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data to be transmitted to a decoder which, in the case of a low-level decoder, only decodes the first and second downmix channels or, in the case of a high-level decoder, provides a full multi-channel audio signal based on the downmix channels and the channel side information.
    Type: Grant
    Filed: August 14, 2018
    Date of Patent: February 12, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Juergen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hoelzer, Claus Spenger
  • Patent number: 10204620
    Abstract: A computer-implemented method according to one embodiment includes estimating a speaker dependent acoustic model utilizing test speech data and maximum likelihood linear regression (MLLR), transforming labeled speech data to create transformed speech data, utilizing the speaker dependent acoustic model and a linear transformation, and adjusting a deep neural network (DNN) acoustic model, utilizing the transformed speech data.
    Type: Grant
    Filed: September 7, 2016
    Date of Patent: February 12, 2019
    Assignee: International Business Machines Corporation
    Inventors: Xiaodong Cui, Vaibhava Goel
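The linear feature transformation at the heart of MLLR-style adaptation is y = A·x + b applied to each feature vector; a dependency-free sketch (the matrices here are illustrative, not estimated from data):

```python
def apply_linear_transform(features, A, b):
    """Apply y = A @ x + b to each feature vector x, written without
    numpy so the sketch is self-contained."""
    out = []
    for x in features:
        y = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
             for row, b_i in zip(A, b)]
        out.append(y)
    return out
```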
  • Patent number: 10199046
    Abstract: An encoder includes a periodic-combined-envelope generating part and a variable-length coding part. The periodic-combined-envelope generating part generates a periodic combined envelope sequence which is a frequency-domain sequence based on a spectral envelope sequence which is a frequency-domain sequence corresponding to a linear predictive coefficient code obtained from an input audio signal and on a frequency-domain period. The variable-length coding part encodes a frequency-domain sequence derived from the input audio signal. A decoder includes a periodic-combined-envelope generating part and a variable-length decoding part. The periodic-combined-envelope generating part generates a periodic combined envelope sequence which is a frequency-domain sequence based on a spectral envelope sequence which is a frequency-domain sequence corresponding to a linear predictive coefficient code and on a frequency-domain period.
    Type: Grant
    Filed: February 20, 2015
    Date of Patent: February 5, 2019
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Takehiro Moriya, Yutaka Kamamoto, Noboru Harada
  • Patent number: 10200053
    Abstract: Droop caused by a filter may be compensated by applying a pre-filter to the audio signal that cancels out, at least in part, the droop caused by the filter. The pre-filter may implement magnitude compensation that causes an approximately flat passband response when the pre-filtered signal is passed through the filter. The pre-filter may be applied to one-bit wide data streams, such as high-fidelity direct stream digital (DSD) audio data or other one-bit wide data such as pulse-density modulation (PDM) encoded data. The pre-filtering and filtering may be implemented in components of an audio processor, such as in a digital-to-analog converter (DAC). The pre-filtering may include upsampling the one-bit wide data to form symbols and substituting an eighth bit of the symbol with an inverted version of an earlier-received bit.
    Type: Grant
    Filed: April 14, 2016
    Date of Patent: February 5, 2019
    Assignee: Cirrus Logic, Inc.
    Inventor: Dylan Alexander Hester
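The bit-substitution detail can be illustrated directly; the lookback distance and the repeat-by-eight upsampling below are assumptions about parameters the abstract leaves unstated:

```python
def prefilter_bits(bits, lookback=8):
    """Repeat each one-bit input eight times to form a symbol, then
    replace the symbol's eighth bit with the inverted bit received
    `lookback` positions earlier (illustrative parameters)."""
    out = []
    for i, b in enumerate(bits):
        symbol = [b] * 8
        earlier = bits[i - lookback] if i >= lookback else b
        symbol[7] = 1 - earlier  # inverted earlier-received bit
        out.extend(symbol)
    return out
```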
  • Patent number: 10199034
    Abstract: A system, method and computer-readable storage devices are provided for using a single set of normalization protocols and a single language lexicon (or dictionary) for both TTS and ASR. The system receives input (which is either text to be converted to speech or ASR training text), then normalizes the input. The system produces, using the normalized input and a dictionary configured for both automatic speech recognition and text-to-speech processing, output which is either phonemes corresponding to the input or text corresponding to the input for training the ASR system. When the output is phonemes corresponding to the input, the system generates speech by performing prosody generation and unit selection synthesis using the phonemes. When the output is text corresponding to the input, the system trains both an acoustic model and a language model for use in future speech recognition.
    Type: Grant
    Filed: August 18, 2014
    Date of Patent: February 5, 2019
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Alistair D. Conkie, Ladan Golipour
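A minimal sketch of one lexicon and one normalization path serving both directions (the digit map and lexicon entries are toy assumptions):

```python
LEXICON = {"two": ["T", "UW"], "cats": ["K", "AE", "T", "S"]}  # toy entries

def normalize(text):
    """One normalization path shared by TTS and ASR training
    (the digit map is a toy assumption)."""
    digits = {"2": "two"}
    return " ".join(digits.get(w, w.lower()) for w in text.split())

def to_phonemes(text):
    """TTS direction: look up phonemes for the normalized input."""
    return [p for w in normalize(text).split() for p in LEXICON.get(w, ["<unk>"])]

def to_training_text(text):
    """ASR direction: the same normalized text feeds language-model training."""
    return normalize(text)
```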