Patents Examined by Stephen Brinich
  • Patent number: 10255503
    Abstract: There is disclosed a method for generating movie recommendations based on automatic extraction of features from multimedia content, wherein the extracted features are visual features representing mise-en-scène characteristics of the movie, defined on the basis of Applied Media Aesthetic theory; said extracted features are then fed to a content-based recommendation algorithm in order to generate personalized recommendations.
    Type: Grant
    Filed: September 27, 2016
    Date of Patent: April 9, 2019
    Assignee: POLITECNICO DI MILANO
    Inventors: Paolo Cremonesi, Mehdi Elahi, Yashar Deldjoo
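The ranking step the abstract describes (extracted visual features fed to a content-based recommender) might be sketched as below; the cosine-similarity measure, the feature encoding, and all names are illustrative assumptions, not details from the patent:

```python
import math

def cosine(a, b):
    # cosine similarity between two feature vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def recommend(user_profile, catalog, k=2):
    # catalog maps movie id -> mise-en-scene feature vector
    # (e.g. average shot length, color variance); rank by similarity
    ranked = sorted(catalog, key=lambda m: cosine(user_profile, catalog[m]),
                    reverse=True)
    return ranked[:k]
```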
  • Patent number: 10229700
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for detecting voice activity. In one aspect, a method includes actions of receiving, by a neural network included in an automated voice activity detection system, a raw audio waveform; processing, by the neural network, the raw audio waveform to determine whether the audio waveform includes speech; and providing, by the neural network, a classification of the raw audio waveform indicating whether the raw audio waveform includes speech.
    Type: Grant
    Filed: January 4, 2016
    Date of Patent: March 12, 2019
    Assignee: Google LLC
    Inventors: Tara N. Sainath, Gabor Simko, Maria Carolina Parada San Martin, Ruben Zazo Candil
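The patent's classifier is a neural network operating on raw waveforms; as a crude non-neural stand-in for the same speech/non-speech decision, an RMS energy gate can illustrate the task (the frame length and threshold are assumptions):

```python
import math

def rms_vad(waveform, frame_len=160, threshold=0.01):
    # stand-in for the neural classifier: flag speech if any
    # frame's RMS amplitude exceeds a fixed threshold
    for i in range(0, len(waveform) - frame_len + 1, frame_len):
        frame = waveform[i:i + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        if rms > threshold:
            return True
    return False
```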
  • Patent number: 10186255
    Abstract: A method for generating a language model for an organization includes: receiving, by a processor, organization-specific training data; receiving, by the processor, generic training data; computing, by the processor, a plurality of similarities between the generic training data and the organization-specific training data; assigning, by the processor, a plurality of weights to the generic training data in accordance with the computed similarities; combining, by the processor, the generic training data with the organization-specific training data in accordance with the weights to generate customized training data; training, by the processor, a customized language model using the customized training data; and outputting, by the processor, the customized language model, the customized language model being configured to compute the likelihood of phrases in a medium.
    Type: Grant
    Filed: August 25, 2016
    Date of Patent: January 22, 2019
    Inventors: Tamir Tapuhi, Amir Lev-Tov, Avraham Faizakof, Yochai Konig
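The similarity-weighted combination of generic and organization-specific training data might look like the following sketch; the Jaccard similarity and unigram counts are illustrative stand-ins for whatever measures the patent actually uses:

```python
from collections import Counter

def jaccard(a, b):
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def customized_counts(org_sents, generic_sents):
    # organization sentences count fully; each generic sentence is
    # weighted by its best similarity to the organization-specific data
    counts = Counter()
    for s in org_sents:
        for w in s.split():
            counts[w] += 1.0
    for s in generic_sents:
        weight = max(jaccard(s, o) for o in org_sents)
        for w in s.split():
            counts[w] += weight
    return counts
```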
  • Patent number: 10186282
    Abstract: Systems and processes for robust end-pointing of speech signals using speaker recognition are provided. In one example process, a stream of audio having a spoken user request can be received. A first likelihood that the stream of audio includes user speech can be determined. A second likelihood that the stream of audio includes user speech spoken by an authorized user can be determined. A start-point or an end-point of the spoken user request can be determined based at least in part on the first likelihood and the second likelihood.
    Type: Grant
    Filed: April 30, 2015
    Date of Patent: January 22, 2019
    Assignee: Apple Inc.
    Inventors: Devang K. Naik, Sachin Kajarekar
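Combining the two per-frame likelihoods to locate a start-point and end-point can be sketched minimally; the product combination and the threshold are assumptions, not the patented rule:

```python
def endpoint(p_speech, p_authorized, threshold=0.5):
    # combine per-frame speech and authorized-speaker likelihoods and
    # return the (start, end) frame indices of the spoken request
    combined = [a * b for a, b in zip(p_speech, p_authorized)]
    above = [i for i, c in enumerate(combined) if c >= threshold]
    return (above[0], above[-1]) if above else None
```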
  • Patent number: 10170103
    Abstract: A method, a system, and a computer program product are provided for discriminatively training a feature-space transform. The method includes performing feature-space discriminative training (f-DT) on an initialized feature-space transform, using manually transcribed data, to obtain a pre-stage trained feature-space transform. The method further includes performing f-DT on the pre-stage trained feature-space transform as a newly initialized feature-space transform, using automatically transcribed data, to obtain a main-stage trained feature-space transform. The method additionally includes performing f-DT on the main-stage trained feature-space transform as a newly initialized feature-space transform, using manually transcribed data, to obtain a post-stage trained feature-space transform.
    Type: Grant
    Filed: January 22, 2016
    Date of Patent: January 1, 2019
    Assignee: International Business Machines Corporation
    Inventor: Takashi Fukuda
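The three-stage structure (manual, then automatic, then manual data again) is the recognizable part of the abstract; the update rule below is a scalar placeholder standing in for feature-space discriminative training, shown only to make the staging concrete:

```python
def fdt_stage(transform, data, lr=0.5):
    # placeholder update standing in for f-DT: nudge the (scalar)
    # transform toward the mean of the training data
    mean = sum(data) / len(data)
    return transform + lr * (mean - transform)

def three_stage_fdt(init, manual_data, auto_data):
    pre = fdt_stage(init, manual_data)   # pre-stage: manually transcribed data
    main = fdt_stage(pre, auto_data)     # main-stage: automatically transcribed data
    post = fdt_stage(main, manual_data)  # post-stage: manual data again
    return post
```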
  • Patent number: 10127922
    Abstract: A sound source identification apparatus includes a sound collection unit including a plurality of microphones, a sound source localization unit configured to localize a sound source on the basis of an acoustic signal collected by the sound collection unit, a sound source separation unit configured to perform separation of the sound source on the basis of the signal localized by the sound source localization unit, and a sound source identification unit configured to identify a type of sound source on the basis of a result of the separation in the sound source separation unit, wherein a signal input to the sound source identification unit has a magnitude equal to or greater than a first threshold value, which is a predetermined value.
    Type: Grant
    Filed: August 3, 2016
    Date of Patent: November 13, 2018
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Kazuhiro Nakadai, Satoshi Uemura
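The threshold gate in front of the identification stage can be sketched as follows; the peak-magnitude test and the `classify` callback are assumptions used for illustration:

```python
def identify_sources(separated_signals, classify, threshold=0.2):
    # only separated signals whose peak magnitude reaches the first
    # (predetermined) threshold are passed on to the identification stage
    results = []
    for sig in separated_signals:
        if max(abs(s) for s in sig) >= threshold:
            results.append(classify(sig))
    return results
```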
  • Patent number: 10114817
    Abstract: Techniques for identifying multilingual cognates and using the multilingual cognates are provided. In one technique, multilingual cognates identified from multiple user profiles are used to train one or more translation models. In another technique, multilingual cognates identified from a single user's profile are used to translate text provided by that user. In another technique, multilingual cognates from a single user are used to align sentences in one language to sentences in another language, and the aligned sentences are used to train a language model. In another technique, multilingual cognates identified from multiple user profiles are used to expand search queries. In another technique, multilingual cognates identified from multiple user profiles are used to translate other users' profiles into a target language so that users associated with a source language can view those profiles.
    Type: Grant
    Filed: August 6, 2015
    Date of Patent: October 30, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Bing Zhao, Kin Kan
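One plausible way to surface cognate pairs across the two languages of a profile is near-identical spelling; the Levenshtein-distance criterion below is an illustrative assumption, not the patent's method:

```python
def edit_distance(a, b):
    # classic dynamic-programming Levenshtein distance
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def find_cognates(terms_l1, terms_l2, max_dist=1):
    # pair near-identical terms across the two languages of one profile
    return [(x, y) for x in terms_l1 for y in terms_l2
            if edit_distance(x.lower(), y.lower()) <= max_dist]
```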
  • Patent number: 10098604
    Abstract: A medical image processing device includes a port, a processor and a display. The port acquires a plurality of image data from a living body. The processor classifies the plurality of image data to generate a plurality of image groups based on a first time component. The first time component is defined by a first time interval among imaging times at which the plurality of image data are generated. The processor correlates each image data in one image group with each image data in another image group, based on both an actual time and a time ratio of a second time component. The second time component is defined by a second time interval among the imaging times that is shorter than the first time interval. The display displays images based on the plurality of image data according to the correlation of the image data in the image groups.
    Type: Grant
    Filed: September 27, 2016
    Date of Patent: October 16, 2018
    Assignee: ZIOSOFT, INC.
    Inventors: Kenichiro Yasuhiro, Shinichiro Seo
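The grouping step (splitting image data into groups wherever the longer, first time interval separates consecutive imaging times) can be sketched as:

```python
def group_by_interval(imaging_times, first_interval):
    # start a new image group whenever the gap between consecutive
    # imaging times reaches the first (longer) time interval
    groups, current = [], [imaging_times[0]]
    for t in imaging_times[1:]:
        if t - current[-1] >= first_interval:
            groups.append(current)
            current = [t]
        else:
            current.append(t)
    groups.append(current)
    return groups
```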
  • Patent number: 10078690
    Abstract: A method for triggering an action on a second device is provided. It comprises the steps of: obtaining audio of multimedia content presented on a first device; comparing the obtained audio with reference audio data in a database; if the obtained audio is found in the database of reference audio, determining an action corresponding to the matched reference audio; and triggering the action on the second device.
    Type: Grant
    Filed: December 31, 2011
    Date of Patent: September 18, 2018
    Assignee: Thomson Licensing DTV
    Inventors: Jianfeng Chen, Xiaojun Ma, Zhigang Zhang
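The match-and-trigger flow can be sketched end to end; the toy energy fingerprint and the dictionary lookup stand in for whatever audio matching the patent actually uses:

```python
def fingerprint(samples, bins=4):
    # toy fingerprint: coarse per-chunk energy signature, rounded
    chunk = max(len(samples) // bins, 1)
    return tuple(round(sum(abs(s) for s in samples[i:i + chunk]), 1)
                 for i in range(0, chunk * bins, chunk))

def trigger(samples, reference_db, second_device):
    # match captured audio against the reference database; on a hit,
    # run the corresponding action on the second device
    action = reference_db.get(fingerprint(samples))
    if action is not None:
        second_device(action)
    return action
```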
  • Patent number: 10073828
    Abstract: Technology is described for refining a language model for a language recognition system based on aggregating and analyzing word tag metadata from multiple users of the language. The technology allows a user to mark a word or phrase in a selected language (e.g., as offensive or misspelled, or as a part of speech or other category), combines information collected from multiple users of the selected language, and updates the user's language model based on the combined information from multiple users of the selected language.
    Type: Grant
    Filed: February 27, 2015
    Date of Patent: September 11, 2018
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Christopher Breske, Ethan Bradford, David Field, Wendy Bannister
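Aggregating word tags across users before updating a shared language model might look like this; the quorum rule and data shapes are illustrative assumptions:

```python
from collections import Counter, defaultdict

def aggregate_tags(reports, quorum=2):
    # reports: (word, tag) pairs collected from many users of a language;
    # a tag is adopted into the shared model once enough users agree
    counts = defaultdict(Counter)
    for word, tag in reports:
        counts[word][tag] += 1
    adopted = {}
    for word, tags in counts.items():
        tag, n = tags.most_common(1)[0]
        if n >= quorum:
            adopted[word] = tag
    return adopted
```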
  • Patent number: 10074380
    Abstract: A method for performing speech enhancement using a Deep Neural Network (DNN) starts with training the DNN offline by exciting a microphone with a target training signal that includes a signal approximation of clean speech. A loudspeaker is driven with a reference signal and outputs a loudspeaker signal. The microphone then generates a microphone signal based on at least one of: a near-end speaker signal, an ambient noise signal, or the loudspeaker signal. An acoustic echo canceller (AEC) generates an AEC echo-cancelled signal based on the reference signal and the microphone signal. A loudspeaker signal estimator generates an estimated loudspeaker signal based on the microphone signal and the AEC echo-cancelled signal. The DNN receives the microphone signal, the reference signal, the AEC echo-cancelled signal, and the estimated loudspeaker signal, and generates a speech reference signal that includes signal statistics for residual echo or for noise.
    Type: Grant
    Filed: August 3, 2016
    Date of Patent: September 11, 2018
    Assignee: Apple Inc.
    Inventors: Jason Wung, Ramin Pishehvar, Daniele Giacobello, Joshua D. Atkins
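The abstract's signal flow into the DNN can be sketched on a per-sample basis; treating the AEC residual (microphone minus AEC output) as the loudspeaker estimate is a simplifying assumption for illustration:

```python
def dnn_inputs(mic, ref, aec_out):
    # the component the AEC removed from the microphone signal
    # approximates the loudspeaker contribution; the DNN is fed
    # all four signals per sample
    est_loudspeaker = [m - a for m, a in zip(mic, aec_out)]
    return list(zip(mic, ref, aec_out, est_loudspeaker))
```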
  • Patent number: 10062394
    Abstract: A system encourages experimentation with audio frequency and speaker technologies while causing an inanimate object to appear to lip-sync. The system applies a bandpass filter to an incoming audio stream to determine a magnitude of audio content in a frequency band of interest. For example, the system may filter results directed at the voice band, associated with speech. A controller controls a strobe light to flash at a particular point of travel of a platform reciprocating at a known frequency. An illusion is created that a sculpture, such as a piece of paper formed into a ring, is lip-synching to music.
    Type: Grant
    Filed: March 31, 2015
    Date of Patent: August 28, 2018
    Assignee: BOSE CORPORATION
    Inventor: Lee Zamir
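Measuring the magnitude of audio content in the voice band can be illustrated with a naive DFT sum over in-band bins; a real system would use an analog or IIR bandpass filter, and the band edges here are assumptions:

```python
import cmath, math

def band_magnitude(samples, rate, lo=300.0, hi=3400.0):
    # sum DFT bin magnitudes whose frequencies fall in the voice band
    n = len(samples)
    total = 0.0
    for k in range(1, n // 2):
        f = k * rate / n
        if lo <= f <= hi:
            coeff = sum(samples[i] * cmath.exp(-2j * math.pi * k * i / n)
                        for i in range(n))
            total += abs(coeff)
    return total
```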
  • Patent number: 10056076
    Abstract: According to some embodiments of the present invention there is provided a computerized method for speech processing using a Gaussian Mixture Model. The method comprises the action of receiving, by hardware processor(s), two or more covariance values representing relationships between distributions of speech coefficient values that represent two or more audible input speech signals recorded by a microphone. The method comprises the actions of computing two or more eigenvectors and eigenvalues using a principal component analysis of the covariance values, transforming the speech coefficient values using the eigenvectors, and computing two or more second covariance values from the transformed speech coefficient values. The method comprises the action of modifying some of the second covariance values according to the eigenvalues, the covariance values, and two or more indices of the speech coefficient values. The second covariance values are then output to the speech processor comprising the Gaussian Mixture Model.
    Type: Grant
    Filed: September 6, 2015
    Date of Patent: August 21, 2018
    Assignee: International Business Machines Corporation
    Inventor: Hagai Aronowitz
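The PCA step on the covariance values can be made concrete for the 2x2 case, where eigenvalues and eigenvectors have a closed form; this is a minimal sketch of the decomposition and projection, not the patented modification rule:

```python
import math

def pca_2x2(a, b, c):
    # eigenvalues and leading eigenvector of the symmetric
    # covariance matrix [[a, b], [b, c]] (closed form for 2x2)
    tr, det = a + c, a * c - b * b
    disc = math.sqrt(max(tr * tr / 4.0 - det, 0.0))
    l1, l2 = tr / 2.0 + disc, tr / 2.0 - disc
    if abs(b) > 1e-12:
        v1 = (b, l1 - a)          # (A - l1*I) v = 0  =>  v = (b, l1 - a)
    else:
        v1 = (1.0, 0.0) if a >= c else (0.0, 1.0)
    n = math.hypot(v1[0], v1[1])
    v1 = (v1[0] / n, v1[1] / n)
    return (l1, l2), v1

def project(coeffs, v1):
    # transform a 2-D speech-coefficient vector onto the leading component
    return coeffs[0] * v1[0] + coeffs[1] * v1[1]
```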
  • Patent number: 10019992
    Abstract: A device includes a plurality of components, a memory having a keyword recognition module and a context recognition module, a microphone configured to receive an input speech spoken by a user, an analog-to-digital converter configured to convert the input speech from an analog form to a digital form and generate a digitized speech, and a processor. The processor is configured to detect, using the keyword recognition module, a keyword in the digitized speech; initiate, in response to detecting the keyword by the keyword recognition module, an action to be taken by one of the plurality of components, wherein the keyword is associated with the action; determine, using the context recognition module, a context for the keyword; and execute the action if the context determined by the context recognition module indicates that the keyword is a command.
    Type: Grant
    Filed: June 29, 2015
    Date of Patent: July 10, 2018
    Assignee: Disney Enterprises, Inc.
    Inventors: Jill Fain Lehman, Samer Al Moubayed
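The two-stage flow (spot a keyword, then gate execution on its context) can be sketched as follows; the window of surrounding words and the `is_command` callback are illustrative assumptions:

```python
def handle_utterance(text, keyword_actions, is_command):
    # spot a keyword, then let the context recognizer decide
    # whether it was actually meant as a command
    words = text.lower().split()
    for i, w in enumerate(words):
        if w in keyword_actions:
            context = words[max(0, i - 3):i + 4]
            if is_command(context):
                return keyword_actions[w]
    return None
```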
  • Patent number: 10007478
    Abstract: A control system for a vehicle having a seat with a first moveable portion and an adjustment actuator coupled with the first moveable seat portion includes a voice input device, a touchscreen input device, and a controller in communication with the adjustment actuator, the voice input device, and the touchscreen input device. The controller has a processor programmed to interpret a first adjustment command from one of a voice command received from the voice input device and a manual command received from the touchscreen input device, carry out a first seat adjustment by causing the adjustment actuator to move the first moveable seat portion according to the first adjustment command, and to present information related to the first adjustment command on the touchscreen.
    Type: Grant
    Filed: June 26, 2015
    Date of Patent: June 26, 2018
    Assignee: Ford Global Technologies, LLC
    Inventors: Tony Wang, Frank Wu, Rose Tong
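The controller's behavior (accept a command from either input device, drive the actuator, and echo the result on the touchscreen) might be sketched like this, with both devices injected as callbacks; all names are assumptions:

```python
class SeatController:
    # accepts the same adjustment command from either the voice input
    # or the touchscreen, drives the actuator, and reports the result
    def __init__(self, actuator, display):
        self.actuator = actuator
        self.display = display

    def handle(self, seat_portion, amount):
        self.actuator(seat_portion, amount)
        self.display(f"{seat_portion} moved by {amount}")
```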
  • Patent number: 9990741
    Abstract: Motion correction is performed in time-of-flight (TOF) positron emission tomography (PET). Rather than applying motion correction to reconstructed images or as part of reconstruction, the motion correction is applied in the projection domain of the PET data: the TOF data from the PET scan is altered to account for the motion before reconstruction starts. The motion in the patient or image domain is forward projected to provide motion in the projection domain of the TOF data. The projected motion of each phase is applied to the TOF data from that phase, creating a combined dataset of motion-corrected TOF data representing the patient at a reference phase. The combined dataset is larger than what is available at any one phase of the physiological cycle (similar in size along the projection-data dimensions, but denser, with more counts per projection-data unit) and is then used in reconstruction.
    Type: Grant
    Filed: September 19, 2016
    Date of Patent: June 5, 2018
    Assignee: Siemens Medical Solutions USA, Inc.
    Inventor: Vladimir Y. Panin
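The pooling step (shift each phase's projection-domain events to the reference phase, then combine them into one denser dataset) reduces to something like the following sketch, where a single scalar shift per phase stands in for the forward-projected motion:

```python
def combine_phases(phase_events, phase_shifts):
    # phase_events: {phase: list of projection-domain TOF positions};
    # shift each phase's events to the reference phase, then pool them
    combined = []
    for phase, events in phase_events.items():
        shift = phase_shifts[phase]
        combined.extend(e + shift for e in events)
    return sorted(combined)
```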
  • Patent number: 9978370
    Abstract: One embodiment provides a method, including: receiving, from an audio capture device, speech input; converting, using a processor, the speech input to machine text; receiving, from an alternate input source, an input comprising at least one character; identifying, using a processor, a location associated with the machine text to insert the at least one character; and inserting, using a processor, the at least one character at the location identified. Other aspects are described and claimed.
    Type: Grant
    Filed: July 31, 2015
    Date of Patent: May 22, 2018
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: Song Wang, Jianbang Zhang, Ming Qian, Jian Li
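The insertion step (identify a location in the dictated machine text, then splice in the character from the alternate input source) can be sketched with the location logic injected as a callback; the `locate` function is an assumption:

```python
def insert_character(machine_text, char, locate):
    # locate() identifies where in the dictated text the character
    # from the alternate input source belongs
    i = locate(machine_text)
    return machine_text[:i] + char + machine_text[i:]
```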
  • Patent number: 9959863
    Abstract: A method, which is performed by an electronic device, for obtaining a speaker-independent keyword model of a keyword designated by a user is disclosed. The method may include receiving at least one sample sound from the user indicative of the keyword. The method may also generate a speaker-dependent keyword model for the keyword based on the at least one sample sound, send a request for the speaker-independent keyword model of the keyword to a server in response to generating the speaker-dependent keyword model, and receive the speaker-independent keyword model adapted for detecting the keyword spoken by a plurality of users from the server.
    Type: Grant
    Filed: September 8, 2014
    Date of Patent: May 1, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Minsub Lee, Taesu Kim, Sungrack Yun
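The device-side protocol (train a speaker-dependent model locally, then request the speaker-independent model from a server) can be sketched with the two backends injected as callbacks; all names are assumptions:

```python
def obtain_models(keyword, sample_sounds, train_local, fetch_remote):
    # train a speaker-dependent model on-device from the user's samples,
    # then request the matching speaker-independent model from a server
    sd_model = train_local(keyword, sample_sounds)
    si_model = fetch_remote(keyword)
    return sd_model, si_model
```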
  • Patent number: 9916846
    Abstract: A system and method for determining an amount of speech in an audio signal may include for example: obtaining segments of the audio signal, wherein the segments are grouped into blocks; for each one of the segments, calculating a segment value indicative of an amplitude of the audio signal of a respective segment; for each one of the blocks calculating a block value indicative of the amplitude of the audio signal of a respective block; and calculating an audio signal speech grade based on segment values and block values, wherein the audio signal speech grade is indicative of the amount of speech in the audio signal.
    Type: Grant
    Filed: February 10, 2015
    Date of Patent: March 13, 2018
    Assignee: NICE LTD.
    Inventors: Frits Lassche, Ivar Meijer, Victor Bastiaan Mosch, Steven St. John Logan, Jurgen Willem Wessel, Gerardus B. J. Stam
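The segment/block structure of the speech grade can be sketched end to end; using mean absolute amplitude for the values and "fraction of segments louder than their block average" for the grade are illustrative assumptions:

```python
def speech_grade(signal, seg_len=4, segs_per_block=2):
    # segment value: mean absolute amplitude per segment; block value:
    # mean of that block's segment values; grade: fraction of segments
    # louder than their block average (speech fluctuates, steady noise doesn't)
    segs = [signal[i:i + seg_len] for i in range(0, len(signal), seg_len)]
    seg_vals = [sum(abs(s) for s in seg) / len(seg) for seg in segs]
    block_vals = []
    for i in range(0, len(seg_vals), segs_per_block):
        block = seg_vals[i:i + segs_per_block]
        block_vals.append(sum(block) / len(block))
    hits = sum(1 for i, v in enumerate(seg_vals)
               if v > block_vals[i // segs_per_block])
    return hits / len(seg_vals)
```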
  • Patent number: 9911416
    Abstract: A method for controlling an electronic device in response to speech spoken by a user is disclosed. The method may include receiving an input sound by a sound sensor. The method may also detect the speech spoken by the user in the input sound, determine first characteristics of a first frequency range and second characteristics of a second frequency range of the speech in response to detecting the speech in the input sound, and determine whether a direction of departure of the speech spoken by the user is toward the electronic device based on the first and second characteristics.
    Type: Grant
    Filed: March 27, 2015
    Date of Patent: March 6, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Sungrack Yun, Taesu Kim, Duck Hoon Kim, Kyuwoong Hwang
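One way to read the two-band comparison the abstract describes: speech aimed at the device retains relatively more high-frequency energy, while off-axis speech loses it faster. The energy-ratio test below is an assumption used to illustrate that idea:

```python
def facing_device(low_band_energy, high_band_energy, ratio=0.5):
    # compare high-band to low-band energy; a high ratio suggests the
    # direction of departure of the speech is toward the device
    if low_band_energy <= 0:
        return False
    return high_band_energy / low_band_energy >= ratio
```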