Detect Speech In Noise Patents (Class 704/233)
  • Patent number: 10354660
    Abstract: An endpoint device receives a sequence of audio frames. The endpoint device determines for each audio frame a respective importance level among possible importance levels ranging from a low importance level to a high importance level based on content in the audio frame indicative of the respective importance level. The endpoint device associates each audio frame with the respective importance level, to produce different subsets of audio frames associated with respective ones of different importance levels. The endpoint device, for each subset of audio frames, applies forward error correction to a fraction of audio frames in the subset of audio frames, wherein the fraction increases as the importance level of the audio frames in the subset increases, and does not apply forward error correction to remaining audio frames in the subset.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: July 16, 2019
    Assignee: Cisco Technology, Inc.
    Inventors: Ahmed Badr, Ashish J. Khisti, Wai-tian Tan, Michael A. Ramalho, John G. Apostolopoulos
  • Patent number: 10343287
    Abstract: A robot voice direction-seeking turning system and method. The robot voice direction-seeking turning system employs a voice activity detection unit (1) that detects a received voice signal to determine whether or not a voice signal transmitted by a user (S2) is present; a direction-seeking angle for the voice signal is calculated by a voice direction-seeking unit (3), and a voice direction-seeking turning unit (4) is employed to drive a robot to turn towards the direction of the sound source of the voice signal on the basis of a direction-seeking angle (S4). Employment of the robot voice direction-seeking turning method allows accurate acquisition of a valid voice signal transmitted by the user, thus increasing signal-to-noise ratio and the accuracy of voice recognition.
    Type: Grant
    Filed: June 14, 2016
    Date of Patent: July 9, 2019
    Assignee: YUTOU TECHNOLOGY (HANGZHOU) CO., LTD.
    Inventors: Xin Liu, Peng Gao, Lichun Fan, Jiaqi Shi, Peng Cai, Mingjun Cai
  • Patent number: 10347273
    Abstract: A speech processing apparatus includes: an expectation value calculation unit configured to calculate, using an input signal spectrum and a speech model that models a feature quantity of speech, a spectrum expectation value which is an expectation value of a spectrum of an acoustic component included in the input signal spectrum; and an acoustic power estimation unit configured to estimate an acoustic power of the acoustic component of the input signal spectrum based on the input signal spectrum and the spectrum expectation value.
    Type: Grant
    Filed: December 8, 2015
    Date of Patent: July 9, 2019
    Assignee: NEC CORPORATION
    Inventors: Shuji Komeiji, Masanori Tsujikawa, Ryosuke Isotani
  • Patent number: 10331403
    Abstract: An audio input system includes an audio input apparatus and a plurality of electronic devices. The audio input apparatus includes an audio input unit, a start instruction accepting unit, and an instruction transmitter. The plurality of electronic devices includes a first electronic device. The first electronic device is a target of a process execution instruction based on the audio input by the audio input unit among the instructions. The instruction transmitter transmits a process reducing instruction for reducing an execution of a process as the instruction to the plurality of electronic devices, when an input sound volume of a microphone exceeds a specific sound volume. The audio input unit starts the audio input when the input sound volume of the microphone is equal to or less than the specific sound volume. The instruction transmitter transmits the process execution instruction to the first electronic device.
    Type: Grant
    Filed: February 28, 2018
    Date of Patent: June 25, 2019
    Assignee: Kyocera Document Solutions Inc.
    Inventor: Kazuki Dozen
  • Patent number: 10325612
    Abstract: A method and apparatus that filters audio data received from a speaking person that includes a specific filter for that speaker. The audio characteristics of the speaker's voice may be collected and the specific filter may be formed to reduce noise while also enhancing voice quality. For instance, if a speaker's voice does not contain specific frequencies, then a filter may cancel the noise at such frequencies to ease noise cancellation and reduce processing sound spectrum for cleaning that is not needed. Additionally, the strength frequencies of a speaker's voice may be identified from the collected audio characteristics and those spectrums can be filtered with finer granularity to provide a speaker specific filter that enhances the voice quality of the speaker's voice data that is transmitted or output by a communication device. The audio data may also be output based upon a user's predefined hearing spectrum.
    Type: Grant
    Filed: August 1, 2017
    Date of Patent: June 18, 2019
    Assignee: Unify GmbH & Co. KG
    Inventors: Bizhan Karimi-Cherkandi, Farrokh Mohammadzadeh Kouchri, Schah Walli Ali
  • Patent number: 10319377
    Abstract: A method and system are provided for estimating clean speech parameters from noisy speech parameters. The method is performed by acquiring speech signals, estimating noise from the acquired speech signals, computing speech features from the acquired speech signals, estimating model parameters from the computed speech features and estimating clean parameters from the estimated noise and the estimated model parameters.
    Type: Grant
    Filed: February 28, 2017
    Date of Patent: June 11, 2019
    Assignee: Tata Consultancy Services Limited
    Inventors: Ashish Panda, Sunil Kumar Kopparapu
  • Patent number: 10319393
    Abstract: When an instruction to start voice input is received from the user, a gain controller acquires, from a gain table which defines a correspondence between vehicle speed ranges and gains, a gain corresponding to a vehicle speed range including the vehicle speed of a vehicle detected by a vehicle speed detector, and sets the acquired gain as the gain of an input amplifier that amplifies an input audio signal output by a microphone. As a gain corresponding to each vehicle speed range, the gain table records a gain of the input amplifier corresponding, in an experimentally determined frequency distribution of peak values in the vehicle speed range, to a maximum frequency in the range of magnitude of voice output as an input audio signal by the microphone and to be input to a speech recognition engine as voice having a magnitude within the input range of the speech recognition engine.
    Type: Grant
    Filed: July 27, 2016
    Date of Patent: June 11, 2019
    Assignee: ALPINE ELECTRONICS, INC.
    Inventors: Hirokazu Suzuki, Toru Marumoto
  • Patent number: 10319373
    Abstract: An information processing device includes a phonetic converting unit, an HMM converting unit, and a searching unit. The phonetic converting unit converts a phonetic symbol sequence into a hidden Markov model (HMM) state sequence in which states of an HMM are aligned. The HMM converting unit converts the HMM state sequence into a score vector sequence indicating the degree of similarity to a specific pronunciation using a similarity matrix defining the similarity between the states of the HMM. The searching unit searches for a path having a better score for the score vector sequence than that of the other paths out of paths included in a search network and outputs a phonetic symbol sequence corresponding to the retrieved path.
    Type: Grant
    Filed: December 23, 2016
    Date of Patent: June 11, 2019
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Manabu Nagao
  • Patent number: 10297251
    Abstract: An automatic speech recognition system for a vehicle includes a controller configured to select an acoustic model from a library of acoustic models based on ambient noise in a cabin of the vehicle and operating parameters of the vehicle. The controller is further configured to apply the selected acoustic model to noisy speech to improve recognition of the speech.
    Type: Grant
    Filed: January 21, 2016
    Date of Patent: May 21, 2019
    Assignee: Ford Global Technologies, LLC
    Inventors: Ali Hassani, Scott Andrew Amman, Francois Charette, Brigitte Frances Mora Richardson, Gintaras Vincent Puskorius, An Ji, Ranjani Rangarajan, John Edward Huber
  • Patent number: 10297283
    Abstract: A computer-readable medium, a controller, and a method of automatically recording a sound signal are provided. A sound signal is received by the controller from a sound generating device. A frequency of the received sound signal is determined by the controller. When the determined frequency is within a predetermined frequency range, the controller starts recording the received sound signal.
    Type: Grant
    Filed: December 30, 2014
    Date of Patent: May 21, 2019
    Assignee: Gibson Brands, Inc.
    Inventor: Shota Terai
  • Patent number: 10290294
    Abstract: An information handling system includes a processor configured to operate in one of a plurality of power states. An audio circuit measures an ambient audio environment within the information handling system, classifies the measured ambient audio into one of a plurality of categories, and implements a power management policy for the processor in response to the measured ambient audio being classified into the one of the categories.
    Type: Grant
    Filed: November 9, 2017
    Date of Patent: May 14, 2019
    Assignee: Dell Products, LP
    Inventors: Ray V. Kacelenga, Merle J. Wood, III, Travis C. North
  • Patent number: 10284724
    Abstract: According to embodiments of the present invention, various computer implemented methods are provided for generating context sensitive alerts for the purposes of fraud detection and prevention. According to one embodiment, a communication (e.g., a call) is received at (or initiated by) a workstation by a human representative or agent. Verbal (voice) communication between the agent and the third party customer is monitored and converted into text. The agent's activity in applications executing in the workstation is also tracked. The combination of converted text and the activity of the agent is evaluated in a behavior engine to detect inconsistencies between the activity of the agent and authorized or typical activity. When inconsistencies are detected, security actions are performed to alert administrators of potential fraud and, in extreme cases, further action by the agent may be prevented in the workstation to prevent additional fraud.
    Type: Grant
    Filed: April 5, 2017
    Date of Patent: May 7, 2019
    Assignee: TELEPERFORMANCE SE
    Inventor: Lyle Hardy
  • Patent number: 10276155
    Abstract: A user media device may include a microphone array and a communication interface. The microphone array may include an omnidirectional microphone and a directional microphone. The microphone array may be selectively switchable. The communication interface may communicatively couple the user media device with a computer and may transmit audio captured by the microphone array to the computer for transfer to a remote service. The remote service may generate text of the processed audio via natural language processing. The remote service may further perform semantic reasoning of the processed audio via a semantic reasoning engine. The remote service may also generate content based at least in part on the semantic reasoning performed on the processed audio. The curated content may include a report having results of the semantic reasoning organized to demonstrate the results in a meaningful way with respect to the processed audio.
    Type: Grant
    Filed: December 22, 2016
    Date of Patent: April 30, 2019
    Assignee: FUJITSU LIMITED
    Inventor: James Montantes
  • Patent number: 10264352
    Abstract: Techniques herein provide wireless energy transfer to audio devices such as headphones, headsets, hearing aids, and the like. Audio devices are integrated with a device resonator. The device resonator may be positioned and oriented to reduce interaction with lossy or sensitive components of the audio device. A repeater resonator and/or a source resonator is integrated into a headrest of a seat or a chair providing continuous power to the headphones while in use. The audio devices may be recharged wirelessly when positioned near source resonators that may be embedded in pads, tables, carrying cases, cups, and the like.
    Type: Grant
    Filed: January 9, 2017
    Date of Patent: April 16, 2019
    Assignee: WiTricity Corporation
    Inventors: Steven J. Ganem, Hiroshi A. Mendoza, Morris P. Kesler, Konrad J. Kulikowski, Andre B. Kurs, Alexander P. McCauley, Eric R. Giler, Katherine L. Hall, Gozde Guckaya
  • Patent number: 10255487
    Abstract: A speech determiner determines whether or not a target individual is speaking when facial images of the target individual are captured. An emotion estimator estimates the emotion of the target individual using the facial images of the target individual, on the basis of the determination results of the speech determiner.
    Type: Grant
    Filed: September 26, 2016
    Date of Patent: April 9, 2019
    Assignee: CASIO COMPUTER CO., LTD.
    Inventors: Takashi Yamaya, Kouichi Nakagome, Katsuhiko Satoh
  • Patent number: 10250975
    Abstract: A hands-free audio device has a receive path and a transmit path, which may operate at different audio sampling rates. The transmit path has an interference suppressor that receives a reference signal from the receive path and that suppresses interference in microphone signals received from a microphone array. The interference suppressor is followed in the transmit path by a multi-channel adaptive beamformer that produces a plurality of directional audio signals. A beam selector is configured to select one of the directional audio signals based on voice activity, echo detection, and signal energy.
    Type: Grant
    Filed: October 6, 2017
    Date of Patent: April 2, 2019
    Assignee: Amazon Technologies, Inc.
    Inventor: Jun Yang
  • Patent number: 10249299
    Abstract: Techniques for tailoring beamforming techniques to environments such that processing resources may be devoted to a portion of an audio signal corresponding to a lobe of a beampattern that is most likely to contain user speech. The techniques take into account both acoustic characteristics of an environment and heuristics regarding lobes that have previously been found to include user speech.
    Type: Grant
    Filed: May 1, 2017
    Date of Patent: April 2, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Gregory Michael Hart, Kavitha Velusamy, William Spencer Worley, III
  • Patent number: 10250960
    Abstract: A sound reproduction device includes a signal processing chain configured to render an acoustic useful signal for reproduction to a listener, a simulation scenario processor configured to provide auditory scenario information for a simulated auditory scenario, the simulated auditory scenario influencing perception, by the listener, of the reproduction of the useful signal and/or defining a useful signal type, a user interface configured to detect reproduction parameter settings from a user which represent an individual preference of the listener in view of the simulated auditory scenario, a signal modifier configured to receive the reproduction parameter settings and modify reproduction of the useful signal in dependence on the reproduction parameter settings, and a storage provided for storing the reproduction parameter setting and the auditory scenario information relative to one another.
    Type: Grant
    Filed: May 12, 2016
    Date of Patent: April 2, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Jens Ekkehart Appell, Jan Rennies-Hochmuth
  • Patent number: 10244113
    Abstract: Methods and apparatuses are described for determining customer service quality through digitized voice characteristic measurement and filtering. A voice analysis module captures a first digitized voice segment corresponding to speech submitted by a user of a remote device. The voice analysis module extracts a first set of voice features from the first voice segment, and determines an emotion level of the user based upon the first set of voice features. The voice analysis module captures a second digitized voice segment corresponding to speech submitted by the user. The voice analysis module extracts a second set of voice features from the second voice segment, and determines a change in the emotion level of the user by comparing the first set of voice features to the second set of voice features. The module normalizes the change in the emotion level of the user using emotion influence factors, and generates a service score.
    Type: Grant
    Filed: April 26, 2016
    Date of Patent: March 26, 2019
    Assignee: FMR LLC
    Inventors: Jason Kao, Xinxin Sheng, Bahram Omidfar, Erkang Zheng
  • Patent number: 10242677
    Abstract: Various implementations disclosed herein include a training module configured to determine a set of detection normalization threshold values associated with speaker dependent voiced sound pattern (VSP) detection. In some implementations, a method includes obtaining segment templates characterizing a concurrent segmentation of a first subset of a plurality of vocalization instances of a VSP, wherein each segment template provides a stochastic characterization of how a particular portion of the VSP is vocalized by a particular speaker; generating a noisy segment matrix using a second subset of the plurality of vocalization instances of the VSP, wherein the noisy segment matrix includes one or more noisy copies of segment representations of the second subset; scoring segments from the noisy segment matrix against the segment templates; and determining detection normalization threshold values at two or more known SNR levels for at least one particular noise type based on a function of the scoring.
    Type: Grant
    Filed: August 25, 2015
    Date of Patent: March 26, 2019
    Assignee: MALASPINA LABS (BARBADOS), INC.
    Inventor: Alexander Escott
  • Patent number: 10235128
    Abstract: An embodiment of a contextual sound apparatus may include a sound identifier to identify a sound, a context identifier to identify a context, and an action identifier communicatively coupled to the sound identifier and the context identifier to identify an action based on the identified sound and the identified context. Other embodiments are disclosed and claimed.
    Type: Grant
    Filed: May 19, 2017
    Date of Patent: March 19, 2019
    Assignee: Intel Corporation
    Inventors: Robert L. Vaughn, James B. Eynard
  • Patent number: 10235995
    Abstract: A user media device may include a microphone array and a communication interface. The microphone array may include an omnidirectional microphone and a directional microphone. The microphone array may be selectively switchable. The communication interface may communicatively couple the user media device with a computer and may transmit audio captured by the microphone array to the computer for transfer to a remote service. The remote service may generate text of the processed audio via natural language processing. The remote service may further perform semantic reasoning of the processed audio via a semantic reasoning engine. The remote service may also generate content based at least in part on the semantic reasoning performed on the processed audio. The curated content may include a report having results of the semantic reasoning organized to demonstrate the results in a meaningful way with respect to the processed audio.
    Type: Grant
    Filed: December 22, 2016
    Date of Patent: March 19, 2019
    Assignee: FUJITSU LIMITED
    Inventor: James Montantes
  • Patent number: 10229686
    Abstract: Methods and apparatus to process microphone signals by a speech enhancement module to generate an audio stream signal including first and second metadata for use by a speech recognition module. In an embodiment, speech recognition is performed using endpointing information including transitioning from a silence state to a maybe speech state, in which data is buffered, based on the first metadata and transitioning to a speech state, in which speech recognition is performed, based upon the second metadata.
    Type: Grant
    Filed: August 18, 2014
    Date of Patent: March 12, 2019
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Markus Buck, Tobias Herbig, Simon Graf, Christophe Ris
  • Patent number: 10231070
    Abstract: A voice input exception determining method, an apparatus, a terminal, and a storage medium are provided. The method is applied to an electronic device including an audio collection module, and includes determining whether an amplitude value of an audio signal collected by the audio collection module is less than a preset amplitude threshold and/or whether energy distribution of the audio signal meets a preset condition; and if the amplitude value of the audio signal is less than the preset amplitude threshold and/or the energy distribution of the audio signal does not meet the preset condition, determining that voice input of the electronic device is abnormal. A solution provided in the present disclosure resolves a problem that there is no effective method for determining a sound reception exception caused when a sound reception hole of the electronic device is blocked.
    Type: Grant
    Filed: April 29, 2016
    Date of Patent: March 12, 2019
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Lin Yang, Zhaoyang Yin, Jingwen Yang
  • Patent number: 10222447
    Abstract: A method for locating a sound source by maximizing a directed response strength calculated for a plurality of vectors of the interauricular time differences forming a set comprises: a first subset of vectors compatible with sound signals from a single sound source at an unlimited distance from the microphones; and a second subset of vectors not compatible with sound signals from a single sound source at an unlimited distance from the microphones. Each vector of the first subset is associated with a direction for locating the corresponding single sound source, and each vector of the second subset is associated with the locating direction of a vector of the first subset closest thereto according to a predefined metric. A humanoid robot including: a set of at least three microphones, arranged on a surface higher than the head thereof; and a processor for implementing one such method is provided.
    Type: Grant
    Filed: September 29, 2014
    Date of Patent: March 5, 2019
    Assignee: SOFTBANK ROBOTICS EUROPE
    Inventor: Grégory Rump
  • Patent number: 10217477
    Abstract: An electronic device and a speech recognition method that is capable of adjusting an end-of-utterance detection period dynamically are disclosed. The electronic device includes a microphone, a display, an input device formed as a part of the display or connected to the electronic device as a separate device, a processor electrically connected to the microphone, the display, and the input device, and a memory electrically connected to the processor. The memory stores instructions, executable by the processor, for receiving an utterance input by a user through the microphone, converting the utterance to text comprised of a series of words or phrases with spaces, displaying the text on the display, the text comprising at least one space formed at an incorrect position, and receiving a user input for updating a predetermined time period through the input device.
    Type: Grant
    Filed: October 31, 2016
    Date of Patent: February 26, 2019
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sungwoon Jang, Sangwook Shin, Sungwan Youn
  • Patent number: 10210857
    Abstract: A method of controlling an audio system comprises: receiving an audio signal, and applying a first gain to the audio signal and outputting an amplified audio signal. On receiving a user input to increase the first gain applied to the audio signal, if the first gain is at a first threshold value, the method comprises: receiving an ambient noise signal, processing the ambient noise signal with a second gain value and outputting a noise cancellation signal, and changing the second gain value in response to the user input.
    Type: Grant
    Filed: October 18, 2017
    Date of Patent: February 19, 2019
    Assignee: Cirrus Logic, Inc.
    Inventors: Nigel Burgess, Mark Allan Watts, Darren Holding
  • Patent number: 10204619
    Abstract: Methods, systems, and apparatus are described that receive audio data for an utterance. Association data is accessed that indicates associations between data corresponding to uncorrupted audio segments, and data corresponding to corrupted versions of the uncorrupted audio segments, where the associations are determined before receiving the audio data for the utterance. Using the association data and the received audio data for the utterance, data corresponding to at least one uncorrupted audio segment is selected. A transcription of the utterance is determined based on the selected data corresponding to the at least one uncorrupted audio segment.
    Type: Grant
    Filed: February 22, 2016
    Date of Patent: February 12, 2019
    Assignee: Google LLC
    Inventors: Olivier Siohan, Pedro J. Moreno Mengibar
  • Patent number: 10192547
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for approximating an accent source. A system practicing the method collects data associated with customer specific services, generates country-specific or dialect-specific weights for each service in the customer specific services list, generates a summary weight based on an aggregation of the country-specific or dialect-specific weights, and sets an interactive voice response system language model based on the summary weight and the country-specific or dialect-specific weights. The interactive voice response system can also change the user interface based on the interactive voice response system language model. The interactive voice response system can tune a voice recognition algorithm based on the summary weight and the country-specific weights. The interactive voice response system can adjust phoneme matching in the language model based on a possibility that the speaker is using other languages.
    Type: Grant
    Filed: April 21, 2016
    Date of Patent: January 29, 2019
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventor: Nicholas Duffield
  • Patent number: 10194259
    Abstract: Various implementations include wearable audio devices and related methods for controlling such devices. In some particular implementations, a computer-implemented method of controlling a wearable audio device includes: receiving an initiation command to initiate a spatial audio mode; providing a plurality of audio samples corresponding with spatially delineated zones in an array defined relative to a physical position of the wearable audio device, in response to the initiation command, where each audio sample is associated with a source of audio content; receiving a selection command selecting one of the plurality of audio samples; and initiating playback of the source of audio content associated with the selected audio sample.
    Type: Grant
    Filed: February 28, 2018
    Date of Patent: January 29, 2019
    Assignee: BOSE CORPORATION
    Inventors: Keith Dana Martin, Todd Richard Reily, Mark Raymond Blewett, Daniel M. Gauger, Jr.
  • Patent number: 10186263
    Abstract: Speech recognition of a stream of spoken utterances is initiated. Thereafter, a spoken utterance stop event to stop the speech recognition is detected, such as in relation to the stream. The spoken utterance stop event is other than a pause or cessation in the stream of spoken utterances. In response to the spoken utterance stop event being detected, the speech recognition of the stream of spoken utterances is stopped, while the stream of spoken utterances continues. After the speech recognition of the stream of spoken utterances has been stopped, an action is caused to be performed that corresponds to the spoken utterances from a beginning of the stream through and until the spoken utterance stop event.
    Type: Grant
    Filed: August 30, 2016
    Date of Patent: January 22, 2019
    Assignee: Lenovo Enterprise Solutions (Singapore) PTE. LTD.
    Inventors: Amy Leigh Rose, John Scott Crowe, Gary David Cudak, Jennifer J. Lee-Baron, Nathan J. Peterson, Bryan L. Young
  • Patent number: 10182787
    Abstract: The present invention relates to systems and methods for characterizing at least one anatomical parameter of an upper airway of a patient by analysing spectral properties of an utterance, comprising: a mechanical coupler comprising means for restricting the jaw position of the patient; means for recording an utterance; and processing means for determining at least one anatomical parameter of the upper airway from the recorded utterance and comparing the recorded utterance to a threshold value. In addition the present invention relates to the use of the above mentioned systems as a diagnostics tool for assessing obstructive sleep apnea.
    Type: Grant
    Filed: October 11, 2012
    Date of Patent: January 22, 2019
    Assignee: KONINKLIJKE PHILIPS N.V.
    Inventors: Stijn De Waele, Stefan Winter, Alexander Cornelis Geerlings
  • Patent number: 10178113
    Abstract: Systems, methods, and media for generating sanitized data, sanitizing anomaly detection models, and generating anomaly detection models are provided. In some embodiments, methods for sanitizing anomaly detection models are provided. The methods including: receiving at least one abnormal anomaly detection model from at least one remote location; comparing at least one of the at least one abnormal anomaly detection model to a local normal detection model to produce a common set of features common to both the at least one abnormal anomaly detection model and the local normal detection model; and generating a sanitized normal anomaly detection model by removing the common set of features from the local normal detection model.
    Type: Grant
    Filed: July 13, 2015
    Date of Patent: January 8, 2019
    Assignee: The Trustees of Columbia University in the City of New York
    Inventors: Gabriela F. Ciocarlie, Angelos Stavrou, Salvatore J. Stolfo, Angelos D. Keromytis
  • Patent number: 10134395
    Abstract: Techniques for providing virtual assistants to assist users during a voice communication between the users. For instance, a first user operating a device may establish a voice communication with respective devices of one or more additional users, such as with a device of a second user. For instance, the first user may utilize her device to place a telephone call to the device of the second user. A virtual assistant may also join the call and, upon invocation by a user on the call, may identify voice commands from the call and may perform corresponding tasks for the users in response.
    Type: Grant
    Filed: September 25, 2013
    Date of Patent: November 20, 2018
    Assignee: Amazon Technologies, Inc.
    Inventor: Marcello Typrin
  • Patent number: 10123221
    Abstract: A communication device can be configured to estimate a Reference-Signal-Received-Power (RSRP). The communication device can include a transceiver and a controller. The transceiver can be configured to downsample a received signal having a plurality of reference signal resource elements to generate a downsampled signal. The downsampling can alias a first of the plurality of reference signal resource elements into a second of the plurality of reference signal resource elements to generate an aliased reference signal resource element. The transceiver can also be configured to extract the aliased reference signal resource element from the downsampled signal. The controller can be connected to the transceiver and be configured to estimate the RSRP based on the extracted aliased reference signal resource element.
    Type: Grant
    Filed: September 23, 2016
    Date of Patent: November 6, 2018
    Assignee: Intel IP Corporation
    Inventors: Matthew Hayes, Denis Markovic, Viswanath Vajepeyazula
  • Patent number: 10115019
    Abstract: A video may be categorized into a picture category or a video category. A key frame of the video includes a face and a face feature in the key frame is obtained. Face features respectively associated with a plurality of picture categories are acquired and the video is assigned to one of the picture categories based on a comparison of the key frame face feature and the face features of the picture categories. Videos may first be associated with a video category by comparing key frame face features from the videos, and then the video category may be assigned to a picture category based on comparison of a video category face feature with a plurality of picture category face features. Alternatively, a video may be assigned to a picture category based on matching capture times and capture locations between the video and a reference picture in the picture category.
    Type: Grant
    Filed: August 19, 2016
    Date of Patent: October 30, 2018
    Assignee: Xiaomi Inc.
    Inventors: Zhijun Chen, Wendi Hou, Fei Long
  • Patent number: 10109277
    Abstract: Methods and apparatus for using visual information to facilitate a speech recognition process are described. The method comprises dividing received audio information into a plurality of audio frames; determining, for each of the plurality of audio frames, whether the audio information in the audio frame comprises speech from a foreground speaker, wherein the determining is based, at least in part, on received visual information; and transmitting the audio frame to an automatic speech recognition (ASR) engine for speech recognition when it is determined that the audio frame comprises speech from the foreground speaker.
    Type: Grant
    Filed: April 27, 2015
    Date of Patent: October 23, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Etienne Marcheret, Josef Vopicka, Vaibhava Goel
  • Patent number: 10102850
    Abstract: A speech recognition system utilizes automatic speech recognition techniques, such as end-pointing techniques, in conjunction with beamforming and/or signal processing to isolate speech from one or more speaking users from multiple received audio signals and to detect the beginning and/or end of the speech based at least in part on the isolation. Audio capture devices such as microphones may be arranged in a beamforming array to receive the multiple audio signals. Multiple audio sources including speech may be identified in different beams and processed.
    Type: Grant
    Filed: February 25, 2013
    Date of Patent: October 16, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Kenneth John Basye, Jeffrey Penrod Adams
  • Patent number: 10090005
    Abstract: According to some embodiments, an analog processing portion may receive an audio signal from a microphone. The analog processing portion may then convert the audio signal into sub-band signals and estimate an energy statistic value, such as a Signal-to-Noise Ratio (“SNR”) value, for each sub-band signal. A classification element may classify the estimated energy statistic values with analog processing such that a wakeup signal is generated when voice activity is detected. The wakeup signal may be associated with, for example, a battery-powered, always-listening audio application.
    Type: Grant
    Filed: March 10, 2017
    Date of Patent: October 2, 2018
    Assignee: Aspinity, Inc.
    Inventors: Brandon David Rumberg, David W. Graham
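The sub-band classification idea in the abstract above can be illustrated in software. The following is only a minimal digital sketch, not the patented analog circuit: it assumes per-band signal energies and noise-floor estimates are already available, and the function names, the 6 dB threshold, and the `min_bands` rule are all hypothetical choices made for illustration.

```python
import math


def band_snr_db(signal_energy, noise_energy, eps=1e-12):
    """Return the signal-to-noise ratio of one sub-band in decibels.

    eps guards against division by zero and log of zero (an assumption
    of this sketch, not something stated in the abstract).
    """
    return 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))


def should_wake(band_energies, noise_floors, snr_threshold_db=6.0, min_bands=2):
    """Decide whether to raise a wakeup signal.

    Hypothetical rule: wake when at least `min_bands` sub-bands have an
    SNR at or above `snr_threshold_db`.
    """
    snrs = [band_snr_db(s, n) for s, n in zip(band_energies, noise_floors)]
    active_bands = sum(1 for snr in snrs if snr >= snr_threshold_db)
    return active_bands >= min_bands


if __name__ == "__main__":
    noise = [0.1, 0.1, 0.1]
    print(should_wake([1.0, 1.0, 0.1], noise))  # two bands well above the noise floor
    print(should_wake([0.1, 0.1, 0.1], noise))  # all bands at the noise floor
```

Counting active bands rather than summing energy is one simple way to keep a single loud narrowband noise from triggering a wakeup; a real classifier could weight bands differently.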
  • Patent number: 10088893
    Abstract: According to some embodiments, a sensor network may be provided with re-programmable and/or reconfigurable analog circuitry configured to monitor data collected by the sensor network. The re-programmable and/or reconfigurable analog circuitry may also generate a wakeup signal in response to a defined wakeup event detected by the sensor network.
    Type: Grant
    Filed: May 12, 2017
    Date of Patent: October 2, 2018
    Inventors: Vinod Kulathumani, David W. Graham, Brandon David Rumberg
  • Patent number: 10083687
    Abstract: Disclosed are systems, methods, and computer readable media for identifying an acoustic environment of a caller. The method embodiment comprises analyzing acoustic features of a received audio signal from a caller, receiving meta-data information based on a previously recorded time and speed of the caller, classifying a background environment of the caller based on the analyzed acoustic features and the meta-data, selecting an acoustic model matched to the classified background environment from a plurality of acoustic models, and performing speech recognition on the received audio signal using the selected acoustic model.
    Type: Grant
    Filed: October 16, 2017
    Date of Patent: September 25, 2018
    Assignee: Nuance Communications, Inc.
    Inventor: Mazin Gilbert
  • Patent number: 10084475
    Abstract: An improved mixed oscillator-and-external excitation model and methods for estimating the model parameters, for evaluating model quality, and for combining it with methods known in the art are disclosed. The improvement over existing oscillators allows the model to receive, as an input, all except the most recent point in the acquired data. Model stability is achieved through a process which includes restoring data unavailable to the decoder from the optimal model parameters and using metrics to select a stable restored model output. The present invention is effective for very low bit-rate coding/compression and decoding/decompression of digital signals, including digitized speech, audio, and image data, and for analysis, detection, and classification of signals. Operations can be performed in real time, and parameterization can be achieved at a user-specified level of compression.
    Type: Grant
    Filed: October 28, 2011
    Date of Patent: September 25, 2018
    Inventors: Irina Gorodnitsky, Anton Yen
  • Patent number: 10083710
    Abstract: A voice control system including a voice receiving unit, an image capturing unit, a storage unit and a control unit is disclosed. The voice receiving unit receives a voice. The image capturing unit captures a video image stream including several human face images. The storage unit stores the voice and the video image stream. The control unit is electrically connected to the voice receiving unit, the image capturing unit and the storage unit. The control unit detects a feature of a human face from the human face images, defines a mouth motion detection region from the feature of the human face, and generates a control signal according to a variation of the mouth motion detection region and a variation of the voice over time. A voice control method, a computer program product and a computer readable medium are also disclosed.
    Type: Grant
    Filed: May 17, 2016
    Date of Patent: September 25, 2018
    Assignee: BXB Electronics Co., Ltd.
    Inventors: Kai-Sheng Chiou, Chih-Lin Hung, Chung-Nan Lee, Chao-Wen Wu
  • Patent number: 10078690
    Abstract: A method is provided for triggering an action on a second device. It comprises the steps of obtaining audio of a multimedia content presented on a first device; comparing the obtained audio with reference audio data in a database; if the obtained audio is found in the database containing reference audio, determining an action corresponding to the matched reference audio; and triggering the action in the second device.
    Type: Grant
    Filed: December 31, 2011
    Date of Patent: September 18, 2018
    Assignee: Thomson Licensing DTV
    Inventors: Jianfeng Chen, Xiaojun Ma, Zhigang Zhang
  • Patent number: 10074360
    Abstract: This relates to providing an indication of the suitability of an acoustic environment for performing speech recognition. One process can include receiving an audio input and determining a speech recognition suitability based on the audio input. The speech recognition suitability can include a numerical, textual, graphical, or other representation of the suitability of an acoustic environment for performing speech recognition. The process can further include displaying a visual representation of the speech recognition suitability to indicate the likelihood that a spoken user input will be interpreted correctly. This allows a user to determine whether to proceed with the performance of a speech recognition process, or to move to a different location having a better acoustic environment before performing the speech recognition process.
    Type: Grant
    Filed: August 24, 2015
    Date of Patent: September 11, 2018
    Assignee: Apple Inc.
    Inventor: Yoon Kim
  • Patent number: 10074369
    Abstract: Systems, methods, and devices for escalating voice-based interactions via speech-controlled devices are described. Speech-controlled devices capture audio, including wakeword portions and payload portions, for sending to a server to relay messages between speech-controlled devices. In response to determining the occurrence of an escalation event, such as repeated messages between the same two devices, the system may automatically change a mode of a speech-controlled device, such as no longer requiring a wakeword, no longer requiring an indication of a desired recipient, or automatically connecting two speech-controlled devices in a voice-chat mode. In response to determining the occurrence of further escalation events, the system may initiate a real-time call between the speech-controlled devices.
    Type: Grant
    Filed: September 1, 2016
    Date of Patent: September 11, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Christo Frank Devaraj, Manish Kumar Dalmia, Tony Roy Hardie, Ran Mokady, Nick Ciubotariu, Sandra Lemon
  • Patent number: 10049653
    Abstract: A system including an automatic noise canceling (ANC) headphone and a processor. The ANC headphone has a microphone configured to generate a microphone signal and at least two non-zero ANC gain levels. The processor is configured to receive the microphone signal, determine a characteristic of the microphone signal, identify a revised ANC level from the ANC gain levels based on a comparison of the characteristic to at least one threshold, and output a signal corresponding to the revised ANC level. Methods are also disclosed.
    Type: Grant
    Filed: October 14, 2016
    Date of Patent: August 14, 2018
    Assignee: Avnera Corporation
    Inventors: Amit Kumar, Eric Sorensen
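The threshold comparison described in the abstract above can be sketched as follows. This is an illustrative assumption, not the patented method: the abstract does not say which characteristic of the microphone signal is measured, so RMS amplitude is used here as a stand-in, and the function names, threshold values, and gain levels are hypothetical.

```python
import bisect
import math


def rms(samples):
    """Root-mean-square amplitude, used here as the assumed signal characteristic."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def select_anc_level(samples, thresholds, gain_levels):
    """Pick an ANC gain level by comparing the signal RMS to sorted thresholds.

    `thresholds` must be sorted ascending and `gain_levels` must have exactly
    one more entry than `thresholds` (one level per interval between them).
    """
    if len(gain_levels) != len(thresholds) + 1:
        raise ValueError("need one more gain level than thresholds")
    index = bisect.bisect_right(thresholds, rms(samples))
    return gain_levels[index]


if __name__ == "__main__":
    thresholds = [0.1, 0.5]          # hypothetical RMS boundaries
    gain_levels = [0.25, 0.5, 1.0]   # hypothetical non-zero ANC gains
    print(select_anc_level([0.05, -0.05, 0.05, -0.05], thresholds, gain_levels))
    print(select_anc_level([1.0, -1.0], thresholds, gain_levels))
```

All gain levels in the example are non-zero, matching the abstract's requirement of at least two non-zero ANC gain levels.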
  • Patent number: 10032454
    Abstract: Techniques disclosed herein include systems and methods for open-domain voice-enabled searching that is speaker sensitive. Techniques include using speech information, speaker information, and information associated with a spoken query to enhance open voice search results. This includes integrating a textual index with a voice index to support the entire search cycle. Given a voice query, the system can execute two matching processes simultaneously. This can include a text matching process based on the output of speech recognition, as well as a voice matching process based on characteristics of a caller or user voicing a query. Characteristics of the caller can include output of voice feature extraction and metadata about the call. The system clusters callers according to these characteristics. The system can use specific voice and text clusters to modify speech recognition results, as well as modifying search results.
    Type: Grant
    Filed: June 25, 2015
    Date of Patent: July 24, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Shilei Zhang, Shenghua Bao, Wen Liu, Yong Qin, Zhiwei Shuang, Jian Chen, Zhong Su, Qin Shi, William F. Ganong, III
  • Patent number: 10026399
    Abstract: Architectures and techniques for selecting a voice-enabled device to handle audio input that is detected by multiple voice-enabled devices are described herein. In some instances, multiple voice-enabled devices may detect audio input from a user at substantially the same time, due to the voice-enabled devices being located within proximity to the user. The architectures and techniques may analyze a variety of audio signal metric values for the voice-enabled devices to designate a voice-enabled device to handle the audio input.
    Type: Grant
    Filed: September 11, 2015
    Date of Patent: July 17, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Ramya Gopalan, Shiva Kumar Sundaram
  • Patent number: 10014002
    Abstract: Methods and systems for audio source separation in real-time are described. In an embodiment, the present disclosure describes reading and decoding an audio source into Pulse Code Modulation (PCM) samples, fragmenting the PCM samples into fragments, transforming the fragments into spectrograms, performing audio source separation using a deep neural network (DNN) to generate an estimated magnitude spectrogram of the component(s) of the audio source, reconstructing the estimated time domain component signals, and streaming the component signals to a playback engine. In an embodiment, a semantic equalizer graphical user interface allows for real-time mixing of individual component signals.
    Type: Grant
    Filed: October 24, 2017
    Date of Patent: July 3, 2018
    Assignee: Red Pill VR, Inc.
    Inventors: Alejandro Koretzky, Karthiek Reddy Bokka, Naveen Sasalu Rajashekharappa
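The first two stages named in the abstract above, fragmenting PCM samples and transforming the fragments into spectrograms, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented pipeline: the fragment size, hop, Hann window, and naive DFT are choices made here for clarity, and the DNN separation, reconstruction, and streaming stages are omitted entirely.

```python
import cmath
import math


def fragment(samples, size, hop):
    """Split PCM samples into overlapping fixed-size fragments (frames)."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, hop)]


def magnitude_spectrum(frame):
    """Hann-window one frame and return DFT magnitudes for bins 0..n/2.

    A naive O(n^2) DFT keeps the sketch dependency-free; a real system
    would use an FFT library.
    """
    n = len(frame)
    windowed = [
        (0.5 - 0.5 * math.cos(2 * math.pi * i / n)) * x
        for i, x in enumerate(frame)
    ]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(samples, size=64, hop=32):
    """Magnitude spectrogram: one spectrum per fragment."""
    return [magnitude_spectrum(f) for f in fragment(samples, size, hop)]


if __name__ == "__main__":
    # A test tone centered exactly on DFT bin 8 of a 64-sample frame.
    tone = [math.sin(2 * math.pi * 8 * i / 64) for i in range(256)]
    spec = spectrogram(tone)
    peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
    print(len(spec), peak_bins)
```

In a pipeline like the one described, each row of the spectrogram would be the input from which a separation model estimates the magnitude spectrogram of each component.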