Detect Speech In Noise Patents (Class 704/233)
  • Patent number: 10735861
    Abstract: An apparatus for reducing cross-talk between transmitted audio signals and received audio in a headset. The headset includes one or more of a set of earphones, a headset frame, a microphone boom with an array of MEMS microphones configured to isolate the earphone audio from the microphone audio, a VOX circuit, low crosstalk cable(s), and/or other components. Sets of microphones may be enabled and/or disabled to reduce cross-talk between received audio signals and transmitted audio signals. The VOX circuit is configured to reduce cross-talk between received audio signals and transmitted audio signals.
    Type: Grant
    Filed: February 6, 2019
    Date of Patent: August 4, 2020
    Assignee: HM Electronics, Inc.
    Inventors: Charles Butten, Karl Knoblock, Robert Snyder
  • Patent number: 10725523
    Abstract: Examples disclosed herein provide the ability for a computing device to determine a noise threshold to wake on ambient noises. In one example method, the computing device tracks sound, detected by a microphone of the computing device, over a period of time and, based on the sound tracked over the period of time, determines a noise threshold. The computing device tunes a sensitivity of the microphone to wake the computing device when ambient noise, detected by the microphone, is to have a signal strength equal to or exceeding the noise threshold.
    Type: Grant
    Filed: April 11, 2016
    Date of Patent: July 28, 2020
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Alexander Wayne Clark, Kent E Biggs, Richard E Hodges
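
The thresholding idea in the abstract above can be sketched roughly as follows. This is only an illustration, not the patented method: the rule "mean plus k standard deviations", the function names, and the sample levels are all assumptions made for the example.

```python
from statistics import mean, pstdev


def noise_threshold(levels_db, k=2.0):
    """Derive a wake threshold from sound levels tracked over a period.

    Assumption (not from the abstract): the threshold is the mean of the
    tracked levels plus k population standard deviations.
    """
    if not levels_db:
        raise ValueError("need at least one tracked level")
    return mean(levels_db) + k * pstdev(levels_db)


def should_wake(level_db, threshold_db):
    # Wake when the signal strength equals or exceeds the threshold.
    return level_db >= threshold_db


# Hypothetical ambient levels (dB) tracked over a period of time.
tracked = [40.0, 42.0, 38.0, 40.0]
threshold = noise_threshold(tracked)
```

With these sample values, a 50 dB sound would wake the device while a 41 dB sound would not.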
  • Patent number: 10705620
    Abstract: There is provided a signal processing apparatus including: a control unit that executes, on a basis of a waveform signal generated in accordance with a motion of an attachment portion of a sensor attached to a tool or a body, effect processing for the waveform signal or another waveform signal, the waveform signal being output from the sensor. The signal processing apparatus performs presentation so that a body motion itself can be aurally felt.
    Type: Grant
    Filed: October 6, 2016
    Date of Patent: July 7, 2020
    Assignee: SONY CORPORATION
    Inventors: Heesoon Kim, Masaharu Yoshino, Masahiko Inami, Kouta Minamizawa, Yuta Sugiura, Yusuke Mizushina, Tatsushi Nashida
  • Patent number: 10681450
    Abstract: A wireless earpiece includes a wireless earpiece housing, at least one microphone for detecting ambient environment sound, and a processor disposed within the wireless earpiece housing, the processor configured to distinguish between two or more sources of sound within the ambient environment sound. The wireless earpiece further includes a user interface operatively connected to the processor. The processor is configured to receive user input through the user interface to select one of the sources of sound within the ambient environment sound and wherein the processor is configured to process the ambient environment sound to emphasize portions of the ambient environment sound generated by the one of the sources of the ambient environment sound selected by the user to produce a modified sound. The earpiece may further include a speaker operatively connected to the processor to reproduce the modified sound.
    Type: Grant
    Filed: July 30, 2018
    Date of Patent: June 9, 2020
    Assignee: BRAGI GmbH
    Inventors: Peter Vincent Boesen, Darko Dragicevic
  • Patent number: 10666791
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for evaluating the quality of a communication session. One of the methods includes identifying, by a communication system, a communication session between one or more users of the communication system, wherein, during the communication session, session data is routed between a first communications device of a first user of the communication system and one or more other communications devices along a communication path; obtaining, from each of a plurality of communication nodes along the communication path, quality data relating to a quality of the communication session at the communication node; generating, using the quality data, a model input to a quality score machine learning model; and providing the model input as input to the quality score machine learning model to generate the estimated quality score for at least the portion of the communication session.
    Type: Grant
    Filed: March 12, 2019
    Date of Patent: May 26, 2020
    Assignee: RingCentral, Inc.
    Inventors: Kira Makagon, Helen Prask, Yuri Ardulov, Igor Rusinov, Ivan Gennadevich Anisimov
  • Patent number: 10657960
    Abstract: A dialog content is generated using information that is unique to a user and information that is not unique. The processing executed by a dialog system includes a step of identifying a person based on a dialog with a user, a step of acquiring personal information, a step of analyzing the dialog, a step of extracting an event, a step of searching for a local episode and a global episode based on the personal information and the event, a step of generating dialog data using the search result, a step of outputting a dialog, and a step of accepting user evaluation.
    Type: Grant
    Filed: July 22, 2016
    Date of Patent: May 19, 2020
    Assignee: SHARP KABUSHIKI KAISHA
    Inventors: Rei Tokunaga, Toru Ueda
  • Patent number: 10643614
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for designating certain voice commands as hotwords. The methods, systems, and apparatus include actions of receiving a hotword followed by a voice command. Additional actions include determining that the voice command satisfies one or more predetermined criteria associated with designating the voice command as a hotword, where a voice command that is designated as a hotword is treated as a voice input regardless of whether the voice command is preceded by another hotword. Further actions include, in response to determining that the voice command satisfies one or more predetermined criteria associated with designating the voice command as a hotword, designating the voice command as a hotword.
    Type: Grant
    Filed: December 10, 2018
    Date of Patent: May 5, 2020
    Assignee: Google LLC
    Inventor: Matthew Sharifi
  • Patent number: 10621980
    Abstract: Performing speech recognition in a multi-device system includes receiving a first audio signal that is generated by a first microphone in response to a verbal utterance, and a second audio signal that is generated by a second microphone in response to the verbal utterance; dividing the first audio signal into a first sequence of temporal segments; dividing the second audio signal into a second sequence of temporal segments; comparing a sound energy level associated with a first temporal segment of the first sequence to a sound energy level associated with a first temporal segment of the second sequence; based on the comparing, selecting, as a first temporal segment of a speech recognition audio signal, one of the first temporal segment of the first sequence and the first temporal segment of the second sequence; and performing speech recognition on the speech recognition audio signal.
    Type: Grant
    Filed: March 21, 2017
    Date of Patent: April 14, 2020
    Assignee: Harman International Industries, Inc.
    Inventor: Seon Man Kim
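
A minimal sketch of the per-segment selection described in the abstract above. The abstract says only that a segment is selected "based on the comparing"; picking the higher-energy segment, the sum-of-squares energy measure, and all function names are assumptions made for illustration.

```python
def segment_energy(segment):
    # Sum of squared samples as a simple sound energy measure (assumption).
    return sum(x * x for x in segment)


def split_segments(signal, size):
    # Divide a signal into consecutive temporal segments of `size` samples.
    return [signal[i:i + size] for i in range(0, len(signal), size)]


def select_loudest(first_signal, second_signal, size):
    """Build one signal by keeping, per segment, the higher-energy source."""
    out = []
    pairs = zip(split_segments(first_signal, size),
                split_segments(second_signal, size))
    for a, b in pairs:
        out.extend(a if segment_energy(a) >= segment_energy(b) else b)
    return out


# Hypothetical samples from two microphones hearing the same utterance.
mic1 = [0.9, 0.8, 0.1, 0.1]
mic2 = [0.2, 0.1, 0.7, 0.6]
combined = select_loudest(mic1, mic2, size=2)
```

Here the first segment comes from `mic1` and the second from `mic2`; the combined signal would then be passed to a speech recognizer.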
  • Patent number: 10607597
    Abstract: A speech signal recognition method, apparatus, and system. The speech signal recognition method may include obtaining by or from a terminal an output of a personalization layer, with respect to a speech signal provided by a user of the terminal, having been implemented by input of the speech signal to the personalization layer, the personalization layer being previously trained based on speech features of the user, implementing a global model by providing the obtained output of the personalization layer to the global model, the global model being configured to output a phonemic signal indicating a phoneme included in the speech signal through the global model being previously trained based on speech features common to a plurality of users, and re-training the personalization layer based on the phonemic signal output from the global model, where the personalization layer and the global model collectively represent an acoustic model.
    Type: Grant
    Filed: March 9, 2018
    Date of Patent: March 31, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Minyoung Mun, SangHyun Yoo, Young Sang Choi, Ki Soo Kwon, Hodong Lee
  • Patent number: 10607600
    Abstract: A system and method of updating automatic speech recognition parameters on a mobile device are disclosed. The method comprises storing user account-specific adaptation data associated with ASR on a computing device associated with a wireless network, generating new ASR adaptation parameters based on transmitted information from the mobile device when a communication channel between the computing device and the mobile device becomes available and transmitting the new ASR adaptation data to the mobile device when a communication channel between the computing device and the mobile device becomes available. The new ASR adaptation data on the mobile device more accurately recognizes user utterances.
    Type: Grant
    Filed: February 12, 2018
    Date of Patent: March 31, 2020
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Sarangarajan Parthasarathy, Richard Cameron Rose
  • Patent number: 10602387
    Abstract: A second device that is in communication with a first device receives transmissions of the first device and detects a SILENCE period status of the first device, which corresponds to a status wherein the first device has no speech samples to be transmitted towards the second apparatus. The second device determines the type of the received transmissions, counts the number of received transmissions of a first type, and times a time interval between the last received transmission of the first type and the last received transmission of a second determined type. At reception of a transmission of the first type, the second device detects whether the first device is in the SILENCE period status on an evaluation of the counted number of transmissions of the first type and the time interval of the last received transmission of the first type and the last received transmission of the second type.
    Type: Grant
    Filed: July 12, 2018
    Date of Patent: March 24, 2020
    Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)
    Inventors: Carola Faronius, Saad Naveed Ahmed, Don Corry
  • Patent number: 10593317
    Abstract: A road noise cancellation (RNC) system may include a controller and attenuator for reducing the audibility of the noise floor caused by the system's vibration sensors. A level of anti-noise at a location in a passenger cabin that may be attributed to the sensor noise floor may be estimated. An actual sound level in the passenger cabin may be measured or estimated, with the sensor noise floor component algorithmically removed. The difference in levels may be compared to a predetermined threshold to determine an amount of attenuation, if any, to be applied to an anti-noise signal to reduce audibility.
    Type: Grant
    Filed: December 20, 2018
    Date of Patent: March 17, 2020
    Assignee: HARMAN INTERNATIONAL INDUSTRIES, INCORPORATED
    Inventors: Kevin J. Bastyr, James May
  • Patent number: 10586557
    Abstract: According to one aspect, a method for determining voice activity is disclosed, the method including receiving a frame of an input audio signal, the input audio signal having a sample rate, and splitting the audio signal into a plurality of subbands, the plurality of subbands including at least a lowest subband and a highest subband. The method further comprises filtering the lowest subband to reduce an energy of the lowest subband, estimating a noise level for at least some of the plurality of subbands, and computing a signal-to-noise ratio for at least some of the plurality of subbands. The method also includes determining a speech activity level based at least in part on the computed signal-to-noise ratios and an average of an energy of at least some of the plurality of subbands.
    Type: Grant
    Filed: July 19, 2019
    Date of Patent: March 10, 2020
    Assignee: Dolby Laboratories Licensing Corporation
    Inventor: Hannes Muesch
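
The final decision step of the abstract above might look roughly like this. The sketch assumes the subband energies and noise estimates have already been computed (it skips subband splitting and the filtering of the lowest subband), and the combination rule, thresholds, and names are hypothetical.

```python
import math


def is_speech(subband_energy, noise_level,
              snr_threshold_db=6.0, energy_floor=0.5):
    """Decide voice activity for one frame from per-subband values.

    Assumptions (not from the abstract): the frame counts as speech when
    the mean subband SNR in dB reaches snr_threshold_db and the mean
    subband energy reaches energy_floor. All inputs must be positive.
    """
    snrs_db = [10.0 * math.log10(e / n)
               for e, n in zip(subband_energy, noise_level)]
    mean_snr_db = sum(snrs_db) / len(snrs_db)
    mean_energy = sum(subband_energy) / len(subband_energy)
    return mean_snr_db >= snr_threshold_db and mean_energy >= energy_floor


# Hypothetical per-subband energies and noise estimates for two frames.
noise = [1.0, 1.0, 1.0]
loud_frame = is_speech([10.0, 20.0, 10.0], noise)
quiet_frame = is_speech([1.1, 0.9, 1.0], noise)
```

The first frame sits well above the noise estimate in every subband and is flagged as speech; the second is at the noise level and is not.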
  • Patent number: 10573304
    Abstract: The present disclosure relates to speech recognition systems and methods using an adaptive incremental learning approach. More specifically, the present disclosure relates to adaptive incremental learning in a self-taught vocal user interface.
    Type: Grant
    Filed: November 4, 2015
    Date of Patent: February 25, 2020
    Assignee: KATHOLIEKE UNIVERSITEIT LEUVEN
    Inventors: Jort Gemmeke, Bart Ons, Hugo Van Hamme
  • Patent number: 10573314
    Abstract: Systems and methods are disclosed. A digitized human vocal expression of a user and digital images are received over a network from a remote device. The digitized human vocal expression is processed to determine characteristics of the human vocal expression, including: pitch, volume, rapidity, a magnitude spectrum identify, and/or pauses in speech. Digital images are received and processed to detect characteristics of the user face, including detecting if one or more of the following is present: a sagging lip, a crooked smile, uneven eyebrows, and/or facial droop. Based at least in part on the human vocal expression characteristics and face characteristics, a determination is made as to what action is to be taken. A cepstrum pitch may be determined using an inverse Fourier transform of a logarithm of a spectrum of a human vocal expression signal. The volume may be determined using peak heights in a power spectrum of the human vocal expression.
    Type: Grant
    Filed: February 27, 2019
    Date of Patent: February 25, 2020
    Inventor: Karen Elaine Khaleghi
  • Patent number: 10566012
    Abstract: A speech recognition system utilizing automatic speech recognition techniques such as end-pointing techniques in conjunction with beamforming and/or signal processing to isolate speech from one or more speaking users from multiple received audio signals and to detect the beginning and/or end of the speech based at least in part on the isolation. Audio capture devices such as microphones may be arranged in a beamforming array to receive the multiple audio signals. Multiple audio sources including speech may be identified in different beams and processed.
    Type: Grant
    Filed: October 12, 2018
    Date of Patent: February 18, 2020
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Kenneth John Basye, Jeffrey Penrod Adams
  • Patent number: 10564925
    Abstract: Many headsets include automatic noise cancellation (ANC) which dramatically reduces perceived background noise and improves user listening experience. Unfortunately, the voice microphones in these devices often capture ambient noise that the headsets output during phone calls or other communication sessions to other users. In response, many headsets and communication devices provide manual muting circuitry, but users frequently forget to turn the muting on and/or off, creating further problems as they communicate. To address this, the present inventors devised, among other things, an exemplary headset that detects the absence or presence of user speech, automatically muting and unmuting the voice microphone without user intervention. Some embodiments leverage relationships between feedback and feedforward signals in ANC circuitry to detect user speech, avoiding the addition of extra hardware to the headset.
    Type: Grant
    Filed: September 21, 2017
    Date of Patent: February 18, 2020
    Inventors: Jiajin An, Michael Jon Wurtz, David Wurtz, Manpreet Khaira, Amit Kumar, Shawn O'Connor, Shankar Rathoud, James Scanlan, Eric Sorensen
  • Patent number: 10555133
    Abstract: A method includes receiving, by sensors inside an enclosure of a vehicle, signals generated by signal generators in the enclosure of the vehicle. One of the sensors or signal generators may be part of a mobile device inside the enclosure. The method also includes determining a location and orientation of the mobile device from the signals. The method further includes determining, based on the location and orientation of the mobile device, an object in the enclosure that the mobile device is pointing to. The method further includes transmitting a message to the mobile device in response to determining that the mobile device is pointing to the object, so as to cause the mobile device to display a user interface to allow the mobile device to control the object.
    Type: Grant
    Filed: September 14, 2017
    Date of Patent: February 4, 2020
    Assignee: Apple Inc.
    Inventors: Sawyer I. Cohen, Jack J. Wanderman, Romain A. Teil, Scott M. Herz
  • Patent number: 10540969
    Abstract: A purpose of the present invention is to provide a technique for easily performing accurate voice recognition.
    Type: Grant
    Filed: July 21, 2016
    Date of Patent: January 21, 2020
    Assignee: Clarion Co., Ltd.
    Inventors: Takashi Yamaguchi, Yasushi Nagai
  • Patent number: 10535340
    Abstract: Audio information defining audio content may be accessed. The audio content may have a duration. The audio content may be segmented into audio segments. Individual audio segments may correspond to a portion of the duration. Feature vectors of the audio segments may be determined. The feature vectors may be processed through a classifier. The classifier may output scores on whether the audio segments contain voice. One or more of the audio segments may be identified as containing voice based on the scores and a two-step hysteresis thresholding. Storage of the identification of the one or more of the audio segments as containing voice in one or more storage media may be effectuated.
    Type: Grant
    Filed: August 15, 2019
    Date of Patent: January 14, 2020
    Assignee: GoPro, Inc.
    Inventor: Gabriel Lema
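
One common reading of hysteresis thresholding on classifier scores is sketched below. The abstract does not define its "two-step hysteresis thresholding", so the high/low threshold pair, their values, and the function name are assumptions made for illustration.

```python
def hysteresis_voice_flags(scores, high=0.7, low=0.4):
    """Flag segments as voice using two thresholds.

    Assumption: a run of voice starts when a score reaches `high` and
    continues until a score drops below `low`.
    """
    flags = []
    active = False
    for score in scores:
        if active:
            active = score >= low   # stay in voice until below the low threshold
        else:
            active = score >= high  # enter voice only at the high threshold
        flags.append(active)
    return flags


# Hypothetical per-segment classifier scores.
scores = [0.2, 0.8, 0.5, 0.45, 0.3, 0.5, 0.9]
flags = hysteresis_voice_flags(scores)
```

Note that the score 0.5 is kept as voice while a run is active (third segment) but does not start a new run on its own (sixth segment), which is the point of using two thresholds.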
  • Patent number: 10529348
    Abstract: An apparatus for generating an enhanced signal from an input signal, wherein the enhanced signal has spectral values for an enhancement spectral region, the spectral values for the enhancement spectral regions not being contained in the input signal, includes a mapper for mapping a source spectral region of the input signal to a target region in the enhancement spectral region, the source spectral region including a noise-filling region; and a noise filler configured for generating first noise values for the noise-filling region in the source spectral region of the input signal and for generating second noise values for a noise region in the target region, wherein the second noise values are decorrelated from the first noise values or for generating second noise values for a noise region in the target region, wherein the second noise values are decorrelated from first noise values in the source region.
    Type: Grant
    Filed: January 24, 2017
    Date of Patent: January 7, 2020
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Sascha Disch, Ralf Geiger, Andreas Niedermeier, Matthias Neusinger, Konstantin Schmidt, Stephan Wilde, Benjamin Schubert, Christian Neukam
  • Patent number: 10529358
    Abstract: A method for reducing noise to a user to enable a conversation-of-interest to be heard, the noise originating from a noise source, the method comprising the steps of: operating at least one first device located at a first distance from the noise source, the user having noise-cancellation earphones connected to a second mobile device, the second mobile device located at a second distance from the noise source, the first distance less than the second distance; prehearing noise from the noise source using the at least one first device; analyzing the preheard noise to yield a respective analyzed noise signal; and processing the respective analyzed noise signal to effect noise cancellation for the noise-cancellation earphones.
    Type: Grant
    Filed: February 23, 2018
    Date of Patent: January 7, 2020
    Inventor: Shmuel Ur
  • Patent number: 10506990
    Abstract: Aspects of the subject matter described in this disclosure can be implemented in a fall detection device and method. One or more motion sensors can access a user's acceleration data. The acceleration data can be segmented using a segmentation algorithm to identify a potential fall event. The segmentation algorithm can determine a cumulative sum of the acceleration data, where the cumulative sum is based on acceleration values being greater than or less than an acceleration threshold value, and a potential fall event can be identified where the cumulative sum is greater than a cumulative sum threshold value. Statistical features can be extracted from the segmented acceleration data and aggregated, and a determination can be made as to whether the potential fall event is a fall event based at least in part on the statistical features.
    Type: Grant
    Filed: September 9, 2016
    Date of Patent: December 17, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Jin Won Lee, Xinzhou Wu, Rashid Ahmed Akbar Attar, Feng Han
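
The cumulative-sum segmentation in the abstract above can be sketched as follows. The exact update rule is not given in the abstract; the one-sided sum floored at zero, the threshold values, and the sample data are assumptions made for illustration.

```python
def potential_fall_indices(accel, accel_threshold, cusum_threshold):
    """Return sample indices where the cumulative sum exceeds its threshold.

    Assumption: the sum grows by the excess over accel_threshold, shrinks
    when a sample is below it, and is never allowed to go negative.
    """
    cusum = 0.0
    hits = []
    for i, a in enumerate(accel):
        cusum = max(0.0, cusum + (a - accel_threshold))
        if cusum > cusum_threshold:
            hits.append(i)
    return hits


# Hypothetical acceleration magnitudes in g, with a spike at indices 3 and 4.
samples = [1.0, 1.1, 0.9, 3.0, 3.5, 1.0, 1.0]
hits = potential_fall_indices(samples, accel_threshold=1.5, cusum_threshold=2.0)
```

The flagged indices mark a candidate segment only; per the abstract, statistical features from that segment would then be used to decide whether it is an actual fall.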
  • Patent number: 10489452
    Abstract: A method is provided for triggering an action on a second device. It comprises the steps of obtaining audio of a multimedia content presented on a first device; comparing the obtained audio with reference audio data in a database; if the obtained audio is found in the database containing reference audio, determining an action corresponding to the matched reference audio; and triggering the action in the second device.
    Type: Grant
    Filed: September 11, 2018
    Date of Patent: November 26, 2019
    Assignee: InterDigital Madison Patent Holdings, SAS
    Inventors: Jianfeng Chen, Xiaojun Ma, Zhigang Zhang
  • Patent number: 10482877
    Abstract: Examples described herein include systems, methods, and devices for transmitting a media signal to the remote sensor, receiving a sound signal from the remote sensor, and monitoring the sound signal and the media signal to recognize voice commands.
    Type: Grant
    Filed: August 28, 2015
    Date of Patent: November 19, 2019
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: David H. Hanes
  • Patent number: 10477403
    Abstract: A system and method for monitoring telephone calls to detect fraudulent activity and take corrective action is described. The system receives a group of telephone calls having associated call characteristics and analyzes the group of telephone calls to identify and store a first set of distributions of call characteristics that are indicative of normal activity, fraudulent activity, or indeterminate activity. The system receives one or more subsequent telephone calls to be analyzed. The system analyzes the received one or more telephone calls to identify a second set of distributions of call characteristics associated with the received telephone call. The system then compares the second set of distributions of call characteristics to the stored first set of distributions of call characteristics to assess a probability that the one or more received telephone calls represents normal, fraudulent, or indeterminate activity.
    Type: Grant
    Filed: October 22, 2018
    Date of Patent: November 12, 2019
    Assignee: Marchex, Inc.
    Inventors: Jason Flaks, Ziad Ismail
  • Patent number: 10469936
    Abstract: A method, system, and apparatus for noise cancelation is disclosed herein, which may be used in a wireless unit (WU) and a headset removably connected to the WU. The WU may include a processor, a memory, a user interface, internal microphones and internal speakers. The headset may include microphones and speakers. The WU may receive a first ambient noise from one or more headset microphones, which may generate a first signal based on the first ambient noise. The WU may calculate an estimate of ambient noise based on the first signal, calculate a signal for noise cancellation based on the estimate of ambient noise, cancel estimated ambient noise from an audio output signal based on the application of the signal for noise cancellation, and send the audio output signal to the speakers of the headset or the speakers of the WU.
    Type: Grant
    Filed: May 7, 2018
    Date of Patent: November 5, 2019
    Inventor: Fatih Mehmet Ozluturk
  • Patent number: 10468032
    Abstract: Techniques related to speaker recognition are discussed. Such techniques include determining context aware confidence values formed of false accept and false reject rates determined by using adaptively updated acoustic environment score distributions matched to current score distributions.
    Type: Grant
    Filed: April 10, 2017
    Date of Patent: November 5, 2019
    Assignee: Intel Corporation
    Inventors: Jonathan J. Huang, Gokcen Cilingir, Tobias Bocklet
  • Patent number: 10453457
    Abstract: A method and device for performing voice control on a device with a microphone array are disclosed. The method includes the following steps. It is confirmed that the device is in an audio playing state. An interference sound interfering with the device in the audio playing state is analyzed. A voice enhancement mode adopted by the device is selected according to a feature of the interference sound. A user's voice is detected in real time for a wake-up word, and when the wake-up word is detected, the device is controlled to stop audio playing. An interference sound interfering with the device after audio playing is stopped is analyzed, and the voice enhancement mode adopted by the device is adjusted according to a feature of the interference sound. A command word from a user is acquired to control the device to execute a corresponding function, to respond to the user.
    Type: Grant
    Filed: December 20, 2017
    Date of Patent: October 22, 2019
    Assignee: Beijing Xiaoniao Tingting Technology, Co., Ltd.
    Inventors: Bo Li, Shasha Lou
  • Patent number: 10453443
    Abstract: This relates to providing an indication of the suitability of an acoustic environment for performing speech recognition. One process can include receiving an audio input and determining a speech recognition suitability based on the audio input. The speech recognition suitability can include a numerical, textual, graphical, or other representation of the suitability of an acoustic environment for performing speech recognition. The process can further include displaying a visual representation of the speech recognition suitability to indicate the likelihood that a spoken user input will be interpreted correctly. This allows a user to determine whether to proceed with the performance of a speech recognition process, or to move to a different location having a better acoustic environment before performing the speech recognition process.
    Type: Grant
    Filed: August 22, 2018
    Date of Patent: October 22, 2019
    Assignee: Apple Inc.
    Inventor: Yoon Kim
  • Patent number: 10455319
    Abstract: A method, a system, and a computer program product for reducing noise in audio received by at least one microphone. The method includes determining, from an audio signal received by at least one primary microphone of an electronic device, whether a user that is proximate to the electronic device is currently speaking. The method further includes, in response to determining that a user is not currently speaking, receiving a first audio using a first microphone subset from among a plurality of microphones and receiving at least one second audio using at least one second microphone subset from among the plurality of microphones. The method further includes generating a composite signal from the first audio and the second audio. The method further includes collectively processing the audio signal and the composite signal to generate a modified audio signal having a reduced level of noise.
    Type: Grant
    Filed: July 18, 2018
    Date of Patent: October 22, 2019
    Assignee: Motorola Mobility LLC
    Inventors: Jincheng Wu, Joel A. Clark, Malay Gupta, Plamen A. Ivanov
  • Patent number: 10446140
    Abstract: Disclosed are systems, methods, and computer readable media for identifying an acoustic environment of a caller. The method embodiment comprises analyzing acoustic features of a received audio signal from a caller, receiving meta-data information based on a previously recorded speed of the caller, classifying a background environment of the caller based on the analyzed acoustic features and the meta-data, selecting an acoustic model matched to the classified background environment from a plurality of acoustic models, and performing speech recognition on the received audio signal using the selected acoustic model.
    Type: Grant
    Filed: September 24, 2018
    Date of Patent: October 15, 2019
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventor: Mazin Gilbert
  • Patent number: 10446167
    Abstract: Systems, methods, and devices for user-specific noise suppression are provided. For example, when a voice-related feature of an electronic device is in use, the electronic device may receive an audio signal that includes a user voice. Since noise, such as ambient sounds, also may be received by the electronic device at this time, the electronic device may suppress such noise in the audio signal. In particular, the electronic device may suppress the noise in the audio signal while substantially preserving the user voice via user-specific noise suppression parameters. These user-specific noise suppression parameters may be based at least in part on a user noise suppression preference or a user voice profile, or a combination thereof.
    Type: Grant
    Filed: January 27, 2014
    Date of Patent: October 15, 2019
    Assignee: Apple Inc.
    Inventors: Aram Lindahl, Baptiste Pierre Paquier
  • Patent number: 10430708
    Abstract: In some embodiments, noise data may be used to train a neural network (or other prediction model). In some embodiments, input noise data may be obtained and provided to a prediction model to obtain an output related to the input noise data (e.g., the output being a prediction related to the input noise data). One or more target output indications may be provided as reference feedback to the prediction model to update one or more portions of the prediction model, wherein the one or more portions of the prediction model are updated based on the related output and the target indications. Subsequent to the portions of the prediction model being updated, a data item may be provided to the prediction model to obtain a prediction related to the data item (e.g., a different version of the data item, a location of an aspect in the data item, etc.).
    Type: Grant
    Filed: October 26, 2018
    Date of Patent: October 1, 2019
    Assignee: AIVITAE LLC
    Inventors: Bob Sueh-chien Hu, Joseph Yitang Cheng
  • Patent number: 10431213
    Abstract: The technology described in this document can be embodied in a computer-implemented method that includes receiving, at a processing system, a first signal including an output of a speaker device and an additional audio signal. The method also includes determining, by the processing system, based at least in part on a model trained to identify the output of the speaker device, that the additional audio signal corresponds to an utterance of a user. The method further includes initiating a reduction in an audio output level of the speaker device based on determining that the additional audio signal corresponds to the utterance of the user.
    Type: Grant
    Filed: February 2, 2018
    Date of Patent: October 1, 2019
    Assignee: Google LLC
    Inventors: Diego Melendo Casado, Ignacio Lopez Moreno, Javier Gonzalez-Dominguez
  • Patent number: 10424294
    Abstract: Audio information defining audio content may be accessed. The audio content may have a duration. The audio content may be segmented into audio segments. Individual audio segments may correspond to a portion of the duration. Feature vectors of the audio segments may be determined. The feature vectors may be processed through a classifier. The classifier may output scores on whether the audio segments contain voice. One or more of the audio segments may be identified as containing voice based on the scores and a two-step hysteresis thresholding. Storage of the identification of the one or more of the audio segments as containing voice in one or more storage media may be effectuated.
    Type: Grant
    Filed: January 3, 2018
    Date of Patent: September 24, 2019
    Assignee: GoPro, Inc.
    Inventor: Gabriel Lema
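As an illustration of the hysteresis thresholding mentioned in the abstract of 10424294, here is a minimal sketch. The abstract does not give threshold values or the exact rule, so the single-pass form below is an assumption: a voice run starts when a classifier score reaches a high threshold and continues until the score falls below a low threshold. The function name and both threshold values are hypothetical.

```python
def hysteresis_voice_mask(scores, high=0.7, low=0.4):
    """Label each segment as voice (True) or not (False).

    Assumed rule: enter the voice state when a score reaches `high`,
    and stay in it until a score drops below `low`.
    Both thresholds are illustrative, not taken from the patent.
    """
    mask = []
    in_voice = False
    for score in scores:
        if in_voice:
            in_voice = score >= low   # stay in voice while above the low threshold
        else:
            in_voice = score >= high  # enter voice only at the high threshold
        mask.append(in_voice)
    return mask


scores = [0.1, 0.8, 0.5, 0.45, 0.3, 0.6, 0.9, 0.5]
print(hysteresis_voice_mask(scores))
# → [False, True, True, True, False, False, True, True]
```

Note that the score 0.6 is rejected (it never reached the high threshold), while 0.5 and 0.45 are kept because they follow a segment that did.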
  • Patent number: 10418052
    Abstract: According to one aspect, a method for detecting voice activity is disclosed, the method including receiving a frame of an input audio signal, the input audio signal having a sample rate; dividing the frame into a plurality of subbands based on the sample rate, the plurality of subbands including at least a lowest subband and a highest subband; filtering the lowest subband with a moving average filter to reduce an energy of the lowest subband; estimating a noise level for each of the plurality of subbands; calculating a signal to noise ratio value for each of the plurality of subbands; and determining a speech activity level of the frame based on an average of the calculated signal to noise ratio values and a weighted average of an energy of each of the plurality of subbands. Other aspects include audio decoders that decode audio that was encoded using the methods described herein.
    Type: Grant
    Filed: October 12, 2017
    Date of Patent: September 17, 2019
    Assignee: Dolby Laboratories Licensing Corporation
    Inventor: Hannes Muesch
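The final decision step in the abstract of 10418052 combines an average of per-subband signal-to-noise ratios with a weighted average of subband energies. The sketch below covers only that step and assumes the subband energies and noise estimates are already available. The abstract does not say how the two quantities are combined, so the simple "both must pass a threshold" rule, the function names, the weights, and the threshold values are all assumptions for illustration.

```python
import math


def subband_snr_db(energies, noise_levels, eps=1e-12):
    """Per-subband SNR in dB; `eps` guards against division by zero."""
    return [
        10.0 * math.log10((e + eps) / (n + eps))
        for e, n in zip(energies, noise_levels)
    ]


def is_speech_frame(energies, noise_levels, weights,
                    snr_threshold_db=6.0, energy_threshold=1.0):
    """Assumed decision rule: the mean SNR and the weighted mean energy
    must both reach their (illustrative) thresholds."""
    snrs = subband_snr_db(energies, noise_levels)
    mean_snr = sum(snrs) / len(snrs)
    weighted_energy = sum(w * e for w, e in zip(weights, energies)) / sum(weights)
    return mean_snr >= snr_threshold_db and weighted_energy >= energy_threshold


# Three subbands: energies well above the noise floor versus at the noise floor.
print(is_speech_frame([4.0, 8.0, 2.0], [1.0, 1.0, 1.0], [1, 2, 1]))
print(is_speech_frame([1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1, 2, 1]))
```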
  • Patent number: 10403303
    Abstract: Audio content may have a duration. The audio content may be segmented into audio segments. Individual audio segments may correspond to a portion of the duration. Mel frequency spectral power features, Mel frequency cepstral coefficient features, and energy features of the audio segments may be determined. Feature vectors of the audio segments may be determined based on the Mel frequency spectral power features, the Mel frequency cepstral coefficient features, and the energy features. The feature vectors may be processed through a support vector machine. The support vector machine may output predictions on whether the audio segments contain speech. One or more of the audio segments may be identified as containing speech based on filtering the predictions and comparing the filtered predictions to a threshold. Storage of the identification of the one or more of the audio segments as containing speech in one or more storage media may be effectuated.
    Type: Grant
    Filed: November 2, 2017
    Date of Patent: September 3, 2019
    Assignee: GoPro, Inc.
    Inventors: Tom Médioni, Vincent Garcia
  • Patent number: 10403279
    Abstract: A system for detecting and capturing voice commands, the system comprising a voice-activity detector (VAD) configured to receive a VAD-received digital-audio signal; determine the amplitude of the VAD-received digital-audio signal; compare the amplitude of the VAD-received digital-audio signal to a first threshold and to a second threshold; withhold a VAD interrupt signal when the amplitude of the VAD-received digital-audio signal does not exceed the first threshold or the second threshold; generate the VAD interrupt signal when the amplitude of the VAD-received digital-audio signal exceeds the first threshold and the second threshold; and perform spectral analysis of the VAD-received digital-audio signal when the amplitude of the VAD-received digital-audio signal is between the first threshold and the second threshold.
    Type: Grant
    Filed: September 15, 2017
    Date of Patent: September 3, 2019
    Assignee: Avnera Corporation
    Inventors: Xudong Zhao, Alexander C. Stange, Shawn O'Connor, Ali Hadiashar
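The three-way amplitude test described in the abstract of 10403279 can be sketched as a small decision function. The names `VadAction` and `vad_decision` are hypothetical, and treating an amplitude exactly equal to a threshold as "not exceeding" it is an assumption; the abstract does not address that boundary case.

```python
from enum import Enum


class VadAction(Enum):
    WITHHOLD = "withhold interrupt"
    SPECTRAL_ANALYSIS = "perform spectral analysis"
    INTERRUPT = "generate interrupt"


def vad_decision(amplitude, first_threshold, second_threshold):
    """Map an amplitude to one of the three actions in the abstract."""
    lower = min(first_threshold, second_threshold)
    upper = max(first_threshold, second_threshold)
    if amplitude > upper:
        return VadAction.INTERRUPT          # exceeds both thresholds
    if amplitude > lower:
        return VadAction.SPECTRAL_ANALYSIS  # between the two thresholds
    return VadAction.WITHHOLD               # exceeds neither threshold


for amp in (0.05, 0.3, 0.9):
    print(amp, vad_decision(amp, 0.1, 0.5).value)
```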
  • Patent number: 10403265
    Abstract: An object is to provide a technique that allows voice recognition of voice including a plurality of languages while suppressing a data size of a voice recognition dictionary. A voice recognition dictionary includes a plurality of place name dictionaries and a plurality of house number dictionaries in which phonemes in a different language are mapped to phonemes in a corresponding language. Out of the plurality of place name dictionaries, one place name dictionary is set, which a language-specific voice recognition unit set by a voice recognition language setting unit may perform voice recognition in phonemes of the corresponding language, and out of the plurality of house number dictionaries, one house number dictionary is set, which the language-specific voice recognition unit may perform voice recognition by substituting phonemes in a different language for the phonemes in the corresponding language.
    Type: Grant
    Filed: December 24, 2014
    Date of Patent: September 3, 2019
    Assignee: Mitsubishi Electric Corporation
    Inventor: Yuzo Maruta
  • Patent number: 10395667
    Abstract: In accordance with embodiments of the present disclosure, a method for detecting near-field sources in an audio device may include computing a normalized cross correlation function between a first microphone signal and a second microphone signal, computing normalized auto correlation functions of each of the first microphone signal and the second microphone signal, partitioning the normalized cross correlation function and the normalized auto correlation functions into a plurality of time lag regions, computing for each respective time lag region of the plurality of the time lag regions a respective maximum deviation between the normalized cross correlation function and a normalized auto correlation function within the respective time lag region, combining the respective maximum deviations from the plurality of time lag regions to derive multiple detection statistics, and comparing each detection statistic of the multiple detection statistics to a respective threshold to detect a near-field signal.
    Type: Grant
    Filed: May 12, 2017
    Date of Patent: August 27, 2019
    Assignee: Cirrus Logic, Inc.
    Inventor: Samuel P. Ebenezer
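The correlation steps in the abstract of 10395667 can be illustrated with a short pure-Python sketch: compute normalized cross- and auto-correlation over a range of time lags, split the lags into regions, and take the maximum deviation per region. The function names, the normalization by the square root of the energy product, and the region boundaries are assumptions; the later steps of the claimed method (combining deviations into detection statistics and thresholding them) are not shown.

```python
import math


def normalized_correlation(x, y, max_lag):
    """Return {lag: value} for lags in [-max_lag, max_lag].

    Each value is sum(x[i] * y[i + lag]) divided by sqrt(energy(x) * energy(y)).
    """
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i, a in enumerate(x):
            j = i + lag
            if 0 <= j < len(y):
                acc += a * y[j]
        result[lag] = acc / norm if norm else 0.0
    return result


def max_deviation_per_region(cross, auto, regions):
    """For each (first_lag, last_lag) region, the largest |cross - auto|."""
    return [
        max(abs(cross[lag] - auto[lag]) for lag in range(first, last + 1))
        for first, last in regions
    ]


mic1 = [0, 1, 0, -1, 0, 1, 0, -1]
mic2 = [0, 0, 1, 0, -1, 0, 1, 0]   # mic1 delayed by one sample
cross = normalized_correlation(mic1, mic2, max_lag=3)
auto = normalized_correlation(mic1, mic1, max_lag=3)
print(max_deviation_per_region(cross, auto, [(-3, -1), (0, 0), (1, 3)]))
```

With the one-sample delay above, the cross-correlation peaks at lag 1 while the auto-correlation peaks at lag 0, so the deviations are nonzero.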
  • Patent number: 10389657
    Abstract: A system and method for voice transmission over high level network protocols. On the Internet and the World Wide Web, such high level protocols are HTTP/TCP. The restrictions imposed by firewalls and proxy servers are avoided by using HTTP level connections to transmit voice data. In addition, packet delivery guarantees are obtained by using TCP instead of UDP. Variable compression based on silence detection takes advantage of the natural silences and pauses in human speech, thus reducing the delays in transmission caused by using HTTP/TCP. The silence detection includes the ability to bookend the voice data sent with small portions of silence to ensure that the voice sounds natural. Finally, the voice data is transmitted to each client computer independently from a common circular list of voice data, thus ensuring that all clients will stay current with the most recent voice data.
    Type: Grant
    Filed: October 9, 2013
    Date of Patent: August 20, 2019
    Assignee: OPEN INVENTION NETWORK, LLC.
    Inventors: Andrew W. Scherpbier, Mark Randle Boyns
  • Patent number: 10380265
    Abstract: A method for translation supply chain analytics includes receiving operational variables of a translation process from a translation supply chain. The method further includes determining a cognitive leverage and a productivity factor for post editing of matches of a plurality of match types generated by the translation supply chain based at least in part on the operational variables from the translation supply chain. The method further includes generating linguistic markers for the matches of the plurality of match types generated by the translation supply chain, based at least in part on the cognitive leverage and the productivity factor for the post editing of the matches of the plurality of match types. The method further includes performing statistical analysis of the linguistic markers for the matches of the plurality of match types. The method further includes generating one or more analytics outputs based on the statistical analysis of the linguistic markers.
    Type: Grant
    Filed: November 21, 2016
    Date of Patent: August 13, 2019
    Assignee: International Business Machines Corporation
    Inventors: Alejandro Martinez Corria, Francis X. Rojas, Linda F. Traudt, Saroj K. Vohra
  • Patent number: 10373630
    Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for distributed automatic speech recognition. An example apparatus includes a detector to process an input audio signal and identify a portion of the input audio signal including a sound to be evaluated, the sound to be evaluated organized into a plurality of audio features representing the sound. The example apparatus includes a quantizer to process the audio features using a quantization process to reduce the audio features to generate a reduced set of audio features for transmission. The example apparatus includes a transmitter to transmit the reduced set of audio features over a low-energy communication channel for processing.
    Type: Grant
    Filed: March 31, 2017
    Date of Patent: August 6, 2019
    Assignee: Intel Corporation
    Inventors: Binuraj K. Ravindran, Francis M. Tharappel, Prabhakar R. Datta, Tobias Bocklet, Maciej Muchlinski, Tomasz Dorau, Josef G. Bauer, Saurin Shah, Georg Stemmer
  • Patent number: 10373611
    Abstract: Methods and systems for modification of electronic system operation based on acoustic ambience classification are presented. In an example method, at least one audio signal present in a physical environment of a user is detected. The at least one audio signal is analyzed to extract at least one audio feature from the audio signal. The audio signal is classified based on the audio feature to produce at least one classification of the audio signal. Operation of an electronic system interacting with the user in the physical environment is modified based on the classification of the audio signal.
    Type: Grant
    Filed: January 3, 2014
    Date of Patent: August 6, 2019
    Assignee: Gracenote, Inc.
    Inventors: Suresh Jeyachandran, Vadim Brenner, Markus K. Cremer
  • Patent number: 10373604
    Abstract: An acoustic model is adapted, relating acoustic units to speech vectors. The acoustic model comprises a set of acoustic model parameters related to a given speech factor. The acoustic model parameters enable the acoustic model to output speech vectors with different values of the speech factor. The method comprises inputting a sample of speech which is corrupted by noise; determining values of the set of acoustic model parameters which enable the acoustic model to output speech with said first value of the speech factor; and employing said determined values of the set of speech factor parameters in said acoustic model. The acoustic model parameters are obtained by obtaining corrupted speech factor parameters using the sample of speech, and mapping the corrupted speech factor parameters to clean acoustic model parameters using noise characterization parameters characterizing the noise.
    Type: Grant
    Filed: February 2, 2017
    Date of Patent: August 6, 2019
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Kayoko Yanagisawa
  • Patent number: 10366699
    Abstract: This disclosure describes, in part, techniques for performing multi-path calculations for energy levels on an electronic device. For instance, the electronic device may include a first circuit and a second circuit, where the first circuit uses less power than the second circuit. As such, when operating in a standby mode, the electronic device may use the first circuit to calculate energy levels at the electronic device, such as speech-energy values and ambient-energy values. Additionally, while operating in an active mode, the electronic device may activate the second circuit and then use the second circuit to calculate the energy levels at the electronic device. The first circuit and the second circuit can send/receive current energy levels between one another so that the electronic device can continually calculate the energy levels even when the electronic device switches between modes of operation.
    Type: Grant
    Filed: August 31, 2017
    Date of Patent: July 30, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Bhupal Kanaiyalal Dharia, Dibyendu Nandy, Marko Bundalo, Hannan Ma
  • Patent number: 10354660
    Abstract: An endpoint device receives a sequence of audio frames. The endpoint device determines for each audio frame a respective importance level among possible importance levels ranging from a low importance level to a high importance level based on content in the audio frame indicative of the respective importance level. The endpoint device associates each audio frame with the respective importance level, to produce different subsets of audio frames associated with respective ones of different importance levels. The endpoint device, for each subset of audio frames, applies forward error correction to a fraction of audio frames in the subset of audio frames, wherein the fraction increases as the importance level of the audio frames in the subset increases, and does not apply forward error correction to remaining audio frames in the subset.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: July 16, 2019
    Assignee: Cisco Technology, Inc.
    Inventors: Ahmed Badr, Ashish J. Khisti, Wai-tian Tan, Michael A. Ramalho, John G. Apostolopoulos
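The importance-dependent forward error correction in the abstract of 10354660 can be sketched as a selection step: each subset of frames gets FEC applied to a fraction of its frames, and that fraction grows with the importance level. The function name, the level names, the fraction values, and the choice to protect the first frames of each subset are all hypothetical; the abstract does not say which frames within a subset are chosen.

```python
def select_frames_for_fec(subsets, fractions):
    """Pick the frames in each importance subset that receive FEC.

    `subsets` maps an importance level to a list of frame indices;
    `fractions` maps the same levels to the share of frames to protect.
    Here the first frames of each subset are chosen, as an assumption.
    """
    protected = {}
    for level, frames in subsets.items():
        count = int(len(frames) * fractions[level])
        protected[level] = frames[:count]
    return protected


subsets = {"low": [0, 3, 7, 9], "medium": [1, 4, 8, 10], "high": [2, 5, 6, 11]}
fractions = {"low": 0.0, "medium": 0.5, "high": 1.0}  # rises with importance
print(select_frames_for_fec(subsets, fractions))
```

Frames left out of the returned lists are sent without forward error correction, matching the "remaining audio frames" in the abstract.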
  • Patent number: 10347273
    Abstract: A speech processing apparatus includes: an expectation value calculation unit configured to calculate, using an input signal spectrum and a speech model that models a feature quantity of speech, a spectrum expectation value which is an expectation value of a spectrum of an acoustic component included in the input signal spectrum; and an acoustic power estimation unit configured to estimate an acoustic power of the acoustic component of the input signal spectrum based on the input signal spectrum and the spectrum expectation value.
    Type: Grant
    Filed: December 8, 2015
    Date of Patent: July 9, 2019
    Assignee: NEC CORPORATION
    Inventors: Shuji Komeiji, Masanori Tsujikawa, Ryosuke Isotani
  • Patent number: 10343287
    Abstract: A robot voice direction-seeking turning system and method. The robot voice direction-seeking turning system employs a voice activity detection unit (1) that detects a received voice signal to determine whether or not a voice signal transmitted by a user (S2) is present; a direction-seeking angle for the voice signal is calculated by a voice direction-seeking unit (3), and a voice direction-seeking turning unit (4) is employed to drive a robot to turn towards the direction of the sound source of the voice signal on the basis of a direction-seeking angle (S4). Employment of the robot voice direction-seeking turning method allows accurate acquisition of a valid voice signal transmitted by the user, thus increasing signal-to-noise ratio and the accuracy of voice recognition.
    Type: Grant
    Filed: June 14, 2016
    Date of Patent: July 9, 2019
    Assignee: YUTOU TECHNOLOGY (HANGZHOU) CO., LTD.
    Inventors: Xin Liu, Peng Gao, Lichun Fan, Jiaqi Shi, Peng Cai, Mingjun Cai