Detect Speech In Noise Patents (Class 704/233)
  • Patent number: 10210857
    Abstract: A method of controlling an audio system comprises: receiving an audio signal, and applying a first gain to the audio signal and outputting an amplified audio signal. On receiving a user input to increase the first gain applied to the audio signal, if the first gain is at a first threshold value, the method comprises: receiving an ambient noise signal, processing the ambient noise signal with a second gain value and outputting a noise cancellation signal, and changing the second gain value in response to the user input.
    Type: Grant
    Filed: October 18, 2017
    Date of Patent: February 19, 2019
    Assignee: Cirrus Logic, Inc.
    Inventors: Nigel Burgess, Mark Allan Watts, Darren Holding
  • Patent number: 10204619
    Abstract: Methods, systems, and apparatus are described that receive audio data for an utterance. Association data is accessed that indicates associations between data corresponding to uncorrupted audio segments, and data corresponding to corrupted versions of the uncorrupted audio segments, where the associations are determined before receiving the audio data for the utterance. Using the association data and the received audio data for the utterance, data corresponding to at least one uncorrupted audio segment is selected. A transcription of the utterance is determined based on the selected data corresponding to the at least one uncorrupted audio segment.
    Type: Grant
    Filed: February 22, 2016
    Date of Patent: February 12, 2019
    Assignee: Google LLC
    Inventors: Olivier Siohan, Pedro J. Moreno Mengibar
  • Patent number: 10192547
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for approximating an accent source. A system practicing the method collects data associated with customer specific services, generates country-specific or dialect-specific weights for each service in the customer specific services list, generates a summary weight based on an aggregation of the country-specific or dialect-specific weights, and sets an interactive voice response system language model based on the summary weight and the country-specific or dialect-specific weights. The interactive voice response system can also change the user interface based on the interactive voice response system language model. The interactive voice response system can tune a voice recognition algorithm based on the summary weight and the country-specific weights. The interactive voice response system can adjust phoneme matching in the language model based on a possibility that the speaker is using other languages.
    Type: Grant
    Filed: April 21, 2016
    Date of Patent: January 29, 2019
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventor: Nicholas Duffield
  • Patent number: 10194259
    Abstract: Various implementations include wearable audio devices and related methods for controlling such devices. In some particular implementations, a computer-implemented method of controlling a wearable audio device includes: receiving an initiation command to initiate a spatial audio mode; providing a plurality of audio samples corresponding with spatially delineated zones in an array defined relative to a physical position of the wearable audio device, in response to the initiation command, where each audio sample is associated with a source of audio content; receiving a selection command selecting one of the plurality of audio samples; and initiating playback of the source of audio content associated with the selected audio sample.
    Type: Grant
    Filed: February 28, 2018
    Date of Patent: January 29, 2019
    Assignee: BOSE CORPORATION
    Inventors: Keith Dana Martin, Todd Richard Reily, Mark Raymond Blewett, Daniel M. Gauger, Jr.
  • Patent number: 10186263
    Abstract: Speech recognition of a stream of spoken utterances is initiated. Thereafter, a spoken utterance stop event to stop the speech recognition is detected, such as in relation to the stream. The spoken utterance stop event is other than a pause or cessation in the stream of spoken utterances. In response to the spoken utterance stop event being detected, the speech recognition of the stream of spoken utterances is stopped, while the stream of spoken utterances continues. After the speech recognition of the stream of spoken utterances has been stopped, an action is caused to be performed that corresponds to the spoken utterances from a beginning of the stream through and until the spoken utterance stop event.
    Type: Grant
    Filed: August 30, 2016
    Date of Patent: January 22, 2019
    Assignee: Lenovo Enterprise Solutions (Singapore) PTE. LTD.
    Inventors: Amy Leigh Rose, John Scott Crowe, Gary David Cudak, Jennifer J. Lee-Baron, Nathan J. Peterson, Bryan L. Young
  • Patent number: 10182787
    Abstract: The present invention relates to systems and methods for characterizing at least one anatomical parameter of an upper airway of a patient by analysing spectral properties of an utterance, comprising: a mechanical coupler comprising means for restricting the jaw position of the patient; means for recording an utterance; and processing means for determining at least one anatomical parameter of the upper airway from the recorded utterance and comparing the recorded utterance to a threshold value. In addition the present invention relates to the use of the above mentioned systems as a diagnostics tool for assessing obstructive sleep apnea.
    Type: Grant
    Filed: October 11, 2012
    Date of Patent: January 22, 2019
    Assignee: KONINKLIJKE PHILIPS N.V.
    Inventors: Stijn De Waele, Stefan Winter, Alexander Cornelis Geerlings
  • Patent number: 10178113
    Abstract: Systems, methods, and media for generating sanitized data, sanitizing anomaly detection models, and generating anomaly detection models are provided. In some embodiments, methods for sanitizing anomaly detection models are provided. The methods include: receiving at least one abnormal anomaly detection model from at least one remote location; comparing at least one of the at least one abnormal anomaly detection model to a local normal detection model to produce a common set of features common to both the at least one abnormal anomaly detection model and the local normal detection model; and generating a sanitized normal anomaly detection model by removing the common set of features from the local normal detection model.
    Type: Grant
    Filed: July 13, 2015
    Date of Patent: January 8, 2019
    Assignee: The Trustees of Columbia University in the City of New York
    Inventors: Gabriela F. Ciocarlie, Angelos Stavrou, Salvatore J. Stolfo, Angelos D. Keromytis
  • Patent number: 10134395
    Abstract: Techniques for providing virtual assistants to assist users during a voice communication between the users. For instance, a first user operating a device may establish a voice communication with respective devices of one or more additional users, such as with a device of a second user. For instance, the first user may utilize her device to place a telephone call to the device of the second user. A virtual assistant may also join the call and, upon invocation by a user on the call, may identify voice commands from the call and may perform corresponding tasks for the users in response.
    Type: Grant
    Filed: September 25, 2013
    Date of Patent: November 20, 2018
    Assignee: Amazon Technologies, Inc.
    Inventor: Marcello Typrin
  • Patent number: 10123221
    Abstract: A communication device can be configured to estimate a Reference-Signal-Received-Power (RSRP). The communication device can include a transceiver and a controller. The transceiver can be configured to downsample a received signal having a plurality of reference signal resource elements to generate a downsampled signal. The downsampling can alias a first of the plurality of reference signal resource elements into a second of the plurality of reference signal resource elements to generate an aliased reference signal resource element. The transceiver can also be configured to extract the aliased reference signal resource element from the downsampled signal. The controller can be connected to the transceiver and be configured to estimate the RSRP based on the extracted aliased reference signal resource element.
    Type: Grant
    Filed: September 23, 2016
    Date of Patent: November 6, 2018
    Assignee: Intel IP Corporation
    Inventors: Matthew Hayes, Denis Markovic, Viswanath Vajepeyazula
  • Patent number: 10115019
    Abstract: A video may be categorized into a picture category or a video category. A key frame of the video includes a face and a face feature in the key frame is obtained. Face features respectively associated with a plurality of picture categories are acquired and the video is assigned to one of the picture categories based on a comparison of the key frame face feature and the face features of the picture categories. Videos may first be associated with a video category by comparing key frame face features from the videos, and then the video category may be assigned to a picture category based on comparison of a video category face feature with a plurality of picture category face features. Alternatively, a video may be assigned to a picture category based on matching capture times and capture locations between the video and a reference picture in the picture category.
    Type: Grant
    Filed: August 19, 2016
    Date of Patent: October 30, 2018
    Assignee: Xiaomi Inc.
    Inventors: Zhijun Chen, Wendi Hou, Fei Long
  • Patent number: 10109277
    Abstract: Methods and apparatus for using visual information to facilitate a speech recognition process. The method comprises dividing received audio information into a plurality of audio frames, determining for each of the plurality of audio frames, whether the audio information in the audio frame comprises speech from the foreground speaker, wherein the determining is based, at least in part, on received visual information, and transmitting the audio frame to an automatic speech recognition (ASR) engine for speech recognition when it is determined that the audio frame comprises speech from the foreground speaker.
    Type: Grant
    Filed: April 27, 2015
    Date of Patent: October 23, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Etienne Marcheret, Josef Vopicka, Vaibhava Goel
  • Patent number: 10102850
    Abstract: A speech recognition system utilizing automatic speech recognition techniques such as end-pointing techniques in conjunction with beamforming and/or signal processing to isolate speech from one or more speaking users from multiple received audio signals and to detect the beginning and/or end of the speech based at least in part on the isolation. Audio capture devices such as microphones may be arranged in a beamforming array to receive the multiple audio signals. Multiple audio sources including speech may be identified in different beams and processed.
    Type: Grant
    Filed: February 25, 2013
    Date of Patent: October 16, 2018
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Kenneth John Basye, Jeffrey Penrod Adams
  • Patent number: 10088893
    Abstract: According to some embodiments, a sensor network may be provided with re-programmable and/or reconfigurable analog circuitry configured to monitor data collected by the sensor network. The re-programmable and/or reconfigurable analog circuitry may also generate a wakeup signal in response to a defined wakeup event detected by the sensor network.
    Type: Grant
    Filed: May 12, 2017
    Date of Patent: October 2, 2018
    Inventors: Vinod Kulathumani, David W. Graham, Brandon David Rumberg
  • Patent number: 10090005
    Abstract: According to some embodiments, an analog processing portion may receive an audio signal from a microphone. The analog processing portion may then convert the audio signal into sub-band signals and estimate an energy statistic value, such as a Signal-to-Noise Ratio (“SNR”) value, for each sub-band signal. A classification element may classify the estimated energy statistic values with analog processing such that a wakeup signal is generated when voice activity is detected. The wakeup signal may be associated with, for example, a battery-powered, always-listening audio application.
    Type: Grant
    Filed: March 10, 2017
    Date of Patent: October 2, 2018
    Assignee: ASPINITY, INC.
    Inventors: Brandon David Rumberg, David W. Graham
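Illustration of the per-sub-band SNR classification described in the abstract above. This is a minimal digital sketch, not the patented analog circuit: it assumes the sub-band energies have already been produced by some upstream filter bank, and the class name, thresholds, and noise-floor smoothing rule are all hypothetical choices made for the example.

```python
import math


class SubbandSnrVad:
    """Toy voice-activity detector driven by per-sub-band SNR estimates."""

    def __init__(self, num_bands, snr_threshold_db=6.0, min_bands=2, noise_alpha=0.05):
        self.num_bands = num_bands
        self.snr_threshold_db = snr_threshold_db  # assumed per-band SNR threshold
        self.min_bands = min_bands                # bands that must exceed it
        self.noise_alpha = noise_alpha            # noise-floor smoothing factor
        self.noise = [None] * num_bands           # running noise-floor estimates

    def process(self, band_energies):
        """Return True (wake up) when enough sub-bands show a high SNR."""
        if len(band_energies) != self.num_bands:
            raise ValueError("expected one energy value per sub-band")
        active = 0
        for i, energy in enumerate(band_energies):
            energy = max(energy, 1e-12)  # guard against log(0)
            if self.noise[i] is None:
                self.noise[i] = energy   # first frame seeds the noise floor
            snr_db = 10.0 * math.log10(energy / self.noise[i])
            if snr_db >= self.snr_threshold_db:
                active += 1
            else:
                # Adapt the noise floor only while the band looks like noise.
                self.noise[i] += self.noise_alpha * (energy - self.noise[i])
        return active >= self.min_bands
```

Updating the noise floor only in bands classified as noise keeps speech energy from leaking into the floor estimate; a real detector would also need a way to recover if the background level rises permanently.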
  • Patent number: 10083687
    Abstract: Disclosed are systems, methods, and computer readable media for identifying an acoustic environment of a caller. The method embodiment comprises analyzing acoustic features of a received audio signal from a caller, receiving meta-data information based on a previously recorded time and speed of the caller, classifying a background environment of the caller based on the analyzed acoustic features and the meta-data, selecting an acoustic model matched to the classified background environment from a plurality of acoustic models, and performing speech recognition on the received audio signal using the selected acoustic model.
    Type: Grant
    Filed: October 16, 2017
    Date of Patent: September 25, 2018
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventor: Mazin Gilbert
  • Patent number: 10083710
    Abstract: A voice control system including a voice receiving unit, an image capturing unit, a storage unit and a control unit is disclosed. The voice receiving unit receives a voice. The image capturing unit captures a video image stream including several human face images. The storage unit stores the voice and the video image stream. The control unit is electrically connected to the voice receiving unit, the image capturing unit and the storage unit. The control unit detects a feature of a human face from the human face images, defines a mouth motion detection region from the feature of the human face, and generates a control signal according to a variation of the mouth motion detection region and a variation of the voice over time. A voice control method, a computer program product and a computer readable medium are also disclosed.
    Type: Grant
    Filed: May 17, 2016
    Date of Patent: September 25, 2018
    Assignee: BXB Electronics Co., Ltd.
    Inventors: Kai-Sheng Chiou, Chih-Lin Hung, Chung-Nan Lee, Chao-Wen Wu
  • Patent number: 10084475
    Abstract: An improved mixed oscillator-and-external excitation model and methods for estimating the model parameters, for evaluating model quality, and for combining it with known in the art methods are disclosed. The improvement over existing oscillators allows the model to receive, as an input, all except the most recent point in the acquired data. Model stability is achieved through a process which includes restoring unavailable to the decoder data from the optimal model parameters and by using metrics to select a stable restored model output. The present invention is effective for very low bit-rate coding/compression and decoding/decompression of digital signals, including digitized speech, audio, and image data, and for analysis, detection, and classification of signals. Operations can be performed in real time, and parameterization can be achieved at a user-specified level of compression.
    Type: Grant
    Filed: October 28, 2011
    Date of Patent: September 25, 2018
    Inventors: Irina Gorodnitsky, Anton Yen
  • Patent number: 10078690
    Abstract: A method is provided for triggering an action on a second device. It comprises the steps of obtaining audio of a multimedia content presented on a first device; comparing the obtained audio with reference audio data in a database; if the obtained audio matches reference audio in the database, determining an action corresponding to the matched reference audio; and triggering the action in the second device.
    Type: Grant
    Filed: December 31, 2011
    Date of Patent: September 18, 2018
    Assignee: Thomson Licensing DTV
    Inventors: Jianfeng Chen, Xiaojun Ma, Zhigang Zhang
  • Patent number: 10074360
    Abstract: This relates to providing an indication of the suitability of an acoustic environment for performing speech recognition. One process can include receiving an audio input and determining a speech recognition suitability based on the audio input. The speech recognition suitability can include a numerical, textual, graphical, or other representation of the suitability of an acoustic environment for performing speech recognition. The process can further include displaying a visual representation of the speech recognition suitability to indicate the likelihood that a spoken user input will be interpreted correctly. This allows a user to determine whether to proceed with the performance of a speech recognition process, or to move to a different location having a better acoustic environment before performing the speech recognition process.
    Type: Grant
    Filed: August 24, 2015
    Date of Patent: September 11, 2018
    Assignee: Apple Inc.
    Inventor: Yoon Kim
  • Patent number: 10074369
    Abstract: Systems, methods, and devices for escalating voice-based interactions via speech-controlled devices are described. Speech-controlled devices capture audio, including wakeword portions and payload portions, for sending to a server to relay messages between speech-controlled devices. In response to determining the occurrence of an escalation event, such as repeated messages between the same two devices, the system may automatically change a mode of a speech-controlled device, such as no longer requiring a wakeword, no longer requiring an indication of a desired recipient, or automatically connecting two speech-controlled devices in a voice-chat mode. In response to determining the occurrence of further escalation events, the system may initiate a real-time call between the speech-controlled devices.
    Type: Grant
    Filed: September 1, 2016
    Date of Patent: September 11, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Christo Frank Devaraj, Manish Kumar Dalmia, Tony Roy Hardie, Ran Mokady, Nick Ciubotariu, Sandra Lemon
  • Patent number: 10049653
    Abstract: A system including an automatic noise canceling (ANC) headphone and a processor. The ANC headphone has a microphone configured to generate a microphone signal and at least two non-zero ANC gain levels. The processor is configured to receive the microphone signal, determine a characteristic of the microphone signal, identify a revised ANC level from the ANC gain levels based on a comparison of the characteristic to at least one threshold, and output a signal corresponding to the revised ANC level. Methods are also disclosed.
    Type: Grant
    Filed: October 14, 2016
    Date of Patent: August 14, 2018
    Assignee: AVNERA CORPORATION
    Inventors: Amit Kumar, Eric Sorensen
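Illustration of the threshold comparison described in the abstract above. This is a rough sketch under stated assumptions: the "characteristic" of the microphone signal is taken to be its RMS level in dB relative to full scale, and the threshold values, gain levels, and function names are hypothetical, not taken from the patent.

```python
import math


def rms_db(samples):
    """RMS level of a block of samples in dB (1.0 is treated as full scale)."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, 1e-12))  # floor avoids log(0)


def select_anc_level(samples, thresholds_db=(-40.0, -20.0), levels=(0.25, 0.5, 1.0)):
    """Pick one of several non-zero ANC gain levels from the ambient level.

    thresholds_db must be ascending, and levels needs exactly one more
    entry than thresholds_db (one level per interval between thresholds).
    """
    if len(levels) != len(thresholds_db) + 1:
        raise ValueError("levels must have one more entry than thresholds_db")
    level_db = rms_db(samples)
    # Count how many thresholds the measured level reaches or exceeds.
    index = sum(1 for t in thresholds_db if level_db >= t)
    return levels[index]
```

For example, `select_anc_level([0.05] * 100)` measures roughly -26 dB, which lies between the two assumed thresholds, so the middle gain level is returned.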
  • Patent number: 10032454
    Abstract: Techniques disclosed herein include systems and methods for open-domain voice-enabled searching that is speaker sensitive. Techniques include using speech information, speaker information, and information associated with a spoken query to enhance open voice search results. This includes integrating a textual index with a voice index to support the entire search cycle. Given a voice query, the system can execute two matching processes simultaneously. This can include a text matching process based on the output of speech recognition, as well as a voice matching process based on characteristics of a caller or user voicing a query. Characteristics of the caller can include output of voice feature extraction and metadata about the call. The system clusters callers according to these characteristics. The system can use specific voice and text clusters to modify speech recognition results, as well as modifying search results.
    Type: Grant
    Filed: June 25, 2015
    Date of Patent: July 24, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Shilei Zhang, Shenghua Bao, Wen Liu, Yong Qin, Zhiwei Shuang, Jian Chen, Zhong Su, Qin Shi, William F. Ganong, III
  • Patent number: 10026399
    Abstract: Architectures and techniques for selecting a voice-enabled device to handle audio input that is detected by multiple voice-enabled devices are described herein. In some instances, multiple voice-enabled devices may detect audio input from a user at substantially the same time, due to the voice-enabled devices being located within proximity to the user. The architectures and techniques may analyze a variety of audio signal metric values for the voice-enabled devices to designate a voice-enabled device to handle the audio input.
    Type: Grant
    Filed: September 11, 2015
    Date of Patent: July 17, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Ramya Gopalan, Shiva Kumar Sundaram
  • Patent number: 10014002
    Abstract: Methods and systems for audio source separation in real-time are described. In an embodiment, the present disclosure describes reading and decoding an audio source into Pulse Code Modulation (PCM) samples, fragmenting the PCM samples into fragments, transforming fragments into spectrograms, performing audio source separation using a deep neural network (DNN) to generate an estimated magnitude spectrogram of the component(s) of the audio source, reconstructing the estimated time domain component signals, and streaming the component signals to a playback engine. In an embodiment, a semantic equalizer graphical user interface allows for real-time mixing of individual component signals.
    Type: Grant
    Filed: October 24, 2017
    Date of Patent: July 3, 2018
    Assignee: Red Pill VR, Inc.
    Inventors: Alejandro Koretzky, Karthiek Reddy Bokka, Naveen Sasalu Rajashekharappa
  • Patent number: 9990936
    Abstract: A method and an apparatus for separating speech data from background data in an audio communication are suggested. The method comprises: applying a speech model to the audio communication for separating the speech data from the background data of the audio communication; and updating the speech model as a function of the speech data and the background data during the audio communication.
    Type: Grant
    Filed: October 12, 2015
    Date of Patent: June 5, 2018
    Assignee: THOMSON Licensing
    Inventors: Alexey Ozerov, Quang Khanh Ngoc Duong, Louis Chevallier
  • Patent number: 9984683
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for automatic speech recognition using multi-dimensional models. In some implementations, audio data that describes an utterance is received. A transcription for the utterance is determined using an acoustic model that includes a neural network having first memory blocks for time information and second memory blocks for frequency information. The transcription for the utterance is provided as output of an automated speech recognizer.
    Type: Grant
    Filed: July 22, 2016
    Date of Patent: May 29, 2018
    Assignee: Google LLC
    Inventors: Bo Li, Tara N. Sainath
  • Patent number: 9978392
    Abstract: Traditionally known methods of classifying non-stationary physiological audio signals as noisy or clean involve human intervention and may depend on a particular type of classifier, with further analysis carried out only on the signals classified as clean. However, a major portion of non-stationary audio signals may end up being classified as noisy and hence rejected, which may lose critical intelligence that could have been derived from lightly noisy audio signals. The present disclosure enables automation of classification based on auto-thresholding and statistical isolation wherein noisy signals are further classified as highly noisy and lightly noisy through continuous dynamic learning.
    Type: Grant
    Filed: March 10, 2017
    Date of Patent: May 22, 2018
    Assignee: Tata Consultancy Services Limited
    Inventors: Arijit Ukil, Soma Bandyopadhyay, Chetanya Puri, Arpan Pal, Rituraj Singh, Ayan Mukherjee, Debayan Mukherjee
  • Patent number: 9978378
    Abstract: An apparatus for decoding an audio signal is provided, having a receiving interface, configured to receive a first frame having a first audio signal portion of the audio signal, and configured to receive a second frame having a second audio signal portion of the audio signal; a noise level tracing unit, wherein the noise level tracing unit is configured to determine noise level information depending on at least one of the first audio signal portion and the second audio signal portion; a first reconstruction unit for reconstructing, in a first reconstruction domain, a third audio signal portion of the audio signal depending on the noise level information; a transform unit for transforming the noise level information to a second reconstruction domain; and a second reconstruction unit for reconstructing, in the second reconstruction domain, a fourth audio signal portion of the audio signal depending on the noise level information.
    Type: Grant
    Filed: December 21, 2015
    Date of Patent: May 22, 2018
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Michael Schnabel, Goran Markovic, Ralph Sperschneider, Jérémie Lecomte, Christian Helmrich
  • Patent number: 9978373
    Abstract: A method of accessing a dial-up service is disclosed. An example method of providing access to a service includes receiving a first speech signal from a user to form a first utterance; recognizing the first utterance using speaker independent speaker recognition; requesting the user to enter a personal identification number; and when the personal identification number is valid, receiving a second speech signal to form a second utterance and providing access to the service.
    Type: Grant
    Filed: June 17, 2016
    Date of Patent: May 22, 2018
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventor: Robert Wesley Bossemeyer, Jr.
  • Patent number: 9967665
    Abstract: An analog signal path portion of a signal path may have: (i) an audio input for receiving an analog signal, an audio output for providing an output signal, and a selectable analog gain, and may be configured to generate the output signal based on the analog signal and in conformity with the selectable analog gain; and (ii) a digital path portion having a selectable digital gain and configured to receive a digital input signal and convert the digital input signal into the analog signal in conformity with the selectable digital gain. A control circuit may be configured to modify the digital and analog gains in response to an indication to switch between gain modes of the signal path, determine a noise floor of an audio signal comprising the digital input signal or a signal derived therefrom, and control modification of the digital and analog gains based on the noise floor.
    Type: Grant
    Filed: October 5, 2016
    Date of Patent: May 8, 2018
    Assignee: Cirrus Logic, Inc.
    Inventors: Tejasvi Das, Ku He, John L. Melanson
  • Patent number: 9966059
    Abstract: An acoustic interference cancellation system that performs beamforming using a subset of microphones from a microphone array. For example, a first group of microphones from an array can be used to generate target signals that focus on the direction of the desired speech in the audio and a second group of microphones from the array can be used to generate reference signals that include the environmental noise, audio from a loudspeaker, etc. The reference signals of the second group of microphones can then be used to isolate the actual speech from the target signals of the first group of microphones. The microphone array can be three dimensional, allowing a device to simplify beamforming calculations by selecting subsets of microphones along different planes. In addition, directional microphones and remote microphones may be used to improve a quality of the reference signals.
    Type: Grant
    Filed: September 6, 2017
    Date of Patent: May 8, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Robert Ayrapetian, Philip Ryan Hilmes, Carlo Murgia
  • Patent number: 9966085
    Abstract: A noise suppression circuit for use in an audio signal processing circuit is provided. The noise suppression circuit includes a plurality of different types of noise activity detectors, which are each adapted for detecting the presence of a different type of noise in a received signal. The noise suppression circuit further includes a plurality of different types of noise reduction circuits, which are each adapted for removing a different type of detected noise, where each noise reduction circuit respectively corresponds to one of the plurality of noise activity detectors. The respective noise reduction circuit is then selectively activated to condition the received signal to reduce the amount of the detected types of noise, when each one of the plurality of noise activity detectors detects the presence of a corresponding type of noise in the received signal.
    Type: Grant
    Filed: August 21, 2007
    Date of Patent: May 8, 2018
    Assignee: GOOGLE TECHNOLOGY HOLDINGS LLC
    Inventors: Jianming J. Song, Joel A. Clark
  • Patent number: 9961441
    Abstract: Methods and systems are provided for enhancing listening intelligibility in electronic devices. A vibration sensor may be used to generate feedback corresponding to vibrations caused by the outputting of the acoustic signals, and the feedback may be used in adjusting the listening intelligibility stage. In some instances, a microphone may be used to obtain audio input corresponding to ambient noise affecting intelligibility of audio outputted, as acoustic signals, via a speaker, to a user. The audio input may be used to control a listening intelligibility stage applied to audio content when the acoustic signals are generated for outputting by the speaker. In particular, the listening intelligibility stage may comprise application of dynamic time-scale modifications.
    Type: Grant
    Filed: June 25, 2014
    Date of Patent: May 1, 2018
    Assignee: DSP Group Ltd.
    Inventor: Yaakov Chen
  • Patent number: 9959872
    Abstract: Aspects relate to computer implemented methods, systems, and processes to automatically generate audio-based display indicia of media content including receiving, by a processor, a plurality of media content categories including at least one feature, receiving a plurality of categorized speech recognition algorithms, each speech recognition algorithm being associated with a respective one or more of the plurality of media content categories, determining a media content category of a current media content based on at least one feature of the current media content, selecting one speech recognition algorithm from the plurality of categorized speech recognition algorithms based on the determination of the media content category of the current media content, and applying the selected speech recognition algorithm to the current media content.
    Type: Grant
    Filed: December 14, 2015
    Date of Patent: May 1, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Priscilla Barreira Avegliano, Carlos Henrique Cardonha, Stefany Mazon, Julio Nogima
  • Patent number: 9961517
    Abstract: A communication system includes a headset including a microphone and an audio speaker installed on the headset, the headset including an RF transceiver configured to perform wireless communication with a two-way radio having a push-to-talk communication channel. The RF transceiver is further configured to perform wireless communication with an information handling system. The headset includes processing electronics configured to process an input signal from the microphone and output a first processed signal to the RF transceiver and to process an input signal from the RF transceiver and output a second processed signal to the speaker. A remote control unit is configured to perform wireless communication with the headset, the remote control unit including a remote control unit interface disposed thereon for selectively configuring the headset to function as an audio interface for the push-to-talk communication channel and an audio interface for the information handling system.
    Type: Grant
    Filed: May 2, 2017
    Date of Patent: May 1, 2018
    Assignee: Wilcox Industries Corp.
    Inventors: James W. Teetzel, Travis S. Mitchell
  • Patent number: 9953661
    Abstract: A “running range normalization” method includes computing running estimates of the range of values of features useful for voice activity detection (VAD) and normalizing the features by mapping them to a desired range. Running range normalization includes computation of running estimates of the minimum and maximum values of VAD features and normalizing the feature values by mapping the original range to a desired range. Smoothing coefficients are optionally selected to directionally bias a rate of change of at least one of the running estimates of the minimum and maximum values. The normalized VAD feature parameters are used to train a machine learning algorithm to detect voice activity, and the trained machine learning algorithm is used to isolate or enhance the speech component of the audio data.
    Type: Grant
    Filed: September 25, 2015
    Date of Patent: April 24, 2018
    Assignee: CIRRUS LOGIC INC.
    Inventor: Earl Vickers
  • Patent number: 9940949
    Abstract: In a speech-based system, a wake word or other trigger expression is used to preface user speech that is intended as a command. The system receives multiple directional audio signals, each of which emphasizes sound from a different direction. The trigger expression is detected in an individual directional audio signal by comparing a confidence score with a confidence threshold. An individual confidence threshold is specified for each directional audio signal. The confidence thresholds are adjusted during operation of the system based on performance information that is generated during operation of the system. As an example, performance information may include the number of times that the trigger expression has been detected in each of the directional audio signals.
    Type: Grant
    Filed: December 19, 2014
    Date of Patent: April 10, 2018
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Shiv Naga Prasad Vitaladevuni, Philip Ryan Hilmes
  • Patent number: 9940936
    Abstract: According to some aspects, a method of monitoring an acoustic environment of a mobile device, at least one computer readable medium encoded with instructions that, when executed, perform such a method and/or a mobile device configured to perform such a method is provided. The method comprises receiving, by the mobile device, acoustic input from the environment of the mobile device, detecting whether the acoustic input includes a voice command from a user without requiring receipt of an explicit trigger from the user, and initiating responding to the detected voice command.
    Type: Grant
    Filed: July 30, 2015
    Date of Patent: April 10, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Vladimir Sejnoha, Paul Adrian Van Mulbregt, Glen Edward Wilson, William F. Ganong, III
  • Patent number: 9928848
    Abstract: An audio signal processing system removes at least a portion of a noise component from a number of audio input signals generated by a number of closely proximate agents within an input signal source location. The availability of each audio input signal and the geographically proximate location of each of the agents creating an audio input signal facilitates the real-time or near real-time reduction in ambient noise level in each of the audio input signals using a Blind Sound Source Separation (BSSS) technique.
    Type: Grant
    Filed: December 24, 2015
    Date of Patent: March 27, 2018
    Assignee: INTEL CORPORATION
    Inventors: Niall Cahill, Jakub Wenus, Mark Y. Kelly, Michael Nolan
  • Patent number: 9916833
    Abstract: An apparatus for decoding an audio signal includes a receiving interface, wherein the receiving interface is configured to receive a first frame and a second frame. Moreover, the apparatus includes a noise level tracing unit for determining noise level information being represented in a tracing domain. Furthermore, the apparatus includes a first reconstruction unit for reconstructing a third audio signal portion of the audio signal depending on the noise level information and a second reconstruction unit for reconstructing a fourth audio signal portion depending on noise level information being represented in the second reconstruction domain.
    Type: Grant
    Filed: December 18, 2015
    Date of Patent: March 13, 2018
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Michael Schnabel, Goran Markovic, Ralph Sperschneider, Jeremie Lecomte, Christian Helmrich
  • Patent number: 9916846
    Abstract: A system and method for determining an amount of speech in an audio signal may include for example: obtaining segments of the audio signal, wherein the segments are grouped into blocks; for each one of the segments, calculating a segment value indicative of an amplitude of the audio signal of a respective segment; for each one of the blocks calculating a block value indicative of the amplitude of the audio signal of a respective block; and calculating an audio signal speech grade based on segment values and block values, wherein the audio signal speech grade is indicative of the amount of speech in the audio signal.
    Type: Grant
    Filed: February 10, 2015
    Date of Patent: March 13, 2018
    Assignee: NICE LTD.
    Inventors: Frits Lassche, Ivar Meijer, Victor Bastiaan Mosch, Steven St. John Logan, Jurgen Willem Wessel, Gerardus B. J. Stam
  • Patent number: 9911416
    Abstract: A method for controlling an electronic device in response to speech spoken by a user is disclosed. The method may include receiving an input sound by a sound sensor. The method may also detect the speech spoken by the user in the input sound, determine first characteristics of a first frequency range and second characteristics of a second frequency range of the speech in response to detecting the speech in the input sound, and determine whether a direction of departure of the speech spoken by the user is toward the electronic device based on the first and second characteristics.
    Type: Grant
    Filed: March 27, 2015
    Date of Patent: March 6, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Sungrack Yun, Taesu Kim, Duck Hoon Kim, Kyuwoong Hwang
  • Patent number: 9911430
    Abstract: A system for providing an acoustic environment recognizer for optimal speech processing is disclosed. In particular, the system may utilize metadata obtained from various acoustic environments to assist in suppressing ambient noise interfering with a desired audio signal. In order to do so, the system may receive an audio stream including an audio signal associated with a user and including ambient noise obtained from an acoustic environment of the user. The system may obtain first metadata associated with the ambient noise, and may determine if the first metadata corresponds to second metadata in a profile for the acoustic environment. If the first metadata corresponds to the second metadata, the system may select a processing scheme for suppressing the ambient noise from the audio stream, and process the audio stream using the processing scheme. Once the audio stream is processed, the system may provide the audio stream to a destination.
    Type: Grant
    Filed: November 28, 2016
    Date of Patent: March 6, 2018
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Horst J. Schroeter, Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji
  • Patent number: 9908153
    Abstract: The present invention is related to methods and systems for collecting item information for stored items. In one embodiment, a networked food storage system comprises a first sensor configured to read information from item tags coupled to items, wherein the items are stored or intended to be stored in a storage unit. A data store is configured to store food preferences for at least a first user. Instructions, stored in computer readable memory, are configured to: cause a first user interface to be displayed to the first user via which the first user can request a meal suggestion; retrieve preference information for the first user from computer readable memory; retrieve information read from at least a first item tag; and provide a meal suggestion based at least in part on preference information for the first user and item tag information.
    Type: Grant
    Filed: May 24, 2017
    Date of Patent: March 6, 2018
    Assignee: IKAN HOLDINGS LLC
    Inventors: Fabio Zsigmond, Sion Elie Douer, Geraldo Yoshizawa, Frederico Wagner
  • Patent number: 9899028
    Abstract: An information processing device includes a first information processing unit, a communication unit, and a control unit. The first information processing unit performs predetermined information processing on input data to generate first processing result data. The communication unit is capable of receiving second processing result data generated by a second information processing unit capable of executing the same kind of information processing as the information processing on the input data under a condition with higher versatility. The control unit selects either the first processing result data or the second processing result data according to the use environment of the device.
    Type: Grant
    Filed: August 14, 2015
    Date of Patent: February 20, 2018
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Kazuhiro Nakadai, Takeshi Mizumoto, Keisuke Nakamura, Masayuki Takigahira
  • Patent number: 9898847
    Abstract: The present disclosure provides a multimedia picture generating method, device and electronic device, wherein the multimedia picture generating method comprises acquiring a picture of a photographed subject of a photographing device; extracting a figure image as a foreground image from the picture after receiving an instruction for removing picture background; performing voice recognition after receiving a voice command inputted by a user; searching out multimedia content that matches a user command information recognized by voice recognition from a multimedia database as background content for the picture; and generating a multimedia picture that contains the foreground image and the background content.
    Type: Grant
    Filed: August 19, 2016
    Date of Patent: February 20, 2018
    Assignee: Shanghai Sunson Activated Carbon Technology Co., Ltd.
    Inventor: Zhenyu Wang
  • Patent number: 9892731
    Abstract: The present invention relates to implementing a system and method to improve speech recognition and speech enhancement of noisy speech. The present invention discloses a way to improve the noise robustness of a speech recognition system by providing additional input to a Neural Network speech classifier. The additional information characterizes the noise environment of the speech. The present invention further discloses a speech separation system that uses the output of the neural network. The speech separation system employs models for the speech and for the distractor or noise. The neural network is used to identify the most likely combinations of speech and noise. Furthermore, a system for efficiently finding the most likely clean speech log-spectrum value is disclosed.
    Type: Grant
    Filed: September 28, 2016
    Date of Patent: February 13, 2018
    Inventor: Trausti Thor Kristjansson
  • Patent number: 9870765
    Abstract: Methods and a system are provided for estimating automatic speech recognition (ASR) accuracy. A method includes obtaining transcriptions of utterances in a conversation over two channels. The method further includes sorting the transcriptions along a time axis using a forced alignment. The method also includes training a language model with the sorted transcriptions. The method additionally includes performing ASR for utterances in a conversation between a first user and a second user. The second user is a target of ASR accuracy estimation. The method further includes determining whether an ASR result of the second user is consistent or inconsistent with an ASR result of the first user using the trained language model. The method also includes estimating the ASR result of the second user as poor responsive to the ASR result of the second user being inconsistent with the ASR result of the first user.
    Type: Grant
    Filed: June 3, 2016
    Date of Patent: January 16, 2018
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Masayuki A. Suzuki
  • Patent number: 9870772
    Abstract: A guiding device, a guiding method, a program, and an information storage medium are provided which can perform output control of a guidance related to a volume at which to input voice using the recognition ranking of a received voice. A voice receiving section (46) receives a voice. When given information is identified as a result of recognition of the voice, an output control section (58) performs control so as to output a guidance related to a volume at which to input voice in a mode corresponding to the recognition ranking of the information.
    Type: Grant
    Filed: May 1, 2015
    Date of Patent: January 16, 2018
    Assignee: Sony Interactive Entertainment Inc.
    Inventor: Kotaro Imamura
  • Patent number: 9858949
    Abstract: An acoustic processing apparatus includes a sound source localization unit configured to estimate a direction of a sound source from an acoustic signal of a plurality of channels, a sound source separation unit configured to perform separation into a sound-source-specific acoustic signal representing a component of the sound source from the acoustic signal of the plurality of channels, and a sound source identification unit configured to determine a type of sound source on the basis of the direction of the sound source estimated by the sound source localization unit using model data representing a relationship between the direction of the sound source and the type of sound source, for the sound-source-specific acoustic signal.
    Type: Grant
    Filed: July 18, 2016
    Date of Patent: January 2, 2018
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Kazuhiro Nakadai, Ryosuke Kojima
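
Illustrative note on patent 9953661 (running range normalization), listed above: the abstract describes keeping running estimates of a feature's minimum and maximum, biasing how fast each estimate moves depending on direction, and mapping the observed range to a desired range. The Python sketch below is one possible reading of that description, not the patented implementation. The class name, the exponential-smoothing update, and the parameters `fast`, `slow`, `out_min`, and `out_max` are all assumptions introduced here for illustration.

```python
class RunningRangeNormalizer:
    """Hypothetical sketch: map a streaming feature onto [out_min, out_max]."""

    def __init__(self, fast=0.5, slow=0.01, out_min=0.0, out_max=1.0):
        # Assumed smoothing coefficients: `fast` applies when a sample pushes
        # an estimate outward, `slow` when the estimate drifts back inward.
        # Keeping fast >= slow ensures the minimum never overtakes the maximum.
        if not 0.0 <= slow <= fast <= 1.0:
            raise ValueError("expected 0 <= slow <= fast <= 1")
        self.fast = fast
        self.slow = slow
        self.out_min = out_min
        self.out_max = out_max
        self.run_min = None
        self.run_max = None

    def update(self, x):
        """Update the running min/max with sample x and return x normalized."""
        x = float(x)
        if self.run_min is None:
            # First sample initializes both estimates.
            self.run_min = self.run_max = x
        else:
            # Directional bias: track new extremes quickly, forget them slowly.
            a_min = self.fast if x < self.run_min else self.slow
            self.run_min += a_min * (x - self.run_min)
            a_max = self.fast if x > self.run_max else self.slow
            self.run_max += a_max * (x - self.run_max)

        span = self.run_max - self.run_min
        if span <= 0.0:
            # Range not yet established: return the midpoint of the target range.
            return 0.5 * (self.out_min + self.out_max)

        # Linear map from the estimated range to the target range, clamped
        # because a sample can fall outside the smoothed estimates.
        unit = min(1.0, max(0.0, (x - self.run_min) / span))
        return self.out_min + unit * (self.out_max - self.out_min)
```

In this sketch, a frame-level feature such as log energy would be passed to `update` once per frame, and the returned value would serve as the normalized input to a downstream voice activity classifier. Setting `fast=1.0` and `slow=0.0` reduces the estimates to an exact running minimum and maximum.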