Detect Speech In Noise Patents (Class 704/233)
-
Patent number: 11831812
Abstract: This disclosure describes a conferencing device with beamforming and echo cancellation that includes: a microphone array that further comprises a plurality of microphones oriented to develop a corresponding plurality of microphone signals; a processor configured to execute the following steps: (1) performing a beamforming operation to combine the plurality of microphone signals from the microphone array into a plurality of combined signals, (2) performing an acoustic echo cancellation operation on the plurality of combined signals to generate a plurality of combined echo cancelled signals, (3) receiving with a voice activity detector the far end signal as an input, (4) selecting one or more of the combined echo cancelled signals for transmission to the far end where a signal selector uses the far end signal as information to inhibit the signal selector from changing the selection of the combined echo cancelled signals while only the far end signal is active.
Type: Grant
Filed: November 22, 2022
Date of Patent: November 28, 2023
Assignee: ClearOne, Inc.
Inventors: Ashutosh Pandey, Darrin T. Thurston, David K. Lambert, Tracy A. Bathurst
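The selection hold described in step (4) of the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the highest-energy rule, the `near_end_active` flag, and the name `select_beam` are all hypothetical.

```python
def select_beam(beam_energies, current, far_end_active, near_end_active):
    """Return the index of the echo-cancelled beam to transmit.

    Assumed rule (not from the patent text): normally pick the beam with
    the highest energy, but keep the current selection while only the
    far end is active.
    """
    if far_end_active and not near_end_active:
        # Inhibit switching: any energy seen now is likely residual echo.
        return current
    # Otherwise choose the strongest beam.
    return max(range(len(beam_energies)), key=beam_energies.__getitem__)
```

Holding the selection during far-end-only activity keeps residual echo from steering the selector toward the loudspeaker direction.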
-
Patent number: 11804241
Abstract: An electronic apparatus and a controlling method thereof are provided. The controlling method includes, based on an audio signal being received through a microphone, determining whether a user is on a public transport; detecting whether the audio signal includes a voice signal output through an acoustic device of the public transport; determining whether the voice signal from the acoustic device includes a voice signal for guiding at least one stop from among a plurality of stops; and outputting information on the at least one stop.
Type: Grant
Filed: January 18, 2022
Date of Patent: October 31, 2023
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Jubum Han, Changwoo Han
-
Patent number: 11798574
Abstract: A speech separation device (12) of a speech separation system includes a feature amount extraction unit (121) configured to extract time-series data of a speech feature amount of mixed speech, a block division unit (122) configured to divide the time-series data of the speech feature amount into blocks having a certain time width, a speech separation neural network (1b) configured to create time-series data of a mask of each of a plurality of speakers from the time-series data of the speech feature amount divided into blocks, and a speech restoration unit (123) configured to restore the speech data of each of the plurality of speakers from the time-series data of the mask and the time-series data of the speech feature amount of the mixed speech.
Type: Grant
Filed: January 12, 2021
Date of Patent: October 24, 2023
Assignees: MITSUBISHI ELECTRIC CORPORATION, MITSUBISHI ELECTRIC RESEARCH LABORATORIES, INC.
Inventors: Ryo Aihara, Toshiyuki Hanazawa, Yohei Okato, Gordon P Wichern, Jonathan Le Roux
-
Patent number: 11783352
Abstract: A method, system, and device for audio-based identification interfaces for selecting objects from video generates and stores frequency-based audio identifiers associated with segments of an audio stream that is integrated with a video stream. The generation of the frequency-based audio identifiers may be performed by a hashing function applied to audio frequencies within audio segments. The video stream comprises identified objects that may be identified by application of a trained neural network. An audio segment is received from a user and a corresponding frequency-based audio identifier is generated and matched against stored frequency-based audio identifiers. The matching determines an audio segment and a temporally corresponding identified object, which is then embodied within an interactive user interface.
Type: Grant
Filed: January 20, 2023
Date of Patent: October 10, 2023
Assignee: Revealit Corporation
Inventors: Garry Anthony Smith, Zachary Oakes, Steven Dennis Flinn
-
Patent number: 11763805
Abstract: A speaker recognition method and apparatus receives a first voice signal of a speaker, generates a second voice signal by enhancing the first voice signal through speech enhancement, generates a multi-channel voice signal by associating the first voice signal with the second voice signal, and recognizes the speaker based on the multi-channel voice signal.
Type: Grant
Filed: May 27, 2022
Date of Patent: September 19, 2023
Assignee: Samsung Electronics Co., Ltd.
Inventors: Sung-Jae Cho, Kyuhong Kim, Jaejoon Han
-
Patent number: 11741982
Abstract: An audio processing system includes a microphone array, a speech detection system, and a neural network noise reduction module. The microphone array includes at least two microphones and provides an audio signal from an environment surrounding the microphone array. The speech detection system receives the audio signal, and processes the audio signal to a) detect that a first user is speaking, b) determine a first direction relative to the audio array when the first user is located at a first location within the environment, and c) provide beamforming processing on the audio signal in the first direction, and to provide a processed audio signal based upon the beamforming processing. The neural network noise reduction module reduces noise in the processed audio signal.
Type: Grant
Filed: October 5, 2021
Date of Patent: August 29, 2023
Assignee: Dell Products L.P.
Inventors: Cola Hung Shih, Vivek Viswanathan Iyer
-
Patent number: 11699456
Abstract: Systems and methods are described for generating a transcript of a legal proceeding or other multi-speaker conversation or performance in real time or near-real time using multi-channel audio capture. Different speakers or participants in a conversation may each be assigned a separate microphone that is placed in proximity to the given speaker, where each audio channel includes audio captured by a different microphone. Filters may be applied to isolate each channel to include speech utterances of a different speaker, and these filtered channels of audio data may then be processed in parallel to generate speech-to-text results that are interleaved to form a generated transcript.
Type: Grant
Filed: February 12, 2021
Date of Patent: July 11, 2023
Assignee: Veritext, LLC
Inventors: Anthony Donofrio, David Joseph DaSilva, James Andrew Maraska, Jr., Jonathan Mordecai Kaplan
-
Patent number: 11659325
Abstract: A voice processing method, an electronic device and a readable storage medium, which relate to the field of voice processing technologies, are disclosed. The method includes: collecting a first audio signal; processing the first audio signal using a preset algorithm to obtain a second audio signal; and sending the second audio signal to a first device, such that the first device performs a voice processing operation on the second audio signal.
Type: Grant
Filed: December 23, 2021
Date of Patent: May 23, 2023
Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Jingran Li, Liufeng Wang
-
Patent number: 11621015
Abstract: A training speech data generating apparatus includes: a voice conversion unit that converts, using fourth noise data, which is noise data based on third noise data, and speech data, the speech data so as to make the speech data clearly audible under a noise environment corresponding to the fourth noise data; and a noise superimposition unit that obtains training speech data by superimposing the third noise data and the converted speech data.
Type: Grant
Filed: March 11, 2019
Date of Patent: April 4, 2023
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Takaaki Fukutomi, Manabu Okamoto, Takashi Nakamura, Kiyoaki Matsui
-
Patent number: 11620989
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes generating, by a speech recognition system, a matrix from a predetermined quantity of vectors that each represent input for a layer of a neural network, generating a plurality of sub-matrices from the matrix, using, for each of the sub-matrices, the respective sub-matrix as input to a node in the layer of the neural network to determine whether an utterance encoded in an audio signal comprises a keyword for which the neural network is trained.
Type: Grant
Filed: June 26, 2019
Date of Patent: April 4, 2023
Assignee: Google LLC
Inventors: Ignacio Lopez Moreno, Yu-hsin Joyce Chen
-
Patent number: 11551651
Abstract: Systems, devices, and methods for segmenting musical compositions are described. Discrete, musically-coherent segments (such as intro, verse, chorus, bridge, solo, and the like) of a musical composition are identified. Distance measures are used to evaluate whether each bar of a musical composition is more like the bars that directly precede it or more like the bars that directly succeed it, and each respective series of musically similar bars is assigned to the same respective segment. Large changes in the distance measure(s) between adjacent bars may be used to identify boundaries between abutting musical segments. Computer systems and computer program products for implementing segmentation are also described. The results of segmentation may advantageously be applied in computer-based composition of music and musical variations, as well as in other applications involving labelling, characterizing, or otherwise processing music.
Type: Grant
Filed: May 30, 2021
Date of Patent: January 10, 2023
Assignee: Obeebo Labs Ltd.
Inventor: Colin P. Williams
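The boundary rule in the abstract above (a large distance between adjacent bars marks a segment boundary) can be illustrated with a simplified sketch. It assumes each bar is already summarized as a numeric feature vector and uses Euclidean distance with a fixed threshold; the feature representation, the threshold, and the name `segment_bars` are assumptions, not details from the patent.

```python
import math


def segment_bars(bars, threshold):
    """Group consecutive bar indices into segments.

    bars: list of equal-length numeric feature vectors, one per bar
          (hypothetical representation).
    threshold: adjacent-bar distance above which a new segment starts.
    """
    segments = [[0]] if bars else []
    for i in range(1, len(bars)):
        if math.dist(bars[i - 1], bars[i]) > threshold:
            segments.append([i])    # large jump: start a new segment
        else:
            segments[-1].append(i)  # similar to previous bar: same segment
    return segments
```

For example, `segment_bars([(0, 0), (0.1, 0), (5, 5), (5.1, 5)], threshold=1.0)` groups the first two bars together and the last two together.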
-
Patent number: 11540038
Abstract: The present disclosure may provide an acoustic device. The acoustic device may include a housing, at least one low-frequency acoustic driver, at least one high-frequency acoustic driver, and a noise reduction assembly. The housing may be configured to be rested on a shoulder of a user. The at least one low-frequency acoustic driver may be carried by the housing and configured to output first sound from at least two first sound guiding holes. The at least one high-frequency acoustic driver may be carried by the housing and configured to output second sound from at least two second sound guiding holes. The noise reduction assembly may be configured to receive third sound and reduce noise of the third sound.
Type: Grant
Filed: February 7, 2021
Date of Patent: December 27, 2022
Assignee: SHENZHEN SHOKZ CO., LTD.
Inventors: Lei Zhang, Junjiang Fu, Bingyan Yan, Fengyun Liao, Xin Qi
-
Patent number: 11532302
Abstract: Methods and devices for conducting, based on a clock difference, a synchronization process on voice information collected by a plurality of voice collection devices. Then, after the synchronization process is performed on the voice information collected by the plurality of voice collection devices, conducting a voice separation and recognition process on voice information that was collected by the plurality of voice collection devices and synchronized based on the clock difference among the plurality of voice collection devices.
Type: Grant
Filed: September 28, 2017
Date of Patent: December 20, 2022
Assignee: Harman International Industries, Incorporated
Inventors: Xiangru Bi, Guoxia Zhang
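One simple way to picture synchronization based on a clock difference is to convert each device's local start time to a shared reference clock and trim the recordings to their common span. The sketch below assumes a known, constant offset per device and ignores clock drift; the tuple layout and the name `synchronize` are hypothetical, and the patent may use a different procedure.

```python
def synchronize(recordings, sample_rate):
    """Trim recordings so that index 0 is the same instant in every stream.

    recordings: list of (samples, local_start_time, clock_offset) tuples,
        where clock_offset is the device clock minus the reference clock,
        in seconds (assumed known and constant).
    sample_rate: samples per second, assumed identical for all devices.
    """
    # Start time of each recording expressed on the reference clock.
    starts = [start - offset for _, start, offset in recordings]
    latest = max(starts)
    aligned = []
    for (samples, _, _), true_start in zip(recordings, starts):
        # Drop the samples captured before the latest-starting device began.
        skip = round((latest - true_start) * sample_rate)
        aligned.append(samples[skip:])
    # Cut every stream to the shortest remaining length.
    length = min(len(s) for s in aligned)
    return [s[:length] for s in aligned]
```

The aligned streams could then be passed to a separation or recognition stage that expects sample-synchronous input.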
-
Patent number: 11508387
Abstract: Selecting audio noise reduction models for noise suppression in an information handling system (IHS), including performing calibration and configuration of an audio noise reduction selection model, including: identifying contextual data associated with contextual inputs to the IHS; training, based on the contextual data, the audio noise reduction selection model, including generating a configuration policy including configuration rules, the configuration rules for performing actions for selection of a combination of audio noise reduction models to reduce combinations of noise sources associated with the IHS; performing steady-state monitoring of the IHS, including: monitoring the contextual inputs of the IHS, and in response, accessing the audio noise reduction selection model, identifying configuration rules based on the monitored contextual inputs, applying the configuration rules to select a particular combination of audio noise reduction models, applying particular combination of audio noise reduction mod
Type: Grant
Filed: August 18, 2020
Date of Patent: November 22, 2022
Assignee: Dell Products L.P.
Inventors: Vivek Viswanathan Iyer, Michael S. Gatson
-
Patent number: 11508376
Abstract: The activities of multiple virtual personal assistant (VPA) applications are coordinated. For example, different portions of a conversational natural language dialog involving a user and a computing device may be handled by different VPAs.
Type: Grant
Filed: December 27, 2018
Date of Patent: November 22, 2022
Assignee: SRI International
Inventors: Kenneth C. Nitz, Patrick D. Lincoln
-
Patent number: 11508349
Abstract: A noise reduction method and apparatus for an on-board environment, an electronic device and a storage medium are provided, which are applicable to a field of computer technology, and particularly to a field of audio processing. The noise reduction method for an on-board environment includes: receiving an interference signal in the on-board environment and receiving a sound signal in the on-board environment, the interference signal comprising a vibration signal of a vehicle; and performing noise reduction processing on the sound signal in the on-board environment to obtain a noise-reduced signal; wherein, the noise reduction processing comprises cancelling the interference signal from the sound signal in the on-board environment.
Type: Grant
Filed: March 19, 2021
Date of Patent: November 22, 2022
Assignee: Beijing Baidu Netcom Science and Technology Co., LTD
Inventors: Zaidong Zhang, Zhanxue Li, Tingting Che
-
Patent number: 11495033
Abstract: Embodiments of the present disclosure provide a method and an apparatus for controlling an unmanned vehicle, an electronic device and a computer readable storage medium. In this method, the computer device of the unmanned vehicle determines occurrence of an event associated with physical discomfort of a passenger in the vehicle. The computer device also determines a severity degree of the physical discomfort of the passenger. The computer device further controls a driving action of the vehicle based on the determined severity degree.
Type: Grant
Filed: October 30, 2019
Date of Patent: November 8, 2022
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventor: Ya Wang
-
Patent number: 11495210
Abstract: A method and system for detecting one or more speech features in speech audio data includes receiving speech audio data, performing preprocessing on the speech audio data to prepare the speech audio data for use as an input into one or more models that detect one or more speech features, providing the preprocessed speech audio data to a stacked machine learning model, and analyzing the preprocessed speech audio data via the stacked ML model to detect the one or more speech features. The stacked ML model includes a feature aggregation model, a sequence to sequence model, and a decision-making model.
Type: Grant
Filed: December 11, 2019
Date of Patent: November 8, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Ji Li, Amit Srivastava
-
Patent number: 11494158
Abstract: Augmented reality visual display of microphone pick-up patterns are disclosed. An example method includes capturing, via a camera of a computing device, an image of a microphone, and displaying the image on a display of the computing device. The method also includes determining, by the computing device, a location and orientation of the microphone relative to the camera, determining one or more parameters of a pick-up pattern of the microphone, determining a visual representation of the pick-up pattern based on the one or more parameters, and displaying the visual representation of the pick-up pattern overlaid on the image of the microphone.
Type: Grant
Filed: May 9, 2019
Date of Patent: November 8, 2022
Assignee: Shure Acquisition Holdings, Inc.
Inventors: Christopher George Reiger, Mathew T. Abraham
-
Patent number: 11481187
Abstract: Systems and methods are provided herein for responding to a voice command at a volume level based on a volume level of the voice command. For example, a media guidance application may detect, through a first voice-operated user device of a plurality of voice-operated user devices, a voice command spoken by a user. The media guidance application may determine a first volume level of the voice command. Based on the volume level of the voice command, the media guidance application may determine that a second voice-operated user device of the plurality of voice-operated user devices is closer to the user than any of the other voice-operated user devices. The media guidance application may generate an audible response, through the second voice-operated user device, at a second volume level that is set based on the first volume level of the voice command.
Type: Grant
Filed: January 9, 2020
Date of Patent: October 25, 2022
Assignee: ROVI GUIDES, INC.
Inventors: Michael McCarty, Glen E. Roe
-
Patent number: 11465640
Abstract: Techniques are described for cognitive analysis for directed control transfer for autonomous vehicles. In-vehicle sensors are used to collect cognitive state data for an individual within a vehicle which has an autonomous mode of operation. The cognitive state data includes infrared, facial, audio, or biosensor data. One or more processors analyze the cognitive state data collected from the individual to produce cognitive state information. The cognitive state information includes a subset or summary of cognitive state data, or an analysis of the cognitive state data. The individual is scored based on the cognitive state information to produce a cognitive scoring metric. A state of operation is determined for the vehicle. A condition of the individual is evaluated based on the cognitive scoring metric. Control is transferred between the vehicle and the individual based on the state of operation of the vehicle and the condition of the individual.
Type: Grant
Filed: December 28, 2018
Date of Patent: October 11, 2022
Assignee: Affectiva, Inc.
Inventors: Rana el Kaliouby, Abdelrahman N. Mahmoud, Taniya Mishra, Andrew Todd Zeilman, Gabriele Zijderveld
-
Patent number: 11443220
Abstract: To simplify assisting a user in their day-to-day activities, a communication for performing an action may be sent to a user in the form of a query, where the query includes the most likely set of choices for the action arranged in a group of dichotomous (e.g., yes/no) or multiple choice answers. In this manner, a user may respond to the query by simply selecting one of the dichotomous or multiple choice answers. Historical logs of past actions, responses, queries, and so forth, may be used to predict future user actions or needs, and to formulate future queries for sending to the user. These techniques may be implemented, for example, through a remote coordination server or directly through a user's personal electronics device.
Type: Grant
Filed: January 26, 2018
Date of Patent: September 13, 2022
Assignee: TELEPATHY LABS, INC.
Inventors: Damien Phelan Stolarz, David Joseph Diaz, James Rossfeld, Scott Raven, Christopher O'Malley, Christopher Kurpinski
-
Patent number: 11436511
Abstract: To simplify assisting a user in their day-to-day activities, a communication for performing an action may be sent to a user in the form of a query, where the query includes the most likely set of choices for the action arranged in a group of dichotomous (e.g., yes/no) or multiple choice answers. In this manner, a user may respond to the query by simply selecting one of the dichotomous or multiple choice answers. Historical logs of past actions, responses, queries, and so forth, may be used to predict future user actions or needs, and to formulate future queries for sending to the user. These techniques may be implemented, for example, through a remote coordination server or directly through a user's personal electronics device.
Type: Grant
Filed: January 26, 2018
Date of Patent: September 6, 2022
Assignee: TELEPATHY LABS, INC.
Inventors: Damien Phelan Stolarz, David Joseph Diaz, James Rossfeld, Scott Raven, Christopher O'Malley, Christopher Kurpinski
-
Patent number: 11423885
Abstract: Techniques are described herein for selectively processing a user's utterances captured prior to and after an event that invokes an automated assistant to determine the user's intent and/or any parameters required for resolving the user's intent. In various implementations, respective measures of fitness for triggering responsive action by the automated assistant may be determined for pre-event and a post-event input streams. Based on the respective measures of fitness, one or both of the pre-event input stream or post-event input stream may be selected and used to cause the automated assistant to perform one or more responsive actions.
Type: Grant
Filed: February 20, 2019
Date of Patent: August 23, 2022
Assignee: GOOGLE LLC
Inventors: Matthew Sharifi, Tom Hume, Mohamad Hassan Mohamad Rom, Jan Althaus, Diego Melendo Casado
-
Patent number: 11410683
Abstract: An electronic device includes a controller. The controller performs a voice recognition operation on a voice uttered by a person being monitored. The controller generates emotion information for the person being monitored, based on the voice recognition operation.
Type: Grant
Filed: September 3, 2018
Date of Patent: August 9, 2022
Assignee: KYOCERA Corporation
Inventors: Joji Yoshikawa, Yuki Yamada, Hiroshi Okamoto
-
Patent number: 11393492
Abstract: A method for establishing a voice activity detection model includes obtaining a training audio file and a target result of the training audio file, framing the training audio file to obtain an audio frame, extracting an audio feature of the audio frame, the audio feature comprising at least two types of features, inputting the extracted audio feature as an input to a deep neural network model, performing information processing on the audio feature through a hidden layer of the deep neural network model, and outputting the processed audio feature through an output layer of the deep neural network model, to obtain a training result; determining a bias between the training result and the target result, and inputting the bias as an input to an error back propagation mechanism, and updating weights of the hidden layer until the deep neural network model reaches a preset condition.
Type: Grant
Filed: November 8, 2019
Date of Patent: July 19, 2022
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LTD
Inventor: Haibo Liu
-
Patent number: 11386896
Abstract: Systems and methods are disclosed. A digitized human vocal expression of a user and digital images are received over a network from a remote device. The digitized human vocal expression is processed to determine characteristics of the human vocal expression, including: pitch, volume, rapidity, a magnitude spectrum identify, and/or pauses in speech. Digital images are received and processed to detect characteristics of the user face, including detecting if one or more of the following is present: a sagging lip, a crooked smile, uneven eyebrows, and/or facial droop. Based at least in part on the human vocal expression characteristics and face characteristics, a determination is made as to what action is to be taken. A cepstrum pitch may be determined using an inverse Fourier transform of a logarithm of a spectrum of a human vocal expression signal. The volume may be determined using peak heights in a power spectrum of the human vocal expression.
Type: Grant
Filed: January 29, 2020
Date of Patent: July 12, 2022
Assignee: The Notebook, LLC
Inventor: Karen Elaine Khaleghi
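The cepstrum pitch computation mentioned in the abstract above (inverse Fourier transform of the logarithm of a spectrum) can be sketched in plain Python. This is a textbook-style illustration, not the patent's implementation: the naive O(n²) DFT, the search range, the small log floor, and the names `dft` and `cepstrum_pitch` are assumptions chosen to keep the example dependency-free.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform; adequate for short frames."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def cepstrum_pitch(frame, sample_rate, fmin=50.0, fmax=400.0):
    """Estimate pitch in Hz from the strongest cepstral peak in [fmin, fmax]."""
    spectrum = dft(frame)
    # Log magnitude spectrum; the small floor avoids log(0).
    log_mag = [math.log(abs(v) + 1e-12) for v in spectrum]
    # Real cepstrum: inverse transform of the log magnitude spectrum.
    cepstrum = [v.real for v in dft(log_mag, inverse=True)]
    # Quefrency (lag in samples) range corresponding to the pitch range.
    lo = int(sample_rate / fmax)
    hi = min(int(sample_rate / fmin), len(frame) // 2)
    peak = max(range(lo, hi), key=lambda q: cepstrum[q])
    return sample_rate / peak
```

A production version would normally window the frame and use an FFT; the structure of the calculation stays the same.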
-
Patent number: 11386904
Abstract: Deterioration of voice extraction performance when positions of a plurality of microphones are changed is prevented. A signal processing device according to an embodiment of the present technology includes a voice extraction unit that performs voice extraction from signals of a plurality of microphones, in which the voice extraction unit uses, when respective positions of the plurality of microphones are changed to positions where other microphones have been present, respective signals of the plurality of microphones as signals of the other microphones. Thus, it is possible to cancel the effect of changing the positions of respective microphones on the voice extraction.
Type: Grant
Filed: March 19, 2019
Date of Patent: July 12, 2022
Assignee: Sony Corporation
Inventor: Kazuya Tateishi
-
Patent number: 11373670
Abstract: Example embodiments disclosed herein relate to filter coefficient updating in time domain filtering. A method of processing an audio signal is disclosed. The method includes obtaining a predetermined number of target gains for a first portion of the audio signal by analyzing the first portion of the audio signal. Each of the target gains is corresponding to a subband of the audio signal. The method also includes determining filter coefficients for time domain filtering the first portion of the audio signal so as to approximate a frequency response given by the target gains. The filter coefficients are determined by iteratively selecting at least one target gain from the target gains and updating the filter coefficient based on the selected at least one target gain. Corresponding system and computer program product for processing an audio signal are also disclosed.
Type: Grant
Filed: May 6, 2019
Date of Patent: June 28, 2022
Assignee: Dolby Laboratories Licensing Corporation
Inventors: Dong Shi, Xuejing Sun
-
Patent number: 11375322
Abstract: The present application relates to a hearing aid adapted to be worn in or at an ear of a hearing aid user and/or to be fully or partially implanted in the head of the hearing aid user.
Type: Grant
Filed: February 28, 2020
Date of Patent: June 28, 2022
Assignee: Oticon A/S
Inventors: Thomas Lunner, Lars Bramsløw
-
Patent number: 11350885
Abstract: A method includes identifying, by an electronic device, one or more segments within a first audio recording that includes one or more non-speech segments and one or more speech segments. The method also includes generating, by the electronic device, one or more synthetic speech segments that include natural speech audio characteristics and that preserve one or more non-private features of the one or more speech segments. The method also includes generating, by the electronic device, an obfuscated audio recording by replacing the one or more speech segments with the one or more synthetic speech segments while maintaining the one or more non-speech segments, wherein the one or more synthetic speech segments prevent recognition of some content of the obfuscated audio recording.
Type: Grant
Filed: February 6, 2020
Date of Patent: June 7, 2022
Assignee: Samsung Electronics Co., Ltd.
Inventors: Korosh Vatanparvar, Viswam Nathan, Ebrahim Nematihosseinabadi, Md Mahbubur Rahman, Jilong Kuang
-
Patent number: 11348575
Abstract: A speaker recognition method and apparatus receives a first voice signal of a speaker, generates a second voice signal by enhancing the first voice signal through speech enhancement, generates a multi-channel voice signal by associating the first voice signal with the second voice signal, and recognizes the speaker based on the multi-channel voice signal.
Type: Grant
Filed: June 11, 2020
Date of Patent: May 31, 2022
Assignee: Samsung Electronics Co., Ltd.
Inventors: Sung-Jae Cho, Kyuhong Kim, Jaejoon Han
-
Patent number: 11335350
Abstract: An apparatus includes processor(s) to: perform pre-processing operations including derive an audio noise level of speech audio of a speech data set, derive a first relative weighting for first and second segmentation techniques for identifying likely sentence pauses in the speech audio based on the audio noise level, and select likely sentence pauses for a converged set of likely sentence pauses from likely sentence pauses identified by the first and/or second segmentation techniques based on the first relative weighting; and perform speech-to-text processing operations including divide the speech data set into data segments representing speech segments of the speech audio based on the converged set of likely sentence pauses, and derive a second relative weighting based on the audio noise level for selecting words indicated by an acoustic model or by a language model as being most likely spoken in the speech audio for inclusion in a transcript.
Type: Grant
Filed: October 12, 2021
Date of Patent: May 17, 2022
Assignee: SAS INSTITUTE INC.
Inventors: Xiaolong Li, Xiaozhuo Cheng, Xu Yang
-
Patent number: 11328736
Abstract: Disclosed are a method and an apparatus of denoising, and the method includes: receiving a first voice signal picked up by a microphone; if it is detected, with the first voice signal, that a sensor is in an operation state, subtracting an interference noise signal from the first voice signal to obtain a first voice signal with the interference removed therefrom, where the interference noise signal is an interference noise signal generated with regard to the microphone during an operation of the sensor, and the sensor and the microphone are packaged in one module; and outputting the first voice signal with the interference removed therefrom. By implementing the solution provided in the present disclosure, interference in a signal collected by a microphone when the microphone and a sensor in a module operate together is reduced and the small size of the module, packaged with the microphone and sensor, is guaranteed.
Type: Grant
Filed: August 28, 2017
Date of Patent: May 10, 2022
Assignee: WEIFANG GOERTEK MICROELECTRONICS CO., LTD.
Inventors: Dexin Wang, Xiangju Xu, Luyu Duanmu
-
Patent number: 11327050
Abstract: Disclosed herein are systems and methods for mechanical failure monitoring, detection, and classification in electronic assemblies. In some embodiments, a mechanical monitoring apparatus may include: a fixture to receive an electronic assembly; an acoustic sensor; and a computing device communicatively coupled to the acoustic sensor, wherein the acoustic sensor is to detect an acoustic emission waveform generated by a mechanical failure of the electronic assembly during testing.
Type: Grant
Filed: February 20, 2018
Date of Patent: May 10, 2022
Assignee: Intel Corporation
Inventors: Kyle Yazzie, Rajesh Kumar Neerukatti, Naga Sivakumar Yagnamurthy, David C. McCoy, Pramod Malatkar, Frank P. Prieto
-
Patent number: 11322168
Abstract: A dual microphone signal processing arrangement for reducing reverberation is described. Time domain microphone signals are developed from a pair of sensing microphones. These are converted to the time-frequency domain to produce complex value spectra signals. A binary gain function applies frequency-specific energy ratios between the spectra signals to produce transformed spectra signals. A sigmoid gain function based on an inter-microphone coherence value between the transformed spectra signals is applied to the transformed spectra signals to produce coherence adapted spectra signals. And an inverse time-frequency transformation is applied to the coherence adjusted spectra signals to produce time-domain reverberation-compensated microphone signals with reduced reverberation components.
Type: Grant
Filed: August 9, 2019
Date of Patent: May 3, 2022
Assignee: MED-EL Elektromedizinische Geraete GmbH
Inventors: Kostas Kokkinakis, Joshua Stohl
-
Patent number: 11322138
Abstract: A voice awakening method and device are provided. According to an embodiment, the method includes: receiving voice information of a user; obtaining an awakening confidence level corresponding to the voice information based on the voice information; determining, on the basis of the awakening confidence level, whether the voice information is suspected wake-up voice information; and performing, in response to determining the voice information being the suspected wake-up voice information, a secondary determination on the voice information to obtain a secondary determination result, and determining whether to perform a wake-up operation on the basis of the secondary determination result. The embodiment implements a secondary verification on the voice information, thereby reducing the probability that the smart device is mistakenly awakened.
Type: Grant
Filed: February 6, 2019
Date of Patent: May 3, 2022
Assignees: Baidu Online Network Technology (Beijing) Co., Ltd., Shanghai Xiaodu Technology Co., Ltd.
Inventors: Jun Li, Rui Yang, Lifeng Zhao, Xiaojian Chen, Yushu Cao
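The two-stage decision in the abstract above might look like the following sketch. The abstract only says that suspected wake-up audio receives a secondary determination; treating confidence as three bands, the threshold values, and the names `should_wake` and `secondary_check` are assumptions made for illustration.

```python
def should_wake(confidence, secondary_check, low=0.5, high=0.9):
    """Decide whether to wake the device.

    confidence: awakening confidence level from a first-pass detector.
    secondary_check: zero-argument callable returning True if a second,
        stricter verifier accepts the audio (hypothetical interface).
    low, high: assumed thresholds bounding the "suspected" band.
    """
    if confidence >= high:
        return True   # assumed: confident enough to wake immediately
    if confidence < low:
        return False  # clearly not a wake word
    # Suspected wake-up audio: defer to the secondary determination.
    return secondary_check()
```

Running the secondary verifier only on the ambiguous band keeps the extra cost low while still filtering out borderline false triggers.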
-
Patent number: 11312164Abstract: A method is provided for extending the frequency band of an audio signal during a decoding or improvement process. The method includes obtaining the decoded signal in a first frequency band, referred to as a low band. Tonal components and a surround signal are extracted from the low-band signal, and the tonal components and the surround signal are combined by adaptive mixing using energy-level control factors to obtain an audio signal, referred to as a combined signal. The low-band decoded signal before the extraction step or the combined signal after the combination step are extended over at least one second frequency band which is higher than the first frequency band. Also provided are a frequency-band extension device which implements the described method and a decoder including a device of this type.Type: GrantFiled: July 13, 2020Date of Patent: April 26, 2022Assignee: Koninklijke Philips N.V.Inventors: Magdalena Kaniewska, Stephane Ragot
-
Patent number: 11308978Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for distributed automatic speech recognition. An example apparatus includes a detector to process an input audio signal and identify a portion of the input audio signal including a sound to be evaluated, the sound to be evaluated organized into a plurality of audio features representing the sound. The example apparatus includes a quantizer to process the audio features using a quantization process to reduce the audio features to generate a reduced set of audio features for transmission. The example apparatus includes a transmitter to transmit the reduced set of audio features over a low-energy communication channel for processing.Type: GrantFiled: August 5, 2019Date of Patent: April 19, 2022Assignee: INTEL CORPORATIONInventors: Binuraj K. Ravindran, Francis M. Tharappel, Prabhakar R. Datta, Tobias Bocklet, Maciej Muchlinski, Tomasz Dorau, Josef G. Bauer, Saurin Shah, Georg Stemmer
-
Patent number: 11308946Abstract: Methods and an apparatus for performing feature extraction on speech in a microphone signal with embedded noise processing to reduce the amount of processing are provided. In embodiments, feature extraction and the noise estimate use an output of the same Fourier Transform, such that the noise filtering of the speech is embedded with the feature extraction of the speech.Type: GrantFiled: July 25, 2019Date of Patent: April 19, 2022Assignee: Cerence Operating CompanyInventors: Jianzhong Teng, Xiao-Lin Ren, Xingui Zeng, Yi Gao
-
Patent number: 11302298Abstract: A signal processing method for an earphone includes: a motion state of a wearer of the earphone is detected by using an acceleration sensor arranged inside the earphone; a first microphone and a second microphone both arranged outside the earphone detect wind noise conditions corresponding to different frequency bands; and according to the motion state of the wearer of the earphone and the wind noise conditions corresponding to different frequency bands, operating modes of a feedforward filter and a feedback filter inside the earphone are adjusted, herein the feedforward filter and the feedback filter are configured for active noise cancellation of the earphone.Type: GrantFiled: February 19, 2021Date of Patent: April 12, 2022Assignee: Beijing Xiaoniao Tingting Technology Co., LTD.Inventors: Song Liu, Na Li, Bo Li
-
Patent number: 11295137Abstract: A system for exploiting visual information for enhancing audio signals via source separation and beamforming is disclosed. The system may obtain visual content associated with an environment of a user, and may extract, from the visual content, metadata associated with the environment. The system may determine a location of the user based on the extracted metadata. Additionally, the system may load, based on the location, an audio profile corresponding to the location of the user. The system may also load a user profile of the user that includes audio data associated with the user. Furthermore, the system may cancel, based on the audio profile and user profile, noise from the environment of the user. Moreover, the system may adjust, based on the audio profile and user profile, an audio signal generated by the user so as to enhance the audio signal during a communications session of the user.Type: GrantFiled: November 2, 2020Date of Patent: April 5, 2022Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.Inventors: Dimitrios Dimitriadis, Donald J. Bowen, Lusheng Ji, Horst J. Schroeter
-
Patent number: 11289109Abstract: Embodiments of the disclosure provide systems and methods for audio signal processing. An exemplary system may include a communication interface configured to receive a first audio signal acquired from an audio source through a first channel, and a second audio signal acquired from the same audio source through a second channel. The system may also include at least one processor coupled to the communication interface. The at least one processor may be configured to determine channel features based on the first audio signal and the second audio signal individually and determine a cross-channel feature based on the first audio signal and the second audio signal collectively. The at least one processor may further be configured to concatenate the channel features and the cross-channel feature and estimate spectral-spatial masks for the first channel and the second channel using the concatenated channel features and the cross-channel feature.Type: GrantFiled: April 24, 2020Date of Patent: March 29, 2022Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.Inventors: Chengyun Deng, Hui Song, Yi Zhang, Yongtao Sha
-
Patent number: 11282528Abstract: One embodiment provides a method, including: receiving, at an information handling device, user input comprising a potential wake word; determining, using a processor, whether the potential wake word is associated with a stored wake word; and responsive to determining that the potential wake word is associated with the stored wake word, activating, based on the potential wake word, a digital assistant associated with the information handling device. Other aspects are described and claimed.Type: GrantFiled: August 14, 2017Date of Patent: March 22, 2022Assignee: Lenovo (Singapore) Pte. Ltd.Inventors: Ryan Charles Knudson, Russell Speight VanBlon, Roderick Echols, Jonathan Gaither Knox
-
Patent number: 11270696Abstract: An audio device with at least one microphone adapted to receive sound from a sound field and create an output, and a processing system that is responsive to the output of the microphone. The processing system is configured to use a signal processing algorithm to detect a wakeup word, and modify the signal processing algorithm that is used to detect the wakeup word if the sound field changes.Type: GrantFiled: July 1, 2019Date of Patent: March 8, 2022Assignee: Bose CorporationInventors: Ricardo Carreras, Alaganandan Ganeshkumar
-
Patent number: 11257485Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for neural network adaptive beamforming for multichannel speech recognition are disclosed. In one aspect, a method includes the actions of receiving a first channel of audio data corresponding to an utterance and a second channel of audio data corresponding to the utterance. The actions further include generating a first set of filter parameters for a first filter based on the first channel of audio data and the second channel of audio data and a second set of filter parameters for a second filter based on the first channel of audio data and the second channel of audio data. The actions further include generating a single combined channel of audio data. The actions further include inputting the audio data to a neural network. The actions further include providing a transcription for the utterance.Type: GrantFiled: December 10, 2019Date of Patent: February 22, 2022Assignee: Google LLCInventors: Bo Li, Ron J. Weiss, Michiel A. U. Bacchiani, Tara N. Sainath, Kevin William Wilson
-
Patent number: 11257497Abstract: The present disclosure provides a voice wake-up processing method, an apparatus and a storage medium. After acquiring voice wake-up signals collected by audio input devices in at least two audio zones, an electronic device may correct, based on to-be-woken-up audio zones obtained from amplitudes of the voice wake-up signals collected by the audio input devices in the at least two audio zones, a to-be-woken-up audio zone identified using a voice engine. This avoids waking up every audio zone whose audio input device collects voice wake-up signals produced by the same user, and thereby improves the accuracy of the voice wake-up result obtained by the electronic device. Therefore, the present disclosure can solve the technical problem that a vehicle-mounted terminal has low voice wake-up accuracy due to an insufficient degree of sound isolation between audio zones of the vehicle-mounted terminal.Type: GrantFiled: December 23, 2019Date of Patent: February 22, 2022Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.Inventors: Hanying Peng, Nengjun Ouyang
-
Patent number: 11257512Abstract: Systems and methods include a first voice activity detector operable to detect speech in a frame of a multichannel audio input signal and output a speech determination, a constrained minimum variance adaptive filter operable to receive the multichannel audio input signal and the speech determination and minimize a signal variance at the output of the filter, thereby producing an equalized target speech signal, a mask estimator operable to receive the equalized target speech signal and the speech determination and generate a spectral-temporal mask to discriminate a target speech from noise and interference speech, and a second voice activity detector operable to detect voice in a frame of the speech discriminated signal. An audio input sensor array including a plurality of microphones, each microphone generating a channel of the multichannel audio input signal. A sub-band analysis module operable to decompose each of the channels into a plurality of frequency sub-bands.Type: GrantFiled: January 6, 2020Date of Patent: February 22, 2022Assignee: SYNAPTICS INCORPORATEDInventors: Francesco Nesta, Alireza Masnadi-Shirazi
-
Patent number: 11257487Abstract: Techniques are described herein for enabling the use of “dynamic” or “context-specific” hot words to invoke an automated assistant. In various implementations, an automated assistant may be executed in a default listening state at least in part on a user's computing device(s). While in the default listening state, audio data captured by microphone(s) may be monitored for default hot words. Detection of the default hot word(s) transitions the automated assistant into a speech recognition state. Sensor signal(s) generated by hardware sensor(s) integral with the computing device(s) may be detected and analyzed to determine an attribute of the user. Based on the analysis, the automated assistant may transition into an enhanced listening state in which the audio data may be monitored for enhanced hot word(s). Detection of enhanced hot word(s) triggers the automated assistant to perform a responsive action without requiring detection of default hot word(s).Type: GrantFiled: August 21, 2018Date of Patent: February 22, 2022Assignee: GOOGLE LLCInventor: Diego Melendo Casado
-
Patent number: 11250877Abstract: A method for generating a health indicator for at least one person of a group of people, the method comprising: receiving, at a processor, captured sound, where the captured sound is sound captured from the group of people; comparing the captured sound to a plurality of sound models to detect at least one non-speech sound event in the captured sound, each of the plurality of sound models associated with a respective health-related sound type; determining metadata associated with the at least one non-speech sound event; assigning the at least one non-speech sound event and the metadata to at least one person of the group of people; and outputting a message identifying the at least one non-speech event and the metadata to a health indicator generator module to generate a health indicator for the at least one person to whom the at least one non-speech sound event is assigned.Type: GrantFiled: July 25, 2019Date of Patent: February 15, 2022Assignee: AUDIO ANALYTIC LTDInventors: Christopher Mitchell, Joe Patrick Lynas, Sacha Krstulovic, Amoldas Jasonas, Julian Harris