Detect Speech In Noise Patents (Class 704/233)
-
Patent number: 11289109
Abstract: Embodiments of the disclosure provide systems and methods for audio signal processing. An exemplary system may include a communication interface configured to receive a first audio signal acquired from an audio source through a first channel, and a second audio signal acquired from the same audio source through a second channel. The system may also include at least one processor coupled to the communication interface. The at least one processor may be configured to determine channel features based on the first audio signal and the second audio signal individually and determine a cross-channel feature based on the first audio signal and the second audio signal collectively. The at least one processor may further be configured to concatenate the channel features and the cross-channel feature and estimate spectral-spatial masks for the first channel and the second channel using the concatenated channel features and the cross-channel feature.
Type: Grant
Filed: April 24, 2020
Date of Patent: March 29, 2022
Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
Inventors: Chengyun Deng, Hui Song, Yi Zhang, Yongtao Sha
-
Patent number: 11282528
Abstract: One embodiment provides a method, including: receiving, at an information handling device, user input comprising a potential wake word; determining, using a processor, whether the potential wake word is associated with a stored wake word; and responsive to determining that the potential wake word is associated with the stored wake word, activating, based on the potential wake word, a digital assistant associated with the information handling device. Other aspects are described and claimed.
Type: Grant
Filed: August 14, 2017
Date of Patent: March 22, 2022
Assignee: Lenovo (Singapore) Pte. Ltd.
Inventors: Ryan Charles Knudson, Russell Speight VanBlon, Roderick Echols, Jonathan Gaither Knox
-
Patent number: 11270696
Abstract: An audio device with at least one microphone adapted to receive sound from a sound field and create an output, and a processing system that is responsive to the output of the microphone. The processing system is configured to use a signal processing algorithm to detect a wakeup word, and modify the signal processing algorithm that is used to detect the wakeup word if the sound field changes.
Type: Grant
Filed: July 1, 2019
Date of Patent: March 8, 2022
Assignee: Bose Corporation
Inventors: Ricardo Carreras, Alaganandan Ganeshkumar
-
Patent number: 11257497
Abstract: The present disclosure provides a voice wake-up processing method, an apparatus and a storage medium. After acquiring voice wake-up signals collected by audio input devices in at least two audio zones, an electronic device may correct a to-be-woken-up audio zone identified using a voice engine, based on to-be-woken-up audio zones obtained from the amplitudes of those voice wake-up signals. This avoids waking up every audio zone in which an audio input device collects a voice wake-up signal produced by the same user, and thereby improves the accuracy of the voice wake-up result obtained by the electronic device. The present disclosure can therefore solve the technical problem that a vehicle-mounted terminal has low voice wake-up accuracy due to an insufficient degree of sound isolation between audio zones of the vehicle-mounted terminal.
Type: Grant
Filed: December 23, 2019
Date of Patent: February 22, 2022
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Hanying Peng, Nengjun Ouyang
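The amplitude-based correction described in this abstract can be illustrated with a minimal sketch. The function name `correct_wake_zone`, the zone labels, and the `margin` parameter are hypothetical and not taken from the patent; the rule shown (override the voice engine only when another zone is clearly louder) is one plausible reading, not the claimed method.

```python
def correct_wake_zone(engine_zone, amplitudes, margin=1.5):
    """Return the audio zone to wake up.

    engine_zone: zone picked by the voice engine (hypothetical label).
    amplitudes:  mapping of zone label -> amplitude of the wake-up signal
                 captured by that zone's audio input device.
    margin:      assumed ratio by which another zone must be louder
                 before the engine's choice is overridden.
    """
    # Zone whose microphone captured the strongest wake-up signal.
    loudest_zone = max(amplitudes, key=amplitudes.get)
    engine_amplitude = amplitudes.get(engine_zone, 0.0)
    # Override the engine only when the loudest zone is clearly louder.
    if loudest_zone != engine_zone and amplitudes[loudest_zone] >= margin * engine_amplitude:
        return loudest_zone
    return engine_zone


# The engine picked the passenger zone, but the driver zone is far louder.
chosen = correct_wake_zone("passenger", {"driver": 0.9, "passenger": 0.3})
```

Only one zone is ever returned, which mirrors the stated goal of not waking several zones for a single utterance.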
-
Patent number: 11257512
Abstract: Systems and methods include a first voice activity detector operable to detect speech in a frame of a multichannel audio input signal and output a speech determination, a constrained minimum variance adaptive filter operable to receive the multichannel audio input signal and the speech determination and minimize a signal variance at the output of the filter, thereby producing an equalized target speech signal, a mask estimator operable to receive the equalized target speech signal and the speech determination and generate a spectral-temporal mask to discriminate a target speech from noise and interference speech, and a second voice activity detector operable to detect voice in a frame of the speech discriminated signal. An audio input sensor array including a plurality of microphones, each microphone generating a channel of the multichannel audio input signal. A sub-band analysis module operable to decompose each of the channels into a plurality of frequency sub-bands.
Type: Grant
Filed: January 6, 2020
Date of Patent: February 22, 2022
Assignee: SYNAPTICS INCORPORATED
Inventors: Francesco Nesta, Alireza Masnadi-Shirazi
-
Patent number: 11257485
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for neural network adaptive beamforming for multichannel speech recognition are disclosed. In one aspect, a method includes the actions of receiving a first channel of audio data corresponding to an utterance and a second channel of audio data corresponding to the utterance. The actions further include generating a first set of filter parameters for a first filter based on the first channel of audio data and the second channel of audio data and a second set of filter parameters for a second filter based on the first channel of audio data and the second channel of audio data. The actions further include generating a single combined channel of audio data. The actions further include inputting the audio data to a neural network. The actions further include providing a transcription for the utterance.
Type: Grant
Filed: December 10, 2019
Date of Patent: February 22, 2022
Assignee: Google LLC
Inventors: Bo Li, Ron J. Weiss, Michiel A. U. Bacchiani, Tara N. Sainath, Kevin William Wilson
-
Patent number: 11257487
Abstract: Techniques are described herein for enabling the use of “dynamic” or “context-specific” hot words to invoke an automated assistant. In various implementations, an automated assistant may be executed in a default listening state at least in part on a user's computing device(s). While in the default listening state, audio data captured by microphone(s) may be monitored for default hot words. Detection of the default hot word(s) transitions the automated assistant into a speech recognition state. Sensor signal(s) generated by hardware sensor(s) integral with the computing device(s) may be detected and analyzed to determine an attribute of the user. Based on the analysis, the automated assistant may transition into an enhanced listening state in which the audio data may be monitored for enhanced hot word(s). Detection of enhanced hot word(s) triggers the automated assistant to perform a responsive action without requiring detection of default hot word(s).
Type: Grant
Filed: August 21, 2018
Date of Patent: February 22, 2022
Assignee: GOOGLE LLC
Inventor: Diego Melendo Casado
-
Patent number: 11250877
Abstract: A method for generating a health indicator for at least one person of a group of people, the method comprising: receiving, at a processor, captured sound, where the captured sound is sound captured from the group of people; comparing the captured sound to a plurality of sound models to detect at least one non-speech sound event in the captured sound, each of the plurality of sound models associated with a respective health-related sound type; determining metadata associated with the at least one non-speech sound event; assigning the at least one non-speech sound event and the metadata to at least one person of the group of people; and outputting a message identifying the at least one non-speech event and the metadata to a health indicator generator module to generate a health indicator for the at least one person to whom the at least one non-speech sound event is assigned.
Type: Grant
Filed: July 25, 2019
Date of Patent: February 15, 2022
Assignee: AUDIO ANALYTIC LTD
Inventors: Christopher Mitchell, Joe Patrick Lynas, Sacha Krstulovic, Amoldas Jasonas, Julian Harris
-
Patent number: 11250038
Abstract: An interactive question and answer (Q&A) service provides pairs of questions and corresponding answers related to the content of a web page. The service includes pre-configured Q&A pairs derived from a deep learning framework that includes a series of neural networks trained through joint and transfer learning to generate questions for a given text passage. In addition, pre-configured Q&A pairs are generated from historical web access patterns and sources related to the content of the web page.
Type: Grant
Filed: August 13, 2018
Date of Patent: February 15, 2022
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.
Inventors: Payal Bajaj, Gearard Boland, Anshul Gupta, Matthew Glenn Jin, Eduardo Enrique Noriega De Armas, Jason Shaver, Neelakantan Sundaresan, Roshanak Zilouchian Moghaddam
-
Patent number: 11240609
Abstract: An audio device that includes a music classifier that determines when music is present in an audio signal is disclosed. The audio device is configured to receive audio, process the received audio, and to output the processed audio to a user. The processing may be adjusted based on the output of the music classifier. The music classifier utilizes a plurality of decision making units, each operating on the received audio independently. The decision making units are simplified to reduce the processing, and therefore the power, necessary for operation. Accordingly each decision making unit may be insufficient to determine music alone but in combination may accurately detect music while consuming power at a rate that is suitable for a mobile device, such as a hearing aid.
Type: Grant
Filed: June 3, 2019
Date of Patent: February 1, 2022
Assignee: SEMICONDUCTOR COMPONENTS INDUSTRIES, LLC
Inventors: Pejman Dehghani, Robert L. Brennan
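The idea of combining several cheap, individually unreliable decision units can be sketched as a simple vote. Everything named here (the three units, the feature keys `beat_strength`, `tonality`, `modulation_depth`, the thresholds, and the `min_votes` rule) is an illustrative assumption; the abstract does not say which units or which combination rule the patent uses.

```python
# Each hypothetical unit looks at one cheap feature and returns True/False.
def beat_unit(features):
    return features["beat_strength"] > 0.6


def tonality_unit(features):
    return features["tonality"] > 0.5


def modulation_unit(features):
    # Assumed: speech shows deeper amplitude modulation than most music.
    return features["modulation_depth"] < 0.4


DECISION_UNITS = (beat_unit, tonality_unit, modulation_unit)


def music_present(features, min_votes=2):
    """Combine the independent unit decisions by majority vote."""
    votes = sum(1 for unit in DECISION_UNITS if unit(features))
    return votes >= min_votes


# Two of the three units fire, so the combined decision is "music".
frame_features = {"beat_strength": 0.8, "tonality": 0.7, "modulation_depth": 0.9}
decision = music_present(frame_features)
```

A vote is only one possible combiner; a weighted sum or a small lookup table would fit the same description.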
-
Patent number: 11238883
Abstract: A method and a system for dialogue enhancement of an audio signal, comprising receiving (step S1) the audio signal and a text content associated with dialogue occurring in the audio signal, generating (step S2) parameterized synthesized speech from the text content, and applying (step S3) dialogue enhancement to the audio signal based on the parameterized synthesized speech. With the invention text captions, subtitles, or other forms of text content included in an audio stream, can be used to significantly improve dialogue enhancement on the playback side.
Type: Grant
Filed: May 23, 2019
Date of Patent: February 1, 2022
Assignee: Dolby Laboratories Licensing Corporation
Inventors: Timothy Alan Port, Winston Chi Wai Ng, Mark William Gerrard
-
Patent number: 11222625
Abstract: Systems and methods for training a control panel to recognize user defined and preprogrammed sound patterns are provided. Such systems and methods can include the control panel operating in a learning mode, receiving initial ambient audio from a region, and saving the initial ambient audio as an audio pattern in a memory device of the control panel. Such systems and methods can also include the control panel operating in an active mode, receiving subsequent ambient audio from the region, using an audio classification model to make an initial determination as to whether the subsequent ambient audio matches or is otherwise consistent with the audio pattern, determining whether the initial determination is correct, and when the control panel determines that the initial determination is incorrect, modifying or updating the audio classification model for improving the accuracy in detecting future consistency with the audio pattern.
Type: Grant
Filed: April 15, 2019
Date of Patent: January 11, 2022
Assignee: Ademco Inc.
Inventors: Pradyumna Sampath, Ramprasad Yelchuru, Purnaprajna R. Mangsuli
-
Patent number: 11222624
Abstract: A server may provide a voice recognition service. The server may include a memory configured for storing a plurality of voice recognition models, a communication device configured for communicating with a plurality of voice recognition devices, and an artificial intelligence device configured for providing a voice recognition service to the plurality of voice recognition devices, acquiring use-related information regarding a first voice recognition device (from among the plurality of voice recognition devices), and changing a voice recognition model corresponding to the first voice recognition device from a first voice recognition model to a second voice recognition model based on the use-related information.
Type: Grant
Filed: August 20, 2019
Date of Patent: January 11, 2022
Assignee: LG ELECTRONICS INC.
Inventors: Jaehong Kim, Hangil Jeong
-
Patent number: 11222654
Abstract: A method for voice detection, the method may include (a) generating an in-ear signal that represents a signal sensed by an in-ear microphone and fed to a feedback active noise cancellation (ANC) circuit; (b) generating at least one additional signal, based on at least one out of a playback signal and a pickup signal sensed by a voice pickup microphone; and (c) generating a voice indicator based on the in-ear signal and the at least one additional signal.
Type: Grant
Filed: January 13, 2020
Date of Patent: January 11, 2022
Assignee: DSP GROUP LTD.
Inventors: Assaf Ganor, Ori Elyada
-
Patent number: 11205411
Abstract: A method for processing an audio signal includes: audio signals emitted respectively from at least two sound sources are acquired through at least two microphones to obtain respective original noisy signals of the at least two microphones; sound source separation is performed on the respective original noisy signals of the at least two microphones to obtain respective time-frequency estimated signals of the at least two sound sources; a mask value of the time-frequency estimated signal of each sound source in the original noisy signal of each microphone is determined based on the respective time-frequency estimated signals; the respective time-frequency estimated signals of the at least two sound sources are updated based on the respective original noisy signals of the at least two microphones and the mask values; and the audio signals emitted respectively from the at least two sound sources are determined.
Type: Grant
Filed: May 29, 2020
Date of Patent: December 21, 2021
Assignee: Beijing Xiaomi Intelligent Technology Co., Ltd.
Inventor: Haining Hou
-
Patent number: 11204736
Abstract: The systems and methods described relate to the concept that smart devices can be used to 1) sense various types of phenomena like sound, blue light exposure, RF and microwave radiation, and 2) in real-time analyze, report and/or control outputs (e.g., displays or speakers). The systems are configurable and use standard computing devices, such as wearable electronics, tablet computers, and mobile phones to measure various frequency bands across multiple points, allowing a single user to visualize and/or adjust environmental conditions.
Type: Grant
Filed: October 17, 2019
Date of Patent: December 21, 2021
Assignee: ZOPHONOS INC.
Inventor: Levaughn Denton
-
Patent number: 11189303
Abstract: A multi-microphone algorithm for detecting and differentiating interference sources from desired talker speech in advanced audio processing for smart home applications is described. The approach is based on characterizing a persistent interference source when sounds repeatedly occur from a fixed spatial location relative to the device, which is also fixed. Some examples of such interference sources include TV, music system, air-conditioner, washing machine, and dishwasher. Real human talkers, in contrast, are not expected to remain stationary and speak continuously from the same position for a long time. The persistency of an acoustic source is established based on identifying historically-recurring inter-microphone frequency-dependent phase profiles in multiple time periods of the audio data. The detection algorithm can be used with a beamforming processor to suppress the interference and for achieving voice quality and automatic speech recognition rate improvements in smart home applications.
Type: Grant
Filed: September 25, 2017
Date of Patent: November 30, 2021
Assignee: Cirrus Logic, Inc.
Inventors: Narayan Kovvali, Seth Suppappola
-
Patent number: 11188718
Abstract: A collective emotional engagement detection arrangement is provided for determining emotions of users in group conversations. A computer-implemented method includes determining a first conversation velocity of communications through conversation channels over a first time period for a group discussion between user computers; determining that a conversation velocity of the communications has increased to a second conversation velocity of communications which exceeds a predetermined threshold, and has remained above the predetermined threshold for at least a second time period; determining aggregated emotions of the users during the second time period; and providing an output to a moderator of the group discussion indicating that the second conversation velocity of the communications has exceeded the predetermined threshold for at least the second time period, and indicating the aggregated emotions of the users during the second time period.
Type: Grant
Filed: September 27, 2019
Date of Patent: November 30, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Ilse M. Breedvelt-Schouten, John A. Lyons, Jana H. Jenkins, Jeffrey A. Kusnitz
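The "velocity stays above a threshold for a period" condition in this abstract can be sketched as follows. Treating conversation velocity as messages per second, and checking it over fixed back-to-back windows, are assumptions made for illustration; the function names and parameters are hypothetical, and the emotion-aggregation step is not covered.

```python
def velocity(timestamps, start, end):
    """Messages per second with a timestamp in the half-open interval [start, end)."""
    count = sum(1 for t in timestamps if start <= t < end)
    return count / (end - start)


def sustained_spike(timestamps, now, window, threshold, hold):
    """True if velocity exceeded `threshold` in every `window`-second slice
    of the last `hold` seconds before `now` (all values in seconds)."""
    steps = int(hold // window)
    if steps < 1:
        return False
    return all(
        velocity(timestamps, now - (i + 1) * window, now - i * window) > threshold
        for i in range(steps)
    )


# Two early messages, then one message every 2 s from t=60 to t=178.
stamps = [0, 30] + [60 + 2 * k for k in range(60)]
alert = sustained_spike(stamps, now=180, window=30, threshold=0.2, hold=120)
```

In this data the last four 30-second windows each hold 15 messages (0.5 per second), so `alert` is true; extending `hold` to 180 seconds reaches the quiet start of the discussion and the check fails.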
-
Patent number: 11184244
Abstract: The current document is directed to methods and systems that employ network metrics collected by distributed-computer-system metrics-collection services to determine a service-call-based topology for distributed service-oriented applications. In a described implementation, network metrics are collected over a number of network-metric monitoring periods. Independent component analysis is used to extract, from the collected network metrics, signals corresponding to sequences of service calls initiated by calls to the application-programming interface of a distributed service-oriented application. The signals, in combination with call traces obtained from a distributed-services call-tracing utility or service, are then used to construct representations of distributed-service-oriented-application topologies. The distributed-service-oriented-application topologies provide a basis for any additional types of distributed-computer-system functionalities, utilities, and facilities.
Type: Grant
Filed: February 19, 2020
Date of Patent: November 23, 2021
Assignee: VMware, Inc.
Inventors: Susobhit Panigrahi, Reghuram Vasanthakumari, Arihant Jain
-
Patent number: 11164591
Abstract: A speech enhancement method includes determining a first spectral subtraction parameter based on a power spectrum of a speech signal containing noise and a power spectrum of a noise signal, determining a second spectral subtraction parameter based on the first spectral subtraction parameter and a reference power spectrum, and performing, based on the power spectrum of the noise signal and the second spectral subtraction parameter, spectral subtraction on the speech signal containing noise, where the reference power spectrum includes a predicted user speech power spectrum and/or predicted environmental noise power. Regularity of a power spectrum feature of a user speech of a terminal device and/or regularity of a power spectrum feature of noise in an environment in which a user is located are considered.
Type: Grant
Filed: January 18, 2018
Date of Patent: November 2, 2021
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Weixiang Hu, Lei Miao
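The two-stage parameter idea can be sketched on plain power-spectrum lists. The abstract does not give formulas, so every rule below is an assumption: the first parameter uses a common linear over-subtraction rule driven by the overall SNR, the second scales it per bin using a predicted user speech power spectrum, and the names `first_parameter`, `second_parameter`, `weight`, and `floor` are hypothetical.

```python
import math


def first_parameter(noisy_power, noise_power):
    """Over-subtraction factor from the overall SNR (assumed linear rule)."""
    snr_db = 10.0 * math.log10(sum(noisy_power) / sum(noise_power))
    snr_db = max(-5.0, min(20.0, snr_db))  # clamp to the rule's usual range
    return 4.0 - 0.15 * snr_db


def second_parameter(alpha, predicted_speech_power, noise_power, weight=0.5):
    """Per-bin factor: subtract less where speech is predicted to dominate."""
    factors = []
    for speech, noise in zip(predicted_speech_power, noise_power):
        total = speech + noise
        ratio = speech / total if total > 0.0 else 0.0
        factors.append(alpha * (1.0 - weight * ratio))
    return factors


def spectral_subtract(noisy_power, noise_power, factors, floor=0.01):
    """Subtract scaled noise power per bin, keeping a small spectral floor."""
    return [
        max(y - a * n, floor * y)
        for y, n, a in zip(noisy_power, noise_power, factors)
    ]


noisy = [10.0, 1.0, 8.0, 1.0]      # power spectrum of noisy speech
noise = [1.0, 1.0, 1.0, 1.0]       # estimated noise power spectrum
predicted = [9.0, 0.0, 7.0, 0.0]   # assumed predicted user speech power
alpha = first_parameter(noisy, noise)
cleaned = spectral_subtract(noisy, noise, second_parameter(alpha, predicted, noise))
```

Bins where speech is predicted keep most of their power, while noise-only bins fall to the floor.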
-
Patent number: 11150869
Abstract: Aspects of the present disclosure relate to voice command filtering. One or more directions of background noise for a location of a voice command device are determined. The one or more directions of background noise are stored as one or more blocked directions. A voice input is received at the location of the voice command device. A direction the voice input is being received from is determined and compared to the one or more blocked directions. The voice input is ignored in response to the direction of the voice input being received from corresponding to a direction of the one or more blocked directions, unless the received voice input is in a recognized voice.
Type: Grant
Filed: February 14, 2018
Date of Patent: October 19, 2021
Assignee: International Business Machines Corporation
Inventors: Eunjin Lee, Daniel Cunnington, John J. Wood, Giacomo G. Chiarella
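The blocked-direction check lends itself to a short sketch. Representing directions as angles in degrees and matching them with a fixed `tolerance` are assumptions for illustration; the function names are hypothetical, and how a voice is "recognized" is outside the scope of this example.

```python
def angular_distance(a, b):
    """Smallest absolute difference between two bearings, in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def should_ignore(direction, blocked_directions, recognized_voice, tolerance=15.0):
    """Ignore input from a blocked direction unless the voice is recognized."""
    if recognized_voice:
        return False
    return any(
        angular_distance(direction, blocked) <= tolerance
        for blocked in blocked_directions
    )


# Assumed stored background-noise directions, e.g. a TV and a radio.
blocked = [90.0, 350.0]
ignored_tv = should_ignore(95.0, blocked, recognized_voice=False)
owner_near_tv = should_ignore(95.0, blocked, recognized_voice=True)
```

The modulo in `angular_distance` handles wrap-around, so a bearing of 5 degrees still matches a blocked direction at 350 degrees.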
-
Patent number: 11146298
Abstract: A signal generator device includes a digital signal waveform generator to produce a digital signal waveform, a first frequency band signal path having a first frequency band filter to receive the digital signal waveform and to pass first frequency band components of the digital signal waveform, and a first digital-to-analog converter to receive the first frequency band components of the digital signal waveform and to produce a first frequency band analog signal, a second frequency band signal path having a second frequency band filter to receive the digital signal waveform and to pass second frequency band components of the digital signal waveform, a second digital-to-analog converter to receive the second frequency band components of the digital signal waveform and to produce a second frequency band analog signal, and a combining element to combine the first frequency band analog signal and the second frequency band analog signal to produce a wideband analog signal.
Type: Grant
Filed: September 30, 2019
Date of Patent: October 12, 2021
Assignee: Tektronix, Inc.
Inventor: Gregory A. Martin
-
Patent number: 11146907
Abstract: A system for identifying the contribution of a given sound source to a composite audio track, the system comprising an audio input unit operable to receive an input composite audio track comprising two or more sound sources, including the given sound source, an audio generation unit operable to generate, using a model of a sound source, an approximation of the contribution of the given sound source to the composite audio track, an audio comparison unit operable to compare the generated audio to at least a portion of the composite audio track to determine whether the generated audio provides an approximation of the composite audio track that meets a threshold degree of similarity, and an audio identification unit operable to identify, when the threshold is met, the generated audio as a suitable representation of the contribution of the sound source to the composite audio track.
Type: Grant
Filed: April 3, 2020
Date of Patent: October 12, 2021
Assignee: Sony Interactive Entertainment Inc.
Inventors: Fabio Cappello, Oliver Hume
-
Patent number: 11138992
Abstract: This application discloses a voice activity detection method. The method includes receiving speech data, the speech data including a multi-frame speech signal; determining energy and spectral entropy of a frame of speech signal; calculating a square root of the energy of the speech signal and/or calculating a square root of the spectral entropy of the frame of the speech signal; determining a spectral entropy-energy square root of the frame of the speech signal based on at least one of the square root of the energy and the square root of the spectral entropy; and determining that the frame of the speech signal is an unvoiced frame if the spectral entropy-energy square root of the speech signal is less than a first threshold, or that it is a voiced frame if the spectral entropy-energy square root of the speech signal is greater than or equal to the first threshold.
Type: Grant
Filed: October 28, 2019
Date of Patent: October 5, 2021
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventor: Jizhong Liu
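The per-frame decision described here can be sketched with the standard library only. The abstract does not define how the two square roots are combined, so this sketch assumes the "spectral entropy-energy square root" is their product; the threshold value, the naive DFT, and all function names are likewise illustrative assumptions.

```python
import cmath
import math


def power_spectrum(frame):
    """Power per bin (0..N/2) from a naive DFT; adequate for short frames."""
    n = len(frame)
    powers = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        powers.append(abs(coeff) ** 2)
    return powers


def spectral_entropy(frame):
    """Shannon entropy (nats) of the normalized power spectrum."""
    powers = power_spectrum(frame)
    total = sum(powers)
    if total == 0.0:
        return 0.0  # silent frame: define entropy as zero
    probs = [p / total for p in powers]
    return max(0.0, -sum(p * math.log(p) for p in probs if p > 0.0))


def frame_energy(frame):
    return sum(x * x for x in frame)


def entropy_energy_sqrt(frame):
    # Assumed combination: product of the two square roots.
    return math.sqrt(frame_energy(frame)) * math.sqrt(spectral_entropy(frame))


def is_voiced(frame, threshold=1.0):
    """Voiced if the combined measure reaches the (assumed) threshold."""
    return entropy_energy_sqrt(frame) >= threshold


n = 64
silence = [0.0] * n
# Harmonic test frame: three sinusoids at bins 4, 8 and 12.
harmonic = [
    math.sin(2 * math.pi * 4 * i / n)
    + 0.5 * math.sin(2 * math.pi * 8 * i / n)
    + 0.25 * math.sin(2 * math.pi * 12 * i / n)
    for i in range(n)
]
```

With these inputs, `is_voiced(harmonic)` is true and `is_voiced(silence)` is false. A real detector would use an FFT and a threshold tuned on data.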
-
Patent number: 11132998
Abstract: A voice recognition device includes: a first feature vector calculating unit (2) for calculating a first feature vector from voice data input; an acoustic likelihood calculating unit (4) for calculating an acoustic likelihood of the first feature vector by using an acoustic model used for calculating an acoustic likelihood of a feature vector; a second feature vector calculating unit (3) for calculating a second feature vector from the voice data; a noise degree calculating unit (6) for calculating a noise degree of the second feature vector by using a discriminant model used for calculating a noise degree indicating whether a feature vector is noise or voice; a noise likelihood recalculating unit (8) for recalculating an acoustic likelihood of noise on the basis of the acoustic likelihood of the first feature vector and the noise degree of the second feature vector; and a collation unit (9) for performing collation with a pattern of a vocabulary word to be recognized, by using the acoustic likelihood calcula
Type: Grant
Filed: March 24, 2017
Date of Patent: September 28, 2021
Assignee: MITSUBISHI ELECTRIC CORPORATION
Inventors: Toshiyuki Hanazawa, Tomohiro Narita
-
Patent number: 11122367
Abstract: In a method for controlling an audio system of a vehicle, an intention to communicate and/or a voice of at least one of a specific occupant of the vehicle and/or of an occupant on a specific seat of the vehicle, are/is sensed, and at least one audio signal of the vehicle is changed as a function of the sensed intention to communicate and/or the voice of the occupant.
Type: Grant
Filed: October 2, 2019
Date of Patent: September 14, 2021
Assignee: Bayerische Motoren Werke Aktiengesellschaft
Inventor: Alexander Augst
-
Patent number: 11114108
Abstract: A method includes extracting, from multiple microphone input, a hyperset of features of acoustic sources, using the extracted features to identify separable clusters associated with acoustic scenarios, and classifying subsequent input as one of the acoustic scenarios using the hyperset of features. The acoustic scenarios include a desired spatially moving/non-moving talker, and an undesired spatially moving/non-moving acoustic source. The hyperset of features includes both spatial and voice biometric features. The classified acoustic scenario may be used in a robotics application or voice assistant device desired speech enhancement or interference signal cancellation. Specifically, the classification of the acoustic scenarios can be used to adapt a beamformer, e.g., step size adjustment. The hyperset of features may also include visual biometric features extracted from one or more cameras viewing the acoustic sources.
Type: Grant
Filed: May 11, 2020
Date of Patent: September 7, 2021
Assignee: Cirrus Logic, Inc.
Inventors: Ghassan Maalouli, Samuel P. Ebenezer
-
Patent number: 11114089
Abstract: A method, system, and computer program product for applying a profile to an assistive device based on a multitude of cues includes: gathering audio inputs surrounding an assistive device; analyzing, by the assistive device, the audio inputs; determining, based on the analyzing, scenario cues; classifying a current environment surrounding the assistive device from the scenario cues; comparing the current environment to device profiles of the assistive device; determining, based on the comparing, a matching profile; and, in response to determining the matching profile, executing the matching profile on the assistive device.
Type: Grant
Filed: November 19, 2018
Date of Patent: September 7, 2021
Assignee: International Business Machines Corporation
Inventors: Matthew Chapman, Chengxuan Xing, Andrew J. Daniel, Ashley Harrison
-
Patent number: 11114093
Abstract: An intelligent voice recognition method, voice recognition apparatus and intelligent computing device are disclosed. An intelligent voice recognition method according to an embodiment of the present invention obtains a microphone detection signal, recognizes a voice of a user from the microphone detection signal and outputs a response related to the voice on the basis of a result of recognition of the voice, wherein the microphone detection signal includes noise, and a microphone detection signal including only the voice obtained by removing the noise from the microphone detection signal is recognized. Accordingly, only a voice of a user can be effectively separated from a microphone detection signal detected through a microphone of the voice recognition apparatus.
Type: Grant
Filed: August 29, 2019
Date of Patent: September 7, 2021
Assignee: LG ELECTRONICS INC.
Inventor: Wonchul Kim
-
Sound analysis apparatus, sound analysis method, and non-transitory computer readable storage medium
Patent number: 11094336
Abstract: A sound analysis apparatus includes a sound acquirer configured to acquire a sound signal, a measurer configured to output time-series data of numerical values representing volumes based on the sound signal, and a calculator configured to perform calculation for analyzing the time-series data output from the measurer, wherein the calculator performs the calculation in a case of a first state in which a measured value that is the numerical value output from the measurer is included within an analysis target range that is a range in which the measured value is determined to be an analysis target, and wherein the calculator does not perform the calculation in a case of a second state in which the measured value is not included within the analysis target range.
Type: Grant
Filed: August 15, 2019
Date of Patent: August 17, 2021
Assignee: Yokogawa Electric Corporation
Inventors: Yuko Ito, Hiroki Yoshino
-
Patent number: 11089404
Abstract: A sound processing apparatus includes n number of microphones that are disposed correspondingly to n number of persons and that mainly collect sound signals uttered by respective relevant persons, a filter that suppresses crosstalk components included in a talker sound signal collected by a microphone corresponding to at least one talker using the sound signals collected by the n number of microphones, a parameter updater that updates a parameter of the filter for suppressing the crosstalk components and stores an update result in the memory in a case where a predetermined condition including time at which at least one talker talks is satisfied, and a sound output controller that outputs the sound signals, acquired by subtracting the crosstalk components by the filter from the talker sound signals based on the update result, from a speaker.
Type: Grant
Filed: January 24, 2020
Date of Patent: August 10, 2021
Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
Inventors: Masanari Miyamoto, Hiromasa Ohashi, Naoya Tanaka
-
Patent number: 11074927
Abstract: A computer implemented method, computer system and computer program product are provided for acoustic event detection in polyphonic acoustic data. According to the method, polyphonic acoustic data is inputted by one or more processing units into a trained neural network trained by labeled monophonic acoustic data, a first output from a hidden layer of the trained neural network is obtained by one or more processing units, and at least one acoustic classification of the polyphonic acoustic data is determined by one or more processing units based on the first output and a feature dictionary learnt from the trained neural network.
Type: Grant
Filed: October 31, 2017
Date of Patent: July 27, 2021
Assignee: International Business Machines Corporation
Inventors: Xiao Xing Liang, Ning Zhang, Yu Ling Zheng, Yu Chen Zhou
-
Patent number: 11069352
Abstract: Described herein is a system for media presence detection in audio. The system analyzes audio data to recognize whether a given audio segment contains sounds from a media source as a way of differentiating recorded media source sounds from other live sounds. In exemplary embodiments, the system includes a hierarchical model architecture for processing audio data segments, where individual audio data segments are processed by a trained machine learning model operating locally, and another trained machine learning model provides historical and contextual information to determine a score indicating the likelihood that the audio data segment contains sounds from a media source.
Type: Grant
Filed: February 18, 2019
Date of Patent: July 20, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Qingming Tang, Ming Sun, Chieh-Chi Kao, Chao Wang, Viktor Rozgic
-
Patent number: 11069353
Abstract: A system and method performs multilingual wakeword detection by determining a language corresponding to the wakeword. A first wakeword-detection component, which may execute using a digital-signal processor, determines that audio data includes a representation of the wakeword and determines a language corresponding to the wakeword. A second, more accurate wakeword-detection component may then process the audio data using the language to confirm that it includes the representation of the wakeword. The audio data may then be sent to a remote system for further processing.
Type: Grant
Filed: May 6, 2019
Date of Patent: July 20, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Yixin Gao, Ming Sun, Jason Krone, Shiv Naga Prasad Vitaladevuni, Yuzong Liu
-
Patent number: 11062708
Abstract: A method and an apparatus for dialoguing based on a mood of a user, where the method includes: collecting first audio data from the user, determining the mood of the user according to a feature of the first audio data, and dialoguing with the user using second audio data corresponding to the mood of the user. The method and the apparatus for dialoguing based on the mood of the user provided by the present disclosure may make different responses according to the mood of the user when dialoguing with the user. This further enriches the responses that the electronic device may make according to voice data of the user, and further improves the user experience during dialoguing with the electronic device.
Type: Grant
Filed: July 12, 2019
Date of Patent: July 13, 2021
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Li Xu, Yingchao Li, Xiaoxin Ma
-
Patent number: 11036305
Abstract: There is provided a signal processing apparatus that includes a control unit that executes, on a basis of a waveform signal generated in accordance with a motion of an attachment portion of a sensor attached to a tool or a body, effect processing for the waveform signal or another waveform signal, the waveform signal being output from the sensor.
Type: Grant
Filed: June 4, 2020
Date of Patent: June 15, 2021
Assignee: SONY CORPORATION
Inventors: Heesoon Kim, Masaharu Yoshino, Tatsushi Nashida, Masahiko Inami, Kouta Minamizawa, Yuta Sugiura, Yusuke Mizushina
-
Patent number: 11031008
Abstract: A terminal device is provided. The terminal device includes a communication interface, and a processor configured to receive performance information of one or more other terminal devices from each of the one or more other terminal devices, identify an edge device to perform voice recognition based on the performance information received from each of the one or more other terminal devices, based on the terminal device being identified as the edge device, receive information associated with reception quality from one or more other terminal devices which receive a sound wave including a triggering word, determine, based on the received information associated with the reception quality, a terminal device from which to acquire the sound wave for voice recognition, and transmit, to the determined terminal device, a command to transmit the sound wave acquired for voice recognition to an external voice recognition device.
Type: Grant
Filed: April 10, 2019
Date of Patent: June 8, 2021
Assignee: Samsung Electronics Co., Ltd.
Inventor: Minseok Kim
-
Patent number: 11024331
Abstract: Systems and methods for optimizing voice detection via a network microphone device are disclosed herein. In one example, individual microphones of a network microphone device detect sound. The sound data is captured in a first buffer and analyzed to detect a trigger event. Metadata associated with the sound data is captured in a second buffer and provided to at least one network device to determine at least one characteristic of the detected sound based on the metadata. The network device provides a response that includes an instruction, based on the determined characteristic, to modify at least one performance parameter of the NMD. The NMD then modifies the at least one performance parameter based on the instruction.
Type: Grant
Filed: September 21, 2018
Date of Patent: June 1, 2021
Assignee: Sonos, Inc.
Inventors: Connor Kristopher Smith, Kurt Thomas Soto, Charles Conor Sleith
-
Patent number: 11024274
Abstract: Systems, devices, and methods for segmenting musical compositions are described. Discrete, musically-coherent segments (such as intro, verse, chorus, bridge, solo, and the like) of a musical composition are identified. Distance measures are used to evaluate whether each bar of a musical composition is more like the bars that directly precede it or more like the bars that directly succeed it, and each respective series of musically similar bars is assigned to the same respective segment. Large changes in the distance measure(s) between adjacent bars may be used to identify boundaries between abutting musical segments. Computer systems and computer program products for implementing segmentation are also described. The results of segmentation may advantageously be applied in computer-based composition of music and musical variations, as well as in other applications involving labelling, characterizing, or otherwise processing music.
Type: Grant
Filed: January 28, 2020
Date of Patent: June 1, 2021
Assignee: Obeebo Labs Ltd.
Inventor: Colin P. Williams
-
Patent number: 10997967
Abstract: A method for initializing a device for performing acoustic speech recognition (ASR) using an ASR model, by a computer system including at least one processor and a system memory element. The method includes obtaining a plurality of voice data articulations of predetermined phrases, by the at least one processor via a user interface. The plurality of voice data articulations includes a first quantity of audio samples of actual articulated voice data, and each of the plurality of voice data articulations includes one of the audio samples including acoustic frequency components. The method further includes performing a plurality of augmentations to the plurality of voice data articulations of predetermined phrases, to generate a corpus audio data set that includes the first quantity of audio samples and a second quantity of audio samples including augmented versions of the first quantity of audio samples.
Type: Grant
Filed: April 18, 2019
Date of Patent: May 4, 2021
Assignee: HONEYWELL INTERNATIONAL INC.
Inventors: Luning Wang, Wei Yang, Zhiyong Dai
-
Patent number: 10972834
Abstract: This disclosure describes techniques for detecting voice commands from a user of an ear-based device. The ear-based device may include an in-ear facing microphone to capture sound emitted in an ear of the user, and an exterior facing microphone to capture sound emitted in an exterior environment of the user. The in-ear microphone may generate an inner audio signal representing the sound emitted in the ear, and the exterior microphone may generate an outer audio signal representing sound from the exterior environment. The ear-based device may compute a ratio of a power of the inner audio signal to the outer audio signal and may compare this ratio to a threshold. If the ratio is larger than the threshold, the ear-based device may detect the voice of the user. Further, the ear-based device may set a value of the threshold based on a level of acoustic seal of the ear-based device.
Type: Grant
Filed: February 11, 2020
Date of Patent: April 6, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Kuan-Chieh Yen, Daniel Wayne Harris, Carlo Murgia, Taro Kimura
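Illustrative sketch (not taken from the patent): the power-ratio test described in the abstract of patent 10972834 might look roughly like the Python below. The function names, the linear mapping from seal level to threshold, and the default threshold values are all hypothetical; the abstract only says that the threshold depends on the level of acoustic seal, not how.

```python
def mean_power(samples):
    """Average power of a block of audio samples (mean of squares)."""
    return sum(s * s for s in samples) / len(samples)


def threshold_for_seal(seal_level, low=1.0, high=4.0):
    """Hypothetical mapping from seal level in [0, 1] to a ratio threshold.

    Assumption: a simple linear interpolation between `low` and `high`.
    The abstract does not state the actual relationship.
    """
    seal_level = min(max(seal_level, 0.0), 1.0)  # clamp to [0, 1]
    return low + (high - low) * seal_level


def detect_user_voice(inner, outer, seal_level, eps=1e-12):
    """Return True when the inner/outer power ratio exceeds the threshold."""
    ratio = mean_power(inner) / (mean_power(outer) + eps)  # eps avoids division by zero
    return ratio > threshold_for_seal(seal_level)
```

For example, `detect_user_voice([0.5] * 100, [0.1] * 100, seal_level=0.5)` compares a power ratio of about 25 against a threshold of 2.5 under these assumed defaults.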
-
Patent number: 10956734
Abstract: An electronic device and a method of operating the electronic device are provided. The electronic device includes a proximity detector; an iris recognition module; a memory; and a processor electrically connected to the proximity detector, the iris recognition module, and the memory, wherein the processor is configured to execute an iris recognition operation based on the iris recognition module; determine proximity of an object based on the proximity detector while the iris recognition operation is performed; and, if the proximity of the object is within a set reference range, stop the iris recognition operation.
Type: Grant
Filed: July 10, 2017
Date of Patent: March 23, 2021
Inventors: Hyung-Woo Shin, Hyemi Lee, Hyung Min Lee
-
Patent number: 10958468
Abstract: A portable acoustic unit is adapted for insertion into an electrical receptacle. The portable acoustic unit has an integrated microphone and a wireless network interface to an automation controller. The portable acoustic unit detects spoken voice commands from users in the vicinity of the electrical receptacle. The portable acoustic unit merely plugs into a conventional electrical outlet to provide an extremely simple means of voice control through a home or business.
Type: Grant
Filed: August 29, 2018
Date of Patent: March 23, 2021
Assignee: AT&T INTELLECTUAL PROPERTY I, L. P.
Inventors: Nafiz Haider, Ross Newman, Kristin Patterson, Thomas Risley, Curtis Stephenson, David Vaught
-
Patent number: 10950243
Abstract: A system and method for improving T-matrix training for speaker recognition are provided. The method includes receiving an audio input, divisible into a plurality of audio frames, wherein at least a first audio frame includes an audio sample of a human speaker, the sample having a length above a first threshold; generating for each audio frame a feature vector; generating for a first plurality of feature vectors centered statistics of at least a zero order and a first order; generating a first i-vector, the first i-vector representing the human speaker; generating an optimized T-matrix training sequence computation, based on the first i-vector, an initialized T-matrix, the centered statistics, and a Gaussian mixture model (GMM) of a trained universal background model (UBM).
Type: Grant
Filed: March 1, 2019
Date of Patent: March 16, 2021
Assignee: ILLUMA Labs Inc.
Inventor: Milind Borkar
-
Patent number: 10951996
Abstract: A binaural hearing system includes a first hearing device and a second hearing device, each of which comprising: an input transducer; a transducer audio signal processor configured to provide a processed input transducer audio signal; an ear canal microphone; an ear canal audio signal processor configured to provide a processed ear canal audio signal; a first signal combiner configured to combine the processed input transducer audio signal with the processed ear canal audio signal to obtain an output transducer audio signal; a signal level detector configured to determine a signal level of (1) the output transducer audio signal or (2) an audio signal included in formation of the output transducer audio signal; and an output transducer; wherein the binaural hearing system further comprises a binaural excessive level detector connected to the first hearing device's signal level detector and the second hearing device's signal level detector.
Type: Grant
Filed: June 17, 2019
Date of Patent: March 16, 2021
Assignee: GN Hearing A/S
Inventors: Søren Christian Voigt Pedersen, Jonathan Boley, James Robert Anderson
-
Patent number: 10943598
Abstract: Methods and systems for determining periods of excessive noise for smart speaker voice commands. An electronic timeline of volume levels of currently playing content is made available to a smart speaker. From this timeline, periods of high content volume are determined, and the smart speaker alerts users during periods of high volume, requesting that they wait until the high-volume period has passed before issuing voice commands. In this manner, the smart speaker helps prevent voice commands that may not be detected, or may be detected inaccurately, due to the noise of the content currently being played.
Type: Grant
Filed: March 18, 2019
Date of Patent: March 9, 2021
Assignee: ROVI GUIDES, INC.
Inventors: Gyanveer Singh, Sukanya Agarwal, Vikram Makam Gupta
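Illustrative sketch (not taken from the patent): one way to derive high-volume periods from a volume timeline, as the abstract of patent 10943598 outlines, is a single pass over time-stamped volume samples. The timeline format (a sorted list of `(time_sec, volume)` pairs), the fixed threshold, and both function names are assumptions made for this example.

```python
def high_volume_periods(timeline, threshold):
    """Return (start, end) time pairs where volume stays at or above threshold.

    `timeline` is assumed to be a list of (time_sec, volume) pairs sorted by time.
    """
    periods = []
    start = None
    last_time = None
    for time_sec, volume in timeline:
        if volume >= threshold and start is None:
            start = time_sec                    # a loud period begins
        elif volume < threshold and start is not None:
            periods.append((start, time_sec))   # the loud period ends here
            start = None
        last_time = time_sec
    if start is not None:
        periods.append((start, last_time))      # still loud at the end of the timeline
    return periods


def should_ask_user_to_wait(periods, now):
    """True if `now` falls inside any high-volume period."""
    return any(start <= now < end for start, end in periods)
```

With the timeline `[(0, 40), (10, 85), (20, 90), (30, 50)]` and a threshold of 80, the only period found is `(10, 30)`, so a command attempted at time 15 would trigger the request to wait.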
-
Patent number: 10937448
Abstract: A voice activity detection method and an apparatus are provided by embodiments of the present application. The method includes: performing framing processing on a voice to be detected to obtain a plurality of audio frames to be detected; obtaining an acoustic feature of each of the audio frames to be detected, and sequentially inputting the acoustic feature of each of the audio frames to be detected to a VAD model, wherein the VAD model is configured to classify the first N voice frames in the voice to be detected as noise frames, classify frames from an (N+1)-th voice frame to a last voice frame as voice frames, and classify M noise frames after the last voice frame as voice frames, where N and M are integers; and determining, according to a classification result output by the VAD model.
Type: Grant
Filed: December 27, 2018
Date of Patent: March 2, 2021
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Chao Li, Weixin Zhu
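Illustrative sketch (not taken from the patent): the frame classification that the abstract of patent 10937448 attributes to the VAD model can be written as a relabelling of per-frame ground-truth labels. This Python example assumes a single utterance whose frames are labelled 1 (voice) or 0 (noise); the function name and the handling of utterances with N or fewer voice frames are hypothetical choices, not details from the abstract.

```python
def relabel_frames(labels, n, m):
    """Shift the voice region: delay its start by n voice frames, extend its end by m frames.

    `labels` holds 1 for voice frames and 0 for noise frames of one utterance.
    """
    out = list(labels)
    voice_idx = [i for i, label in enumerate(labels) if label == 1]
    if not voice_idx:
        return out                      # no voice at all: nothing to relabel
    for i in voice_idx[:n]:
        out[i] = 0                      # first n voice frames are treated as noise
    if len(voice_idx) <= n:
        return out                      # assumption: too short to keep any voice region
    last = voice_idx[-1]
    for i in range(voice_idx[n], last + 1):
        out[i] = 1                      # (n+1)-th voice frame through the last voice frame
    for i in range(last + 1, min(last + 1 + m, len(out))):
        out[i] = 1                      # m noise frames after the last voice frame
    return out
```

For instance, with `n=2` and `m=3`, the labels `[0, 0, 1, 1, 1, 1, 0, 0, 0, 0]` become `[0, 0, 0, 0, 1, 1, 1, 1, 1, 0]`.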
-
Patent number: 10917717
Abstract: Gain mismatch and related problems can be solved by a system and method that applies an automatic microphone signal gain equalization without any direct absolute reference or calibration phase. The system and method performs the steps of receiving, by a computing device, a speech signal from a speaking person via a plurality of microphones, determining a speech signal component in the time-frequency domain for each microphone of the plurality of microphones, calculating an instantaneous cross-talk coupling matrix based on the speech signal components across the microphones, estimating gain factors based on calculated cross-talk couplings and a given expected cross-talk attenuation, limiting the gain factors to appropriate maximum and minimum values, and applying the gain factors to the speech signal used in the control path to control further speech enhancement algorithms or used in the signal path for direct influence on the speech enhanced audio output signal.
Type: Grant
Filed: May 30, 2019
Date of Patent: February 9, 2021
Assignee: Nuance Communications, Inc.
Inventors: Timo Matheja, Markus Buck
-
Patent number: 10903863
Abstract: A first set of signal data is received. Generative machine learning models are trained based on the first set of signal data. The generative machine learning models include at least a first model trained to identify a first signal component and a second model trained to identify a second signal component. An incoming mixed signal data stream is dynamically separated into a clean signal component and a noise signal component by running the generative machine learning models.
Type: Grant
Filed: December 11, 2019
Date of Patent: January 26, 2021
Assignee: International Business Machines Corporation
Inventors: Francois Pierre Luus, Etienne Eben Vos, Komminist Weldemariam
-
Patent number: 10885902
Abstract: Techniques are described for using stenography to protect sensitive information within conversational audio data by generating a pseudo-language representation of conversational audio data. In some implementations, audio data corresponding to an utterance is received. The audio data is classified as likely sensitive audio data. A particular set of sentiments associated with the audio data is determined. Data indicating the particular set of sentiments associated with the audio data is provided to a model. The model is trained to output, for each of different sets of sentiments, desensitized, pseudo-language audio data that exhibits the set of sentiments, and is not classified as likely sensitive audio data. A particular desensitized, pseudo-language audio data is received from the model. The audio data is replaced with the particular desensitized, pseudo-language audio data and stored within an audio data repository.
Type: Grant
Filed: November 21, 2018
Date of Patent: January 5, 2021
Assignee: X Development LLC
Inventors: Antonio Raymond Papania-Davis, Bin Ni, Shelby Lin