Detect Speech In Noise Patents (Class 704/233)
-
Patent number: 10331403
Abstract: An audio input system includes an audio input apparatus and a plurality of electronic devices. The audio input apparatus includes an audio input unit, a start instruction accepting unit, and an instruction transmitter. The plurality of electronic devices includes a first electronic device. The first electronic device is a target of a process execution instruction based on the audio input by the audio input unit among the instructions. The instruction transmitter transmits a process reducing instruction for reducing an execution of a process as the instruction to the plurality of electronic devices, when an input sound volume of a microphone exceeds a specific sound volume. The audio input unit starts the audio input when the input sound volume of the microphone is equal to or less than the specific sound volume. The instruction transmitter transmits the process execution instruction to the first electronic device.
Type: Grant
Filed: February 28, 2018
Date of Patent: June 25, 2019
Assignee: Kyocera Document Solutions Inc.
Inventor: Kazuki Dozen
-
Patent number: 10325612
Abstract: A method and apparatus that filters audio data received from a speaking person that includes a specific filter for that speaker. The audio characteristics of the speaker's voice may be collected and the specific filter may be formed to reduce noise while also enhancing voice quality. For instance, if a speaker's voice does not contain specific frequencies, then a filter may cancel the noise at such frequencies to ease noise cancellation and reduce processing sound spectrum for cleaning that is not needed. Additionally, the strength frequencies of a speaker's voice may be identified from the collected audio characteristics and those spectrums can be filtered with finer granularity to provide a speaker specific filter that enhances the voice quality of the speaker's voice data that is transmitted or output by a communication device. The audio data may also be output based upon a user's predefined hearing spectrum.
Type: Grant
Filed: August 1, 2017
Date of Patent: June 18, 2019
Assignee: Unify GmbH & Co. KG
Inventors: Bizhan Karimi-Cherkandi, Farrokh Mohammadzadeh Kouchri, Schah Walli Ali
-
Patent number: 10319377
Abstract: A method and system are provided for estimating clean speech parameters from noisy speech parameters. The method is performed by acquiring speech signals, estimating noise from the acquired speech signals, computing speech features from the acquired speech signals, estimating model parameters from the computed speech features and estimating clean parameters from the estimated noise and the estimated model parameters.
Type: Grant
Filed: February 28, 2017
Date of Patent: June 11, 2019
Assignee: Tata Consultancy Services Limited
Inventors: Ashish Panda, Sunil Kumar Kopparapu
-
Patent number: 10319393
Abstract: When an instruction to start voice input is received from the user, a gain controller acquires, from a gain table which defines a correspondence between vehicle speed ranges and gains, a gain corresponding to a vehicle speed range including the vehicle speed of a vehicle detected by a vehicle speed detector, and sets the acquired gain as the gain of an input amplifier that amplifies an input audio signal output by a microphone. As a gain corresponding to each vehicle speed range, the gain table records a gain of the input amplifier corresponding, in an experimentally determined frequency distribution of peak values in the vehicle speed range, to a maximum frequency in the range of magnitude of voice output as an input audio signal by the microphone and to be input to a speech recognition engine as voice having a magnitude within the input range of the speech recognition engine.
Type: Grant
Filed: July 27, 2016
Date of Patent: June 11, 2019
Assignee: ALPINE ELECTRONICS, INC.
Inventors: Hirokazu Suzuki, Toru Marumoto
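
The speed-range lookup described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the table values, the km/h unit, and the names `GAIN_TABLE` and `gain_for_speed` are hypothetical, chosen only to show how a gain could be read from a table keyed by vehicle speed range.

```python
from bisect import bisect_right

# Hypothetical gain table: (exclusive upper bound of the speed range in km/h,
# amplifier gain in dB). The numbers are illustrative, not from the patent.
GAIN_TABLE = [(20, 0.0), (60, 3.0), (100, 6.0), (float("inf"), 9.0)]


def gain_for_speed(speed_kmh, table=GAIN_TABLE):
    """Return the gain recorded for the speed range that contains speed_kmh."""
    if speed_kmh < 0:
        raise ValueError("speed must be non-negative")
    bounds = [upper for upper, _ in table]
    # bisect_right returns the index of the first range whose upper bound
    # is strictly greater than the speed; clamp it to the last range.
    index = min(bisect_right(bounds, speed_kmh), len(table) - 1)
    return table[index][1]
```

In such a design the lookup would run once, when the user asks to start voice input, and the returned value would be applied to the input amplifier before audio reaches the recognizer.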
-
Patent number: 10319373
Abstract: An information processing device includes a phonetic converting unit, an HMM converting unit, and a searching unit. The phonetic converting unit converts a phonetic symbol sequence into a hidden Markov model (HMM) state sequence in which states of an HMM are aligned. The HMM converting unit converts the HMM state sequence into a score vector sequence indicating the degree of similarity to a specific pronunciation using a similarity matrix defining the similarity between the states of the HMM. The searching unit searches for a path having a better score for the score vector sequence than that of the other paths out of paths included in a search network and outputs a phonetic symbol sequence corresponding to the retrieved path.
Type: Grant
Filed: December 23, 2016
Date of Patent: June 11, 2019
Assignee: Kabushiki Kaisha Toshiba
Inventor: Manabu Nagao
-
Patent number: 10297251
Abstract: An automatic speech recognition system for a vehicle includes a controller configured to select an acoustic model from a library of acoustic models based on ambient noise in a cabin of the vehicle and operating parameters of the vehicle. The controller is further configured to apply the selected acoustic model to noisy speech to improve recognition of the speech.
Type: Grant
Filed: January 21, 2016
Date of Patent: May 21, 2019
Assignee: Ford Global Technologies, LLC
Inventors: Ali Hassani, Scott Andrew Amman, Francois Charette, Brigitte Frances Mora Richardson, Gintaras Vincent Puskorius, An Ji, Ranjani Rangarajan, John Edward Huber
-
Patent number: 10297283
Abstract: A computer-readable medium, controller and a method of automatically recording a sound signal is provided. A sound signal is received by the controller from a sound generating device. A frequency of the received sound signal is determined by the controller. When the determined frequency is within a predetermined frequency range, the controller starts recording the received sound signal.
Type: Grant
Filed: December 30, 2014
Date of Patent: May 21, 2019
Assignee: Gibson Brands, Inc.
Inventor: Shota Terai
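
A frequency-gated recording trigger of this kind can be sketched as follows. The abstract does not say how the frequency is determined; counting rising zero crossings is an assumption made here for simplicity, and the function names and the 80 to 1200 Hz range are hypothetical.

```python
def estimate_frequency(samples, sample_rate):
    """Estimate the dominant frequency in Hz by counting rising zero crossings.

    This is a crude estimator that only suits clean, roughly periodic input.
    """
    if not samples or sample_rate <= 0:
        return 0.0
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    duration_s = len(samples) / sample_rate
    return crossings / duration_s


def should_start_recording(samples, sample_rate, low_hz=80.0, high_hz=1200.0):
    """Return True when the estimated frequency lies in [low_hz, high_hz]."""
    frequency = estimate_frequency(samples, sample_rate)
    return low_hz <= frequency <= high_hz
```

A caller might feed short blocks from the sound generating device into `should_start_recording` and begin writing audio to storage on the first block for which it returns `True`.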
-
Patent number: 10290294
Abstract: An information handling system includes a processor configured to operate in one of a plurality of power states. An audio circuit measures an ambient audio environment within the information handling system, classifies the measured ambient audio into one of a plurality of categories, and implements a power management policy for the processor in response to the measured ambient audio being classified into the one of the categories.
Type: Grant
Filed: November 9, 2017
Date of Patent: May 14, 2019
Assignee: Dell Products, LP
Inventors: Ray V. Kacelenga, Merle J. Wood, III, Travis C. North
-
Patent number: 10284724
Abstract: According to embodiments of the present invention, various computer implemented methods are provided for generating context sensitive alerts for the purposes of fraud detection and prevention. According to one embodiment, a communication (e.g., a call) is received at (or initiated by) a workstation by a human representative or agent. Verbal (voice) communication between the agent and the third party customer is monitored and converted into text. The agent's activity in applications executing in the workstation is also tracked. The combination of converted text and the activity of the agent is evaluated in a behavior engine to detect inconsistencies between the activity of the agent and authorized or typical activity. When inconsistencies are detected, security actions are performed to alert administrators of potential fraud and, in extreme cases, further action by the agent may be prevented in the workstation to prevent additional fraud.
Type: Grant
Filed: April 5, 2017
Date of Patent: May 7, 2019
Assignee: TELEPERFORMANCE SE
Inventor: Lyle Hardy
-
Patent number: 10276155
Abstract: A user media device may include a microphone array and a communication interface. The microphone array may include an omnidirectional microphone and a directional microphone. The microphone array may be selectively switchable. The communication interface may communicatively couple the user media device with a computer and may transmit audio captured by the microphone array to the computer for transfer to a remote service. The remote service may generate text of the processed audio via natural language processing. The remote service may further perform semantic reasoning of the processed audio via a semantic reasoning engine. The remote service may also generate curated content based at least in part on the semantic reasoning performed on the processed audio. The curated content may include a report having results of the semantic reasoning organized to demonstrate the results in a meaningful way with respect to the processed audio.
Type: Grant
Filed: December 22, 2016
Date of Patent: April 30, 2019
Assignee: FUJITSU LIMITED
Inventor: James Montantes
-
Patent number: 10264352
Abstract: Techniques herein provide wireless energy transfer to audio devices such as headphones, headsets, hearing aids, and the like. Audio devices are integrated with a device resonator. The device resonator may be positioned and oriented to reduce interaction with lossy or sensitive components of the audio device. A repeater resonator and/or a source resonator is integrated into a headrest of a seat or a chair providing continuous power to the headphones while in use. The audio devices may be recharged wirelessly when positioned near source resonators that may be embedded in pads, tables, carrying cases, cups, and the like.
Type: Grant
Filed: January 9, 2017
Date of Patent: April 16, 2019
Assignee: WiTricity Corporation
Inventors: Steven J. Ganem, Hiroshi A. Mendoza, Morris P. Kesler, Konrad J. Kulikowski, Andre B. Kurs, Alexander P. McCauley, Eric R. Giler, Katherine L. Hall, Gozde Guckaya
-
Patent number: 10255487
Abstract: A speech determiner determines whether or not a target individual is speaking when facial images of the target individual are captured. An emotion estimator estimates the emotion of the target individual using the facial images of the target individual, on the basis of the determination results of the speech determiner.
Type: Grant
Filed: September 26, 2016
Date of Patent: April 9, 2019
Assignee: CASIO COMPUTER CO., LTD.
Inventors: Takashi Yamaya, Kouichi Nakagome, Katsuhiko Satoh
-
Patent number: 10249299
Abstract: Techniques for tailoring beamforming techniques to environments such that processing resources may be devoted to a portion of an audio signal corresponding to a lobe of a beampattern that is most likely to contain user speech. The techniques take into account both acoustic characteristics of an environment and heuristics regarding lobes that have previously been found to include user speech.
Type: Grant
Filed: May 1, 2017
Date of Patent: April 2, 2019
Assignee: Amazon Technologies, Inc.
Inventors: Gregory Michael Hart, Kavitha Velusamy, William Spencer Worley, III
-
Patent number: 10250960
Abstract: A sound reproduction device includes a signal processing chain configured to render an acoustic useful signal for reproduction to a listener, a simulation scenario processor configured to provide auditory scenario information for a simulated auditory scenario, the simulated auditory scenario influencing perception, by the listener, of the reproduction of the useful signal and/or defining a useful signal type, a user interface configured to detect reproduction parameter settings from a user which represent an individual preference of the listener in view of the simulated auditory scenario, a signal modifier configured to receive the reproduction parameter settings and modify reproduction of the useful signal in dependence on the reproduction parameter settings, and a storage provided for storing the reproduction parameter setting and the auditory scenario information relative to one another.
Type: Grant
Filed: May 12, 2016
Date of Patent: April 2, 2019
Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
Inventors: Jens Ekkehart Appell, Jan Rennies-Hochmuth
-
Patent number: 10250975
Abstract: A hands-free audio device has a receive path and a transmit path, which may operate at different audio sampling rates. The transmit path has an interference suppressor that receives a reference signal from the receive path and that suppresses interference in microphone signals received from a microphone array. The interference suppressor is followed in the transmit path by a multi-channel adaptive beamformer that produces a plurality of directional audio signals. A beam selector is configured to select one of the directional audio signals based on voice activity, echo detection, and signal energy.
Type: Grant
Filed: October 6, 2017
Date of Patent: April 2, 2019
Assignee: Amazon Technologies, Inc.
Inventor: Jun Yang
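
The beam selection step can be sketched under stated assumptions. The abstract names the three inputs (voice activity, echo detection, signal energy) but not how they are combined; the priority order below, the fallback rules, and the `Beam` and `select_beam` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Beam:
    index: int            # position of the beam in the beamformer output
    voice_active: bool    # voice activity detected in this beam
    echo_detected: bool   # loudspeaker echo detected in this beam
    energy: float         # short-term signal energy


def select_beam(beams):
    """Pick one beam: voiced and echo-free first, then highest energy."""
    if not beams:
        return None
    candidates = [b for b in beams if b.voice_active and not b.echo_detected]
    if not candidates:
        # Assumed fallback: any voiced beam, otherwise every beam.
        candidates = [b for b in beams if b.voice_active] or list(beams)
    return max(candidates, key=lambda b: b.energy)
```

Treating echo as a hard exclusion rather than a weighted penalty is one possible design choice; a real device would more likely smooth the decision over time to avoid switching beams on every frame.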
-
Patent number: 10242677
Abstract: Various implementations disclosed herein include a training module configured to determine a set of detection normalization threshold values associated with speaker dependent voiced sound pattern (VSP) detection. In some implementations, a method includes obtaining segment templates characterizing a concurrent segmentation of a first subset of a plurality of vocalization instances of a VSP, each segment template provides a stochastic characterization of how a particular portion of the VSP is vocalized by a particular speaker; generating a noisy segment matrix using a second subset of the plurality of vocalization instances of the VSP, wherein the noisy segment matrix includes one or more noisy copies of segment representations of the second subset; scoring segments from the noisy segment matrix against the segment templates; and determining detection normalization threshold values at two or more known SNR levels for at least one particular noise type based on a function of the scoring.
Type: Grant
Filed: August 25, 2015
Date of Patent: March 26, 2019
Assignee: MALASPINA LABS (BARBADOS), INC.
Inventor: Alexander Escott
-
Patent number: 10244113
Abstract: Methods and apparatuses are described for determining customer service quality through digitized voice characteristic measurement and filtering. A voice analysis module captures a first digitized voice segment corresponding to speech submitted by a user of a remote device. The voice analysis module extracts a first set of voice features from the first voice segment, and determines an emotion level of the user based upon the first set of voice features. The voice analysis module captures a second digitized voice segment corresponding to speech submitted by the user. The voice analysis module extracts a second set of voice features from the second voice segment, and determines a change in the emotion level of the user by comparing the first set of voice features to the second set of voice features. The module normalizes the change in the emotion level of the user using emotion influence factors, and generates a service score.
Type: Grant
Filed: April 26, 2016
Date of Patent: March 26, 2019
Assignee: FMR LLC
Inventors: Jason Kao, Xinxin Sheng, Bahram Omidfar, Erkang Zheng
-
Patent number: 10235995
Abstract: A user media device may include a microphone array and a communication interface. The microphone array may include an omnidirectional microphone and a directional microphone. The microphone array may be selectively switchable. The communication interface may communicatively couple the user media device with a computer and may transmit audio captured by the microphone array to the computer for transfer to a remote service. The remote service may generate text of the processed audio via natural language processing. The remote service may further perform semantic reasoning of the processed audio via a semantic reasoning engine. The remote service may also generate curated content based at least in part on the semantic reasoning performed on the processed audio. The curated content may include a report having results of the semantic reasoning organized to demonstrate the results in a meaningful way with respect to the processed audio.
Type: Grant
Filed: December 22, 2016
Date of Patent: March 19, 2019
Assignee: FUJITSU LIMITED
Inventor: James Montantes
-
Patent number: 10235128
Abstract: An embodiment of a contextual sound apparatus may include a sound identifier to identify a sound, a context identifier to identify a context, and an action identifier communicatively coupled to the sound identifier and the context identifier to identify an action based on the identified sound and the identified context. Other embodiments are disclosed and claimed.
Type: Grant
Filed: May 19, 2017
Date of Patent: March 19, 2019
Assignee: Intel Corporation
Inventors: Robert L. Vaughn, James B. Eynard
-
Patent number: 10231070
Abstract: A voice input exception determining method, an apparatus, a terminal, and a storage medium are provided. The method is applied to an electronic device including an audio collection module, and includes determining whether an amplitude value of an audio signal collected by the audio collection module is less than a preset amplitude threshold and/or whether energy distribution of the audio signal meets a preset condition; and if the amplitude value of the audio signal is less than the preset amplitude threshold and/or the energy distribution of the audio signal does not meet the preset condition, determining that voice input of the electronic device is abnormal. A solution provided in the present disclosure resolves a problem that there is no effective method for determining a sound reception exception caused when a sound reception hole of the electronic device is blocked.
Type: Grant
Filed: April 29, 2016
Date of Patent: March 12, 2019
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Lin Yang, Zhaoyang Yin, Jingwen Yang
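
The two checks in this abstract can be sketched as a single predicate. The abstract does not define the "preset condition" on energy distribution; the sketch assumes that a blocked sound hole mostly removes high-frequency content and uses the energy of the first difference of the signal as a rough proxy for it. The thresholds and the name `is_voice_input_abnormal` are hypothetical.

```python
def is_voice_input_abnormal(samples, amp_threshold=0.05, min_high_ratio=0.01):
    """Flag the input as abnormal if it is too quiet or too low-pass.

    samples: audio samples normalized to the range [-1.0, 1.0].
    """
    if not samples:
        return True
    # Check 1: peak amplitude below the preset amplitude threshold.
    peak = max(abs(s) for s in samples)
    # Check 2 (assumed condition): the share of energy left after a
    # first-difference filter, which emphasizes high frequencies.
    total_energy = sum(s * s for s in samples)
    diff_energy = sum((b - a) ** 2 for a, b in zip(samples, samples[1:]))
    high_ratio = diff_energy / total_energy if total_energy > 0 else 0.0
    return peak < amp_threshold or high_ratio < min_high_ratio
```

The sketch uses the "or" reading of the abstract's "and/or"; requiring both conditions would make the detector less sensitive but less prone to false alarms during quiet speech.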
-
Patent number: 10229686
Abstract: Methods and apparatus to process microphone signals by a speech enhancement module to generate an audio stream signal including first and second metadata for use by a speech recognition module. In an embodiment, speech recognition is performed using endpointing information including transitioning from a silence state to a maybe speech state, in which data is buffered, based on the first metadata and transitioning to a speech state, in which speech recognition is performed, based upon the second metadata.
Type: Grant
Filed: August 18, 2014
Date of Patent: March 12, 2019
Assignee: NUANCE COMMUNICATIONS, INC.
Inventors: Markus Buck, Tobias Herbig, Simon Graf, Christophe Ris
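
The three-state endpointing flow can be sketched as a small state machine. Only the forward transitions (silence to maybe speech on the first metadata, maybe speech to speech on the second) come from the abstract; treating both metadata items as per-frame booleans, falling back to silence when the first flag clears, and the `Endpointer` class itself are assumptions for illustration.

```python
SILENCE, MAYBE_SPEECH, SPEECH = "silence", "maybe_speech", "speech"


class Endpointer:
    """Buffers frames while speech is only suspected, releases them once confirmed."""

    def __init__(self):
        self.state = SILENCE
        self.buffer = []

    def process(self, frame, first_flag, second_flag):
        """Return the list of frames to hand to the recognizer for this step."""
        if self.state == SILENCE:
            if not first_flag:
                return []
            self.state = MAYBE_SPEECH
            self.buffer = []
        if self.state == MAYBE_SPEECH:
            if not first_flag:
                # Assumed fallback: the suspicion was not confirmed, drop the buffer.
                self.state = SILENCE
                self.buffer = []
                return []
            self.buffer.append(frame)
            if second_flag:
                self.state = SPEECH
                released, self.buffer = self.buffer, []
                return released
            return []
        # SPEECH state: pass frames through until the first flag clears.
        if not first_flag:
            self.state = SILENCE
            return []
        return [frame]
```

Buffering in the intermediate state means the recognizer still receives the onset of the utterance, which would otherwise be lost while waiting for the more reliable second indicator.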
-
Patent number: 10222447
Abstract: A method for locating a sound source by maximizing a directed response strength calculated for a plurality of vectors of the interauricular time differences forming a set comprises: a first subset of vectors compatible with sound signals from a single sound source at an unlimited distance from the microphones; and a second subset of vectors not compatible with sound signals from a single sound source at an unlimited distance from the microphones. Each vector of the first subset is associated with a direction for locating the corresponding single sound source, and each vector of the second subset is associated with the locating direction of a vector of the first subset closest thereto according to a predefined metric. A humanoid robot including: a set of at least three microphones, arranged on an upper surface of the head thereof; and a processor for implementing one such method is provided.
Type: Grant
Filed: September 29, 2014
Date of Patent: March 5, 2019
Assignee: SOFTBANK ROBOTICS EUROPE
Inventor: Grégory Rump
-
Patent number: 10217477
Abstract: An electronic device and a speech recognition method that is capable of adjusting an end-of-utterance detection period dynamically are disclosed. The electronic device includes a microphone, a display, an input device formed as a part of the display or connected to the electronic device as a separate device, a processor electrically connected to the microphone, the display, and the input device, and a memory electrically connected to the processor. The memory stores instructions, executable by the processor, for receiving an utterance input by a user through the microphone, converting the utterance to text comprised of a series of words or phrases with spaces, displaying the text on the display, the text comprising at least one space formed at an incorrect position, and receiving a user input for updating a predetermined time period through the input device.
Type: Grant
Filed: October 31, 2016
Date of Patent: February 26, 2019
Assignee: Samsung Electronics Co., Ltd.
Inventors: Sungwoon Jang, Sangwook Shin, Sungwan Youn
-
Patent number: 10210857
Abstract: A method of controlling an audio system comprises: receiving an audio signal, and applying a first gain to the audio signal and outputting an amplified audio signal. On receiving a user input to increase the first gain applied to the audio signal, if the first gain is at a first threshold value, the method comprises: receiving an ambient noise signal, processing the ambient noise signal with a second gain value and outputting a noise cancellation signal, and changing the second gain value in response to the user input.
Type: Grant
Filed: October 18, 2017
Date of Patent: February 19, 2019
Assignee: Cirrus Logic, Inc.
Inventors: Nigel Burgess, Mark Allan Watts, Darren Holding
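
The control logic in this abstract can be sketched as follows. The abstract only says the second gain is "changed" once the first gain is at its threshold; the sketch assumes it is increased, so that further volume-up presses strengthen noise cancellation instead. The class name, the dB units, and the default limits are hypothetical.

```python
class VolumeController:
    """Routes volume-up requests to playback gain first, then to noise-cancellation gain."""

    def __init__(self, max_playback_gain_db=0, max_anc_gain_db=12):
        self.playback_gain_db = -20   # first gain, applied to the audio signal
        self.anc_gain_db = 0          # second gain, applied to the ambient noise signal
        self.max_playback_gain_db = max_playback_gain_db
        self.max_anc_gain_db = max_anc_gain_db

    def volume_up(self, step_db=2):
        if self.playback_gain_db < self.max_playback_gain_db:
            # Normal case: raise the playback gain, clamped to its threshold.
            self.playback_gain_db = min(
                self.playback_gain_db + step_db, self.max_playback_gain_db
            )
        else:
            # Playback gain is already at its threshold: adjust the
            # noise-cancellation gain instead (assumed to be an increase).
            self.anc_gain_db = min(self.anc_gain_db + step_db, self.max_anc_gain_db)
```

From the listener's point of view the same button keeps improving audibility: once the signal cannot get louder, the background gets quieter.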
-
Patent number: 10204619
Abstract: Methods, systems, and apparatus are described that receive audio data for an utterance. Association data is accessed that indicates associations between data corresponding to uncorrupted audio segments, and data corresponding to corrupted versions of the uncorrupted audio segments, where the associations are determined before receiving the audio data for the utterance. Using the association data and the received audio data for the utterance, data corresponding to at least one uncorrupted audio segment is selected. A transcription of the utterance is determined based on the selected data corresponding to the at least one uncorrupted audio segment.
Type: Grant
Filed: February 22, 2016
Date of Patent: February 12, 2019
Assignee: Google LLC
Inventors: Olivier Siohan, Pedro J. Moreno Mengibar
-
Patent number: 10194259
Abstract: Various implementations include wearable audio devices and related methods for controlling such devices. In some particular implementations, a computer-implemented method of controlling a wearable audio device includes: receiving an initiation command to initiate a spatial audio mode; providing a plurality of audio samples corresponding with spatially delineated zones in an array defined relative to a physical position of the wearable audio device, in response to the initiation command, where each audio sample is associated with a source of audio content; receiving a selection command selecting one of the plurality of audio samples; and initiating playback of the source of audio content associated with the selected audio sample.
Type: Grant
Filed: February 28, 2018
Date of Patent: January 29, 2019
Assignee: BOSE CORPORATION
Inventors: Keith Dana Martin, Todd Richard Reily, Mark Raymond Blewett, Daniel M. Gauger, Jr.
-
Patent number: 10192547
Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for approximating an accent source. A system practicing the method collects data associated with customer specific services, generates country-specific or dialect-specific weights for each service in the customer specific services list, generates a summary weight based on an aggregation of the country-specific or dialect-specific weights, and sets an interactive voice response system language model based on the summary weight and the country-specific or dialect-specific weights. The interactive voice response system can also change the user interface based on the interactive voice response system language model. The interactive voice response system can tune a voice recognition algorithm based on the summary weight and the country-specific weights. The interactive voice response system can adjust phoneme matching in the language model based on a possibility that the speaker is using other languages.
Type: Grant
Filed: April 21, 2016
Date of Patent: January 29, 2019
Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
Inventor: Nicholas Duffield
-
Patent number: 10182787
Abstract: The present invention relates to systems and methods for characterizing at least one anatomical parameter of an upper airway of a patient by analysing spectral properties of an utterance, comprising: a mechanical coupler comprising means for restricting the jaw position of the patient; means for recording an utterance; and processing means for determining at least one anatomical parameter of the upper airway from the recorded utterance and comparing the recorded utterance to a threshold value. In addition the present invention relates to the use of the above mentioned systems as a diagnostics tool for assessing obstructive sleep apnea.
Type: Grant
Filed: October 11, 2012
Date of Patent: January 22, 2019
Assignee: KONINKLIJKE PHILIPS N.V.
Inventors: Stijn De Waele, Stefan Winter, Alexander Cornelis Geerlings
-
Patent number: 10186263
Abstract: Speech recognition of a stream of spoken utterances is initiated. Thereafter, a spoken utterance stop event to stop the speech recognition is detected, such as in relation to the stream. The spoken utterance stop event is other than a pause or cessation in the stream of spoken utterances. In response to the spoken utterance stop event being detected, the speech recognition of the stream of spoken utterances is stopped, while the stream of spoken utterances continues. After the speech recognition of the stream of spoken utterances has been stopped, an action is caused to be performed that corresponds to the spoken utterances from a beginning of the stream through and until the spoken utterance stop event.
Type: Grant
Filed: August 30, 2016
Date of Patent: January 22, 2019
Assignee: Lenovo Enterprise Solutions (Singapore) PTE. LTD.
Inventors: Amy Leigh Rose, John Scott Crowe, Gary David Cudak, Jennifer J. Lee-Baron, Nathan J. Peterson, Bryan L. Young
-
Patent number: 10178113
Abstract: Systems, methods, and media for generating sanitized data, sanitizing anomaly detection models, and generating anomaly detection models are provided. In some embodiments, methods for sanitizing anomaly detection models are provided. The methods include: receiving at least one abnormal anomaly detection model from at least one remote location; comparing at least one of the at least one abnormal anomaly detection model to a local normal detection model to produce a common set of features common to both the at least one abnormal anomaly detection model and the local normal detection model; and generating a sanitized normal anomaly detection model by removing the common set of features from the local normal detection model.
Type: Grant
Filed: July 13, 2015
Date of Patent: January 8, 2019
Assignee: The Trustees of Columbia University in the City of New York
Inventors: Gabriela F. Ciocarlie, Angelos Stavrou, Salvatore J. Stolfo, Angelos D. Keromytis
-
Patent number: 10134395
Abstract: Techniques for providing virtual assistants to assist users during a voice communication between the users. For instance, a first user operating a device may establish a voice communication with respective devices of one or more additional users, such as with a device of a second user. For instance, the first user may utilize her device to place a telephone call to the device of the second user. A virtual assistant may also join the call and, upon invocation by a user on the call, may identify voice commands from the call and may perform corresponding tasks for the users in response.
Type: Grant
Filed: September 25, 2013
Date of Patent: November 20, 2018
Assignee: Amazon Technologies, Inc.
Inventor: Marcello Typrin
-
Patent number: 10123221
Abstract: A communication device can be configured to estimate a Reference-Signal-Received-Power (RSRP). The communication device can include a transceiver and a controller. The transceiver can be configured to downsample a received signal having a plurality of reference signal resource elements to generate a downsampled signal. The downsampling can alias a first of the plurality of reference signal resource elements into a second of the plurality of reference signal resource elements to generate an aliased reference signal resource element. The transceiver can also be configured to extract the aliased reference signal resource element from the downsampled signal. The controller can be connected to the transceiver and be configured to estimate the RSRP based on the extracted aliased reference signal resource element.
Type: Grant
Filed: September 23, 2016
Date of Patent: November 6, 2018
Assignee: Intel IP Corporation
Inventors: Matthew Hayes, Denis Markovic, Viswanath Vajepeyazula
-
Patent number: 10115019
Abstract: A video may be categorized into a picture category or a video category. A key frame of the video includes a face and a face feature in the key frame is obtained. Face features respectively associated with a plurality of picture categories are acquired and the video is assigned to one of the picture categories based on a comparison of the key frame face feature and the face features of the picture categories. Videos may first be associated with a video category by comparing key frame face features from the videos, and then the video category may be assigned to a picture category based on comparison of a video category face feature with a plurality of picture category face features. Alternatively, a video may be assigned to a picture category based on matching capture times and capture locations between the video and a reference picture in the picture category.
Type: Grant
Filed: August 19, 2016
Date of Patent: October 30, 2018
Assignee: Xiaomi Inc.
Inventors: Zhijun Chen, Wendi Hou, Fei Long
-
Patent number: 10109277
Abstract: Methods and apparatus for using visual information to facilitate a speech recognition process. The method comprises dividing received audio information into a plurality of audio frames, determining for each of the plurality of audio frames, whether the audio information in the audio frame comprises speech from the foreground speaker, wherein the determining is based, at least in part, on received visual information, and transmitting the audio frame to an automatic speech recognition (ASR) engine for speech recognition when it is determined that the audio frame comprises speech from the foreground speaker.
Type: Grant
Filed: April 27, 2015
Date of Patent: October 23, 2018
Assignee: Nuance Communications, Inc.
Inventors: Etienne Marcheret, Josef Vopicka, Vaibhava Goel
-
Patent number: 10102850
Abstract: A speech recognition system utilizing automatic speech recognition techniques such as end-pointing techniques in conjunction with beamforming and/or signal processing to isolate speech from one or more speaking users from multiple received audio signals and to detect the beginning and/or end of the speech based at least in part on the isolation. Audio capture devices such as microphones may be arranged in a beamforming array to receive the multiple audio signals. Multiple audio sources including speech may be identified in different beams and processed.
Type: Grant
Filed: February 25, 2013
Date of Patent: October 16, 2018
Assignee: AMAZON TECHNOLOGIES, INC.
Inventors: Kenneth John Basye, Jeffrey Penrod Adams
-
Patent number: 10090005
Abstract: According to some embodiments, an analog processing portion may receive an audio signal from a microphone. The analog processing portion may then convert the audio signal into sub-band signals and estimate an energy statistic value, such as a Signal-to-Noise Ratio (“SNR”) value, for each sub-band signal. A classification element may classify the estimated energy statistic values with analog processing such that a wakeup signal is generated when voice activity is detected. The wakeup signal may be associated with, for example, a battery-powered, always-listening audio application.
Type: Grant
Filed: March 10, 2017
Date of Patent: October 2, 2018
Assignee: ASPINITY, INC.
Inventors: Brandon David Rumberg, David W. Graham
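
The decision logic of this abstract can be mimicked digitally for illustration, even though the patent performs it in analog circuitry. The sketch assumes that per-band signal and noise-floor energies are already available and uses a simple counting rule as the classifier; the 6 dB threshold, the two-band minimum, and the function names are hypothetical.

```python
import math


def subband_snr_db(signal_energy, noise_energy, eps=1e-12):
    """Signal-to-noise ratio of one sub-band in decibels."""
    return 10.0 * math.log10(max(signal_energy, eps) / max(noise_energy, eps))


def should_wake(band_energies, noise_floors, snr_threshold_db=6.0, min_bands=2):
    """Raise the wakeup signal when enough sub-bands exceed the SNR threshold."""
    if len(band_energies) != len(noise_floors):
        raise ValueError("one noise floor is required per sub-band")
    active_bands = sum(
        1
        for signal, noise in zip(band_energies, noise_floors)
        if subband_snr_db(signal, noise) >= snr_threshold_db
    )
    return active_bands >= min_bands
```

Requiring several bands to be active at once is one way to keep narrow-band noise, such as a hum or a beep, from waking the downstream processor.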
-
Patent number: 10088893
Abstract: According to some embodiments, a sensor network may be provided with re-programmable and/or reconfigurable analog circuitry configured to monitor data collected by the sensor network. The re-programmable and/or reconfigurable analog circuitry may also generate a wakeup signal in response to a defined wakeup event detected by the sensor network.
Type: Grant
Filed: May 12, 2017
Date of Patent: October 2, 2018
Inventors: Vinod Kulathumani, David W. Graham, Brandon David Rumberg
-
Patent number: 10084475
Abstract: An improved mixed oscillator-and-external excitation model and methods for estimating the model parameters, for evaluating model quality, and for combining it with methods known in the art are disclosed. The improvement over existing oscillators allows the model to receive, as an input, all except the most recent point in the acquired data. Model stability is achieved through a process which includes restoring data unavailable to the decoder from the optimal model parameters and using metrics to select a stable restored model output. The present invention is effective for very low bit-rate coding/compression and decoding/decompression of digital signals, including digitized speech, audio, and image data, and for analysis, detection, and classification of signals. Operations can be performed in real time, and parameterization can be achieved at a user-specified level of compression.
Type: Grant
Filed: October 28, 2011
Date of Patent: September 25, 2018
Inventors: Irina Gorodnitsky, Anton Yen
-
Patent number: 10083710
Abstract: A voice control system including a voice receiving unit, an image capturing unit, a storage unit and a control unit is disclosed. The voice receiving unit receives a voice. The image capturing unit captures a video image stream including several human face images. The storage unit stores the voice and the video image stream. The control unit is electrically connected to the voice receiving unit, the image capturing unit and the storage unit. The control unit detects a feature of a human face from the human face images, defines a mouth motion detection region from the feature of the human face, and generates a control signal according to a variation of the mouth motion detection region and a variation of the voice over time. A voice control method, a computer program product and a computer readable medium are also disclosed.
Type: Grant
Filed: May 17, 2016
Date of Patent: September 25, 2018
Assignee: BXB Electronics Co., Ltd.
Inventors: Kai-Sheng Chiou, Chih-Lin Hung, Chung-Nan Lee, Chao-Wen Wu
-
Patent number: 10083687
Abstract: Disclosed are systems, methods, and computer readable media for identifying an acoustic environment of a caller. The method embodiment comprises analyzing acoustic features of a received audio signal from a caller, receiving meta-data information based on a previously recorded time and speed of the caller, classifying a background environment of the caller based on the analyzed acoustic features and the meta-data, selecting an acoustic model matched to the classified background environment from a plurality of acoustic models, and performing speech recognition on the received audio signal using the selected acoustic model.
Type: Grant
Filed: October 16, 2017
Date of Patent: September 25, 2018
Assignee: NUANCE COMMUNICATIONS, INC.
Inventor: Mazin Gilbert
-
Patent number: 10078690
Abstract: A method for triggering an action on a second device is provided. It comprises the steps of obtaining audio of multimedia content presented on a first device; comparing the obtained audio with reference audio data in a database; if the obtained audio is found in the database containing reference audio, determining an action corresponding to the matched reference audio; and triggering the action on the second device.
Type: Grant
Filed: December 31, 2011
Date of Patent: September 18, 2018
Assignee: Thomson Licensing DTV
Inventors: Jianfeng Chen, Xiaojun Ma, Zhigang Zhang
-
Patent number: 10074369
Abstract: Systems, methods, and devices for escalating voice-based interactions via speech-controlled devices are described. Speech-controlled devices capture audio, including wakeword portions and payload portions, for sending to a server to relay messages between speech-controlled devices. In response to determining the occurrence of an escalation event, such as repeated messages between the same two devices, the system may automatically change a mode of a speech-controlled device, such as no longer requiring a wakeword, no longer requiring an indication of a desired recipient, or automatically connecting two speech-controlled devices in a voice-chat mode. In response to determining the occurrence of further escalation events, the system may initiate a real-time call between the speech-controlled devices.
Type: Grant
Filed: September 1, 2016
Date of Patent: September 11, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Christo Frank Devaraj, Manish Kumar Dalmia, Tony Roy Hardie, Ran Mokady, Nick Ciubotariu, Sandra Lemon
-
Patent number: 10074360
Abstract: This relates to providing an indication of the suitability of an acoustic environment for performing speech recognition. One process can include receiving an audio input and determining a speech recognition suitability based on the audio input. The speech recognition suitability can include a numerical, textual, graphical, or other representation of the suitability of an acoustic environment for performing speech recognition. The process can further include displaying a visual representation of the speech recognition suitability to indicate the likelihood that a spoken user input will be interpreted correctly. This allows a user to determine whether to proceed with the performance of a speech recognition process, or to move to a different location having a better acoustic environment before performing the speech recognition process.
Type: Grant
Filed: August 24, 2015
Date of Patent: September 11, 2018
Assignee: Apple Inc.
Inventor: Yoon Kim
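One way to picture a numerical and textual suitability indication is to map an estimated SNR onto a 0 to 1 score and a coarse label. The abstract does not say how suitability is computed, so the use of SNR, the 0 to 30 dB range, the label cut-offs, and the function names below are all assumptions made for illustration.

```python
def suitability_score(snr_db: float, low_db: float = 0.0, high_db: float = 30.0) -> float:
    """Map an estimated SNR (dB) linearly onto [0, 1]; values outside the range are clamped."""
    clamped = min(max(snr_db, low_db), high_db)
    return (clamped - low_db) / (high_db - low_db)


def suitability_label(score: float) -> str:
    """Turn the numerical score into a textual indication (assumed cut-offs)."""
    if score >= 0.66:
        return "good"
    if score >= 0.33:
        return "fair"
    return "poor"


score = suitability_score(15.0)
label = suitability_label(score)
```

A user interface could show the label or a bar proportional to the score before the user starts speaking.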
-
Patent number: 10049653
Abstract: A system including an automatic noise canceling (ANC) headphone and a processor. The ANC headphone has a microphone configured to generate a microphone signal and at least two non-zero ANC gain levels. The processor is configured to receive the microphone signal, determine a characteristic of the microphone signal, identify a revised ANC level from the ANC gain levels based on a comparison of the characteristic to at least one threshold, and output a signal corresponding to the revised ANC level. Methods are also disclosed.
Type: Grant
Filed: October 14, 2016
Date of Patent: August 14, 2018
Assignee: AVNERA CORPORATION
Inventors: Amit Kumar, Eric Sorensen
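The threshold comparison described here can be sketched as a lookup over ascending thresholds. This is a minimal sketch, not the patented implementation: the characteristic is assumed to be a sound level in dB, and the threshold values, gain levels, and function name are hypothetical.

```python
import bisect
from typing import Sequence


def select_anc_level(
    characteristic: float,
    thresholds: Sequence[float],
    gain_levels: Sequence[float],
) -> float:
    """Pick an ANC gain level by comparing a signal characteristic to thresholds.

    thresholds must be sorted ascending, and gain_levels must have exactly one
    more entry than thresholds (one level per interval).
    """
    if len(gain_levels) != len(thresholds) + 1:
        raise ValueError("need one more gain level than thresholds")
    index = bisect.bisect_right(thresholds, characteristic)
    return gain_levels[index]


# Hypothetical configuration: three non-zero gain levels, two thresholds in dB.
levels = [0.3, 0.6, 1.0]
limits = [50.0, 70.0]
quiet = select_anc_level(40.0, limits, levels)
loud = select_anc_level(80.0, limits, levels)
```

With this configuration, a quiet environment selects the lowest gain and a loud one selects the highest.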
-
Patent number: 10032454
Abstract: Techniques disclosed herein include systems and methods for open-domain voice-enabled searching that is speaker sensitive. Techniques include using speech information, speaker information, and information associated with a spoken query to enhance open voice search results. This includes integrating a textual index with a voice index to support the entire search cycle. Given a voice query, the system can execute two matching processes simultaneously. This can include a text matching process based on the output of speech recognition, as well as a voice matching process based on characteristics of a caller or user voicing a query. Characteristics of the caller can include output of voice feature extraction and metadata about the call. The system clusters callers according to these characteristics. The system can use specific voice and text clusters to modify speech recognition results, as well as modifying search results.
Type: Grant
Filed: June 25, 2015
Date of Patent: July 24, 2018
Assignee: Nuance Communications, Inc.
Inventors: Shilei Zhang, Shenghua Bao, Wen Liu, Yong Qin, Zhiwei Shuang, Jian Chen, Zhong Su, Qin Shi, William F. Ganong, III
-
Patent number: 10026399
Abstract: Architectures and techniques for selecting a voice-enabled device to handle audio input that is detected by multiple voice-enabled devices are described herein. In some instances, multiple voice-enabled devices may detect audio input from a user at substantially the same time, due to the voice-enabled devices being located within proximity to the user. The architectures and techniques may analyze a variety of audio signal metric values for the voice-enabled devices to designate a voice-enabled device to handle the audio input.
Type: Grant
Filed: September 11, 2015
Date of Patent: July 17, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Ramya Gopalan, Shiva Kumar Sundaram
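Device arbitration of this kind can be pictured as scoring each device's reported audio metrics and picking the best. The sketch below rests on assumptions: the abstract does not name the metrics or how they are combined, so `snr_db`, `speech_energy`, the weights, and the `DeviceReport` type are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DeviceReport:
    """Hypothetical per-device audio signal metrics for one utterance."""
    device_id: str
    snr_db: float
    speech_energy: float


def pick_device(
    reports: Sequence[DeviceReport],
    snr_weight: float = 0.7,     # assumed weighting
    energy_weight: float = 0.3,  # assumed weighting
) -> str:
    """Return the id of the device with the highest weighted metric score."""
    if not reports:
        raise ValueError("no device reports to arbitrate")

    def score(report: DeviceReport) -> float:
        return snr_weight * report.snr_db + energy_weight * report.speech_energy

    return max(reports, key=score).device_id


chosen = pick_device([
    DeviceReport("kitchen", snr_db=12.0, speech_energy=5.0),
    DeviceReport("living_room", snr_db=20.0, speech_energy=3.0),
])
```

A weighted sum is only one possible combination rule; a ranking per metric or a learned model would fit the same interface.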
-
Patent number: 10014002
Abstract: Methods and systems for audio source separation in real-time are described. In an embodiment, the present disclosure describes reading and decoding an audio source into Pulse Code Modulation (PCM) samples, fragmenting the PCM samples into fragments, transforming fragments into spectrograms, performing audio source separation using a deep neural network (DNN) to generate an estimated magnitude spectrogram of the component(s) of the audio source, reconstructing the estimated time domain component signals, and streaming the component signals to a playback engine. In an embodiment, a semantic equalizer graphical user interface allows for real-time mixing of individual component signals.
Type: Grant
Filed: October 24, 2017
Date of Patent: July 3, 2018
Assignee: Red Pill VR, Inc.
Inventors: Alejandro Koretzky, Karthiek Reddy Bokka, Naveen Sasalu Rajashekharappa
-
Patent number: 9990936
Abstract: A method and an apparatus for separating speech data from background data in an audio communication are suggested. The method comprises: applying a speech model to the audio communication for separating the speech data from the background data of the audio communication; and updating the speech model as a function of the speech data and the background data during the audio communication.
Type: Grant
Filed: October 12, 2015
Date of Patent: June 5, 2018
Assignee: THOMSON Licensing
Inventors: Alexey Ozerov, Quang Khanh Ngoc Duong, Louis Chevallier
-
Patent number: 9984683
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for automatic speech recognition using multi-dimensional models. In some implementations, audio data that describes an utterance is received. A transcription for the utterance is determined using an acoustic model that includes a neural network having first memory blocks for time information and second memory blocks for frequency information. The transcription for the utterance is provided as output of an automated speech recognizer.
Type: Grant
Filed: July 22, 2016
Date of Patent: May 29, 2018
Assignee: Google LLC
Inventors: Bo Li, Tara N. Sainath
-
Patent number: 9978378
Abstract: An apparatus for decoding an audio signal is provided, having a receiving interface, configured to receive a first frame having a first audio signal portion of the audio signal, and configured to receive a second frame having a second audio signal portion of the audio signal; a noise level tracing unit, wherein the noise level tracing unit is configured to determine noise level information depending on at least one of the first audio signal portion and the second audio signal portion; a first reconstruction unit for reconstructing, in a first reconstruction domain, a third audio signal portion of the audio signal depending on the noise level information; a transform unit for transforming the noise level information to a second reconstruction domain; and a second reconstruction unit for reconstructing, in the second reconstruction domain, a fourth audio signal portion of the audio signal depending on the noise level information.
Type: Grant
Filed: December 21, 2015
Date of Patent: May 22, 2018
Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
Inventors: Michael Schnabel, Goran Markovic, Ralph Sperschneider, Jérémie Lecomte, Christian Helmrich