Silence Decision Patents (Class 704/215)
  • Patent number: 8924200
    Abstract: A method for decoding an audio signal in a decoder having a CELP-based decoder element including a fixed codebook component, at least one pitch period value, and a first decoder output, wherein a bandwidth of the audio signal extends beyond a bandwidth of the CELP-based decoder element. The method includes obtaining an up-sampled fixed codebook signal by up-sampling the fixed codebook component to a higher sample rate, obtaining an up-sampled excitation signal based on the up-sampled fixed codebook signal and an up-sampled pitch period value, and obtaining a composite output signal based on the up-sampled excitation signal and an output signal of the CELP-based decoder element, wherein the composite output signal includes a bandwidth portion that extends beyond a bandwidth of the CELP-based decoder element.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: December 30, 2014
    Assignee: Motorola Mobility LLC
    Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
  • Patent number: 8924207
    Abstract: A method and apparatus for transcoding audio data. The method includes determining whether AAC joint stereo exists and running reference AC-3 rematrixing when it does not. When AAC joint stereo does exist, rematrixing is enabled when the number of corresponding AAC bands is greater than half the size of the band; otherwise, reference AC-3 rematrixing is run.
    Type: Grant
    Filed: July 20, 2010
    Date of Patent: December 30, 2014
    Assignee: Texas Instruments Incorporated
    Inventor: Mohamed Farouk Mansour
  • Patent number: 8913512
    Abstract: A telecommunication apparatus (100, 200) enabled for high-speed packet access is disclosed. The apparatus (100, 200) is arranged to operate according to a reduced and a further reduced mode of transmission of dedicated physical control channel transmission, and having a data transmission controller (102, 202) arranged to control sporadic data transmissions. The data transmission controller (102, 202) is arranged to determine if omission of a sporadic data transmission will significantly degrade performance, and if not, disable transmission of that data transmission. A method of controlling sporadic data transmissions for such an apparatus is also disclosed, as well as a computer program for implementing the method.
    Type: Grant
    Filed: October 16, 2008
    Date of Patent: December 16, 2014
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Hans Hannu, Jan Christoffersson, Min Wang
  • Patent number: 8909519
    Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
    Type: Grant
    Filed: March 10, 2014
    Date of Patent: December 9, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Bing Chen, James H. James
  • Patent number: 8909524
    Abstract: Embodiments of the present invention provide an adaptive noise canceling system. The adaptive noise canceling system may be used in a handset to cancel background noise by generating an anti-noise signal. The adaptive noise canceling system may include a first input to receive a first signal from a feedforward microphone; a second input to receive a second signal from an error microphone; a controller coupled to the inputs, the controller configured to adaptively generate an anti-noise signal according to the received signals, wherein the controller derives a profile of the anti-noise signal from the first signal and derives a magnitude of the anti-noise signal from both the first and second signals; and an output to transmit the anti-noise signal to a speaker.
    Type: Grant
    Filed: June 7, 2011
    Date of Patent: December 9, 2014
    Assignee: Analog Devices, Inc.
    Inventors: Thomas Stoltz, Kim Spetzler Berthelsen, Robert Adams
  • Patent number: 8892229
    Abstract: An audio apparatus according to an embodiment includes an audio signal receiving unit, a music gap signal receiving unit, a playback unit, and a determining unit. The audio signal receiving unit receives an audio signal in which successive multiple music data are contained in a single block of data. The determining unit determines a boundary of the music data on the basis of the time at which the music gap signal that indicates the boundary of the music data is received by the music gap signal receiving unit and the duration of a silent period in the audio signal that is played back by the playback unit.
    Type: Grant
    Filed: May 14, 2012
    Date of Patent: November 18, 2014
    Assignee: Fujitsu Ten Limited
    Inventors: Osamu Yasutake, Fumitake Nakamura, Nobutaka Miyauchi, Masanobu Maeda, Masahiko Kubo, Nahoko Kawamura, Machiko Matsui, Hideto Saitoh, Hiroyuki Kubota, Masayuki Takaoka, Masanobu Washio, Yutaka Nishioka
  • Patent number: 8886527
    Abstract: A purpose is to suppress recognition process delay generated due to load in signal processing. Included are a speech input means 10 that inputs a speech signal; an output evaluation means 20 that evaluates whether or not the speech signal input by the speech input means 10 is a speech signal in a sound section, that is, a speech section in which a speaker is assumed to be speaking, and outputs the speech signal as a speech signal to be processed only when it is evaluated as a speech signal in the sound section; a signal processing means 30 that performs signal processing on the speech signal output by the output evaluation means 20 as the speech signal to be processed; and a speech recognition processing means 40 that performs a speech recognition process on the speech signal processed by the signal processing means 30.
    Type: Grant
    Filed: April 16, 2009
    Date of Patent: November 11, 2014
    Assignee: NEC Corporation
    Inventor: Toru Iwasawa
  • Patent number: 8886529
    Abstract: A method and device are provided for the objective evaluation of voice quality of a speech signal. The device includes: a module for extracting a background noise signal, referred to as a noise signal, from the speech signal; a module for calculating the audio parameters of the noise signal; a module for classifying the background noise contained in the noise signal on the basis of the calculated audio parameters, according to a predefined set of background noise classes; and a module for evaluating the voice quality of the speech signal on the basis of at least the resulting classification relative to the background noise in the speech signal.
    Type: Grant
    Filed: April 12, 2010
    Date of Patent: November 11, 2014
    Assignee: France Telecom
    Inventors: Julien Faure, Adrien Leman
  • Patent number: 8868432
    Abstract: A method for decoding an audio signal having a bandwidth that extends beyond a bandwidth of a CELP excitation signal in an audio decoder including a CELP-based decoder element. The method includes obtaining a second excitation signal having an audio bandwidth extending beyond the audio bandwidth of the CELP excitation signal, obtaining a set of signals by filtering the second excitation signal with a set of bandpass filters, scaling the set of signals using a set of energy-based parameters, and obtaining a composite output signal by combining the scaled set of signals with a signal based on the audio signal decoded by the CELP-based decoder element.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: October 21, 2014
    Assignee: Motorola Mobility LLC
    Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
  • Patent number: 8856001
    Abstract: A speech sound detection apparatus receives an input audio signal (as a sound reception unit), and computes input power that indicates a magnitude of the sound represented by the audio signal (as an input power computation unit). The apparatus estimates a correction function that is a continuous function defining a relation between a certain frequency and a correction coefficient used to approximate the input power computed at that frequency to the reference power predetermined for that frequency (as a correction function estimation unit). The apparatus corrects the input power at every frequency, based upon the correction coefficient that is obtained in accordance with the relation defined by the estimated correction function (as an input power correcting unit). The apparatus further determines whether or not the sound represented by the received audio signal is speech sound, based upon the corrected input power (as a speech sound detection unit).
    Type: Grant
    Filed: September 3, 2009
    Date of Patent: October 7, 2014
    Assignee: NEC Corporation
    Inventors: Tadashi Emori, Masanori Tsujikawa
  • Patent number: 8825478
    Abstract: Audio content is converted to text using speech recognition software. The text is then associated with a distinct voice or a generic placeholder label if no distinction can be made. From the text and voice information, a word cloud is generated based on key words and key speakers. A visualization of the cloud displays as it is being created. Words grow in size in relation to their dominance. When it is determined that the predominant words or speakers have changed, the word cloud is complete. That word cloud continues to be displayed statically and a new word cloud display begins based upon a new set of predominant words or a new predominant speaker or set of speakers. This process may continue until the meeting is concluded. At the end of the meeting, the completed visualization may be saved to a storage device, sent to selected individuals, removed, or any combination of the preceding.
    Type: Grant
    Filed: January 10, 2011
    Date of Patent: September 2, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Susan Marie Cox, Janani Janakiraman, Fang Lu, Loulwa F Salem
  • Patent number: 8818811
    Abstract: This application relates to a voice activity detection (VAD) apparatus configured to provide a voice activity detection decision for an input audio signal. The VAD apparatus includes a state detector and a voice activity calculator. The state detector is configured to determine, based on the input audio signal, a current working state of the VAD apparatus among at least two different working states. Each of the at least two different working states is associated with a corresponding working state parameter decision set which includes at least one voice activity decision parameter. The voice activity calculator is configured to calculate a voice activity detection parameter value for the at least one voice activity decision parameter of the working state parameter decision set associated with the current working state, and to provide the voice activity detection decision by comparing the calculated voice activity detection parameter value with a threshold.
    Type: Grant
    Filed: June 24, 2013
    Date of Patent: August 26, 2014
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Zhe Wang
  • Patent number: 8781821
    Abstract: A method is disclosed for controlling a voice-activated device by interpreting a spoken command as a series of voiced and non-voiced intervals. A responsive action is then performed according to the number of voiced intervals in the command. The method is well-suited to applications having a small number of specific voice-activated response functions. Applications using the inventive method offer numerous advantages over traditional speech recognition systems including speaker universality, language independence, no training or calibration needed, implementation with simple microcontrollers, and extremely low cost. For time-critical applications such as pulsers and measurement devices, where fast reaction is crucial to catch a transient event, the method provides near-instantaneous command response, yet versatile voice control.
    Type: Grant
    Filed: April 30, 2012
    Date of Patent: July 15, 2014
    Assignee: Zanavox
    Inventor: David Edward Newman
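The interval-counting idea in the abstract above can be sketched as follows. This is a minimal illustration and not the patented implementation: the per-frame energy input, the threshold, the `min_frames` debounce, and the `ACTIONS` table are all assumptions introduced here.

```python
from typing import Optional, Sequence

# Hypothetical mapping from the number of voiced intervals to a response.
ACTIONS = {1: "start", 2: "stop", 3: "reset"}


def count_voiced_intervals(
    frame_energies: Sequence[float], threshold: float, min_frames: int = 2
) -> int:
    """Count runs of at least `min_frames` consecutive frames above `threshold`."""
    count = 0
    run = 0  # length of the current above-threshold run
    for energy in frame_energies:
        if energy > threshold:
            run += 1
        else:
            if run >= min_frames:
                count += 1
            run = 0
    if run >= min_frames:  # a voiced run may extend to the last frame
        count += 1
    return count


def action_for(
    frame_energies: Sequence[float], threshold: float = 0.1
) -> Optional[str]:
    """Select an action from the number of voiced intervals, or None if unmapped."""
    return ACTIONS.get(count_voiced_intervals(frame_energies, threshold))
```

Because only the number of voiced intervals matters, a sketch like this needs no acoustic model, which is consistent with the speaker and language independence the abstract claims.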
  • Patent number: 8775168
    Abstract: A Yule-Walker based, low-complexity voice activity detector (VAD) is disclosed. An input signal is typically noisy speech (i.e., corrupted with, for example, babble noise). In one embodiment, a first initialization stage of the VAD computes an occurrence of a silent period within the input signal and the AR parameters. The VAD could accordingly compute a tentative adaptive threshold and output hypothesis H1 (which means speech is present) during this stage. During the second initialization stage, the VAD generally builds a database of associated values and computes the adaptive threshold accordingly. The second initialization stage could also output tentative VAD decisions based on the tentative threshold computed in the first initialization stage. Finally, the VAD periodically retrains or updates AR parameters, threshold values and/or the database and outputs VAD decisions accordingly.
    Type: Grant
    Filed: August 3, 2007
    Date of Patent: July 8, 2014
    Assignee: STMicroelectronics Asia Pacific PTE, Ltd.
    Inventors: Karthik Muralidhar, Anoop Kumar Krishna
  • Patent number: 8762144
    Abstract: A method and apparatus for detecting voice activity are disclosed. The method of detecting voice activity includes: extracting a feature parameter from a frame signal; determining whether the frame signal is a voice signal or a noise signal by comparing the feature parameter with model parameters of a plurality of comparison signals, respectively; and outputting the frame signal when the frame signal is determined to be a voice signal. The apparatus includes a classifier module which extracts a feature parameter from a frame signal and generates labeling information with respect to the frame signal by comparing the feature parameter with model parameters of a plurality of comparison signals; and a voice detection unit which determines whether the frame signal is a noise signal or a voice signal with reference to the labeling information, and outputs the frame signal when the frame signal is determined to be a voice signal.
    Type: Grant
    Filed: May 3, 2011
    Date of Patent: June 24, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Nam-gook Cho, Eun-kyoung Kim
  • Patent number: 8762158
    Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.
    Type: Grant
    Filed: August 5, 2011
    Date of Patent: June 24, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyun-wook Kim, Han-gil Moon, Sang-hoon Lee
  • Patent number: 8751223
    Abstract: In one implementation, a first voice stream for a packet-switched call is received from a calling party. The first voice stream conforms to a first silence suppression scheme and comprises a plurality of encoded packets for the packet-switched call. A subset of encoded packets are selected from the plurality of encoded packets to create a second voice stream that conforms to a second silence suppression scheme. The second voice stream comprises the subset of encoded packets. The first silence suppression scheme is distinct from the second silence suppression scheme. The second voice stream is forwarded toward a called party for the packet-switched call.
    Type: Grant
    Filed: May 24, 2011
    Date of Patent: June 10, 2014
    Assignee: Alcatel Lucent
    Inventors: Jeffrey A. Hiltner, Alan H. Matten, Dale R. Schumacher, Albert J. Such
  • Patent number: 8731914
    Abstract: A system and method for locating a preferable playback start location after a winding or rewinding action in an audio playing device. In response to an adjustment of the playing location for audio content to a desired playing position, the system determines whether at least one non-speech or silent period of at least a predetermined duration exists within the vicinity of the desired playing position. If at least one such non-speech or silent period exists within the vicinity of the desired playing position, the system adjusts the playing position to fall within one of the at least one non-speech period or silent period.
    Type: Grant
    Filed: November 15, 2005
    Date of Patent: May 20, 2014
    Assignee: Nokia Corporation
    Inventors: Janne Vainio, Hannu J. Mikkola, Jari M. Makinen
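The position adjustment described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the patented method: positions are frame indices, `is_silent` is a precomputed per-frame silence flag, and `min_len` and `vicinity` are hypothetical parameters standing in for the "predetermined duration" and "vicinity".

```python
from typing import List, Sequence, Tuple


def find_silent_runs(is_silent: Sequence[bool], min_len: int) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of silent runs of at least min_len frames."""
    runs = []
    start = None
    for i, silent in enumerate(is_silent):
        if silent and start is None:
            start = i
        elif not silent and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    if start is not None and len(is_silent) - start >= min_len:
        runs.append((start, len(is_silent)))
    return runs


def adjust_position(
    desired: int, is_silent: Sequence[bool], min_len: int = 3, vicinity: int = 10
) -> int:
    """Move `desired` into the nearest qualifying silent run within `vicinity` frames."""
    best = None  # (distance, position)
    for start, end in find_silent_runs(is_silent, min_len):
        nearest = min(max(desired, start), end - 1)  # closest frame inside the run
        distance = abs(nearest - desired)
        if distance <= vicinity and (best is None or distance < best[0]):
            best = (distance, nearest)
    return desired if best is None else best[1]
```

If no silent run of sufficient length lies within the vicinity, the sketch leaves the desired position unchanged.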
  • Patent number: 8700390
    Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
    Type: Grant
    Filed: October 7, 2013
    Date of Patent: April 15, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Bing Chen, James H. James
  • Patent number: 8687831
    Abstract: An external device for a hearing implant system and a hearing implant system having an external device are described. An external transmitter generates a radio-frequency inductive link signal to an implanted receiver including a sequence of data word segments which communicate data to the implanted receiver, and a sequence of data word pause segments between each data word segment which communicate energy without data to the implanted receiver. A data word pause controller controls the inductive link signal during the data word pause segments according to an energy management rule.
    Type: Grant
    Filed: October 26, 2012
    Date of Patent: April 1, 2014
    Assignee: Med-El Elektromedizinische Geraete GmbH
    Inventors: Martin Stoffaneller, Peter Schleich, Thomas Schwarzenbeck
  • Patent number: 8682662
    Abstract: In accordance with an example embodiment of the invention, there is provided an apparatus for detecting voice activity in an audio signal. The apparatus comprises a first voice activity detector for making a first voice activity detection decision based at least in part on the voice activity of a first audio signal received from a first microphone. The apparatus also comprises a second voice activity detector for making a second voice activity detection decision based at least in part on an estimate of a direction of the first audio signal and an estimate of a direction of a second audio signal received from a second microphone. The apparatus further comprises a classifier for making a third voice activity detection decision based at least in part on the first and second voice activity detection decisions.
    Type: Grant
    Filed: August 13, 2012
    Date of Patent: March 25, 2014
    Assignee: Nokia Corporation
    Inventors: Riitta Elina Niemistö, Päivi Marianna Valve
  • Patent number: 8676572
    Abstract: A computer-implemented system and method for enhancing audio to individuals participating in a conversation is provided. Audio data for individuals participating in one or more conversations is analyzed. Possible conversational configurations of the individuals are generated based on the audio data, and each possible conversational configuration includes one or more subconfigurations of at least two of the individuals. A probability weight is assigned to each of the subconfigurations and includes a likelihood that the individuals of that subconfiguration are participating in one of the conversations. A probability of each possible conversational configuration is determined by combining the probability weights for the subconfigurations of that possible conversational configuration. The possible conversational configuration with the highest probability is selected as a most probable configuration. The individuals participating in the conversations are determined based on the most probable configuration.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: March 18, 2014
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Paul M. Aoki, Margaret H. Szymanski, James D. Thornton, Daniel H. Wilson, Allison G. Woodruff
  • Patent number: 8670990
    Abstract: Systems and methods are described that utilize dynamic time scale modification (TSM) to achieve reduced bit rate audio coding. In accordance with embodiments, different levels of TSM compression are selectively applied to segments of an input speech signal prior to encoding thereof by an encoder. Encoded TSM-compressed segments are received at a decoder which decodes such segments and then applies an appropriate level of TSM decompression to each based on information received from the encoder. By selectively applying different levels of TSM compression to segments of an input speech signal prior to encoding, a coding bit rate associated with the encoder/decoder is reduced. Furthermore, by selecting a level of TSM compression for each segment of the input speech signal that takes into account certain local characteristics of that signal, such bit rate reduction is provided without introducing unacceptable levels of distortion into an output speech signal produced by the decoder.
    Type: Grant
    Filed: July 30, 2010
    Date of Patent: March 11, 2014
    Assignee: Broadcom Corporation
    Inventors: Juin-Hwey Chen, Hong-Goo Kang, Robert W. Zopf, Jes Thyssen
  • Patent number: 8654933
    Abstract: A mass-scale, user-independent, device-independent, voice messaging system that converts unstructured voice messages into text for display on a screen is disclosed. The system comprises (i) computer implemented sub-systems and also (ii) a network connection to human operators providing transcription and quality control; the system being adapted to optimize the effectiveness of the human operators by further comprising 3 core sub-systems, namely (i) a pre-processing front end that determines an appropriate conversion strategy; (ii) one or more conversion resources; and (iii) a quality control sub-system.
    Type: Grant
    Filed: October 31, 2007
    Date of Patent: February 18, 2014
    Assignee: Nuance Communications, Inc.
    Inventor: Daniel Michael Doulton
  • Patent number: 8645133
    Abstract: Encoding audio signals by selecting an encoding mode for encoding the signal, categorizing the signal into active segments having voice activity and non-active segments having substantially no voice activity by using categorization parameters depending on the selected encoding mode, and encoding at least the active segments using the selected encoding mode.
    Type: Grant
    Filed: February 7, 2013
    Date of Patent: February 4, 2014
    Assignee: Core Wireless Licensing S.a.r.l.
    Inventors: Kari Järvinen, Pasi Ojala, Ari Lakaniemi
  • Patent number: 8626498
    Abstract: A voice activity detection (VAD) system includes a first voice activity detector, a second voice activity detector and control logic. The first voice activity detector is included in a device and produces a first VAD signal. The second voice activity detector is located externally to the device and produces a second VAD signal. The control logic combines the first and second VAD signals into a VAD output signal. Voice activity may be detected based on the VAD output signal. The second VAD signal can be represented as a flag included in a packet containing digitized audio. The packet can be transmitted to the device from the externally located VAD over a wireless link.
    Type: Grant
    Filed: February 24, 2010
    Date of Patent: January 7, 2014
    Assignee: QUALCOMM Incorporated
    Inventor: Te-Won Lee
  • Patent number: 8589153
    Abstract: A continuous comfort noise is provided that is overlaid for the entire duration of a conference call scenario. The comfort noise may be adapted to match the levels of the actual background noise detected on one or more of the conference call participants' devices on the transmitting end(s) of a conference call as well as the participants' speech levels. The comfort noise may also be adapted to the type of listening device employed on the receiving end of a conference call. The comfort noise level may be customized to an appropriate and comfortable level for the type of listening device being used, and the system may continuously mix the comfort noise with incoming audio signals for the entire duration of a conference call, lowering the comfort noise level gradually during speaking periods for additional user experience improvement.
    Type: Grant
    Filed: June 28, 2011
    Date of Patent: November 19, 2013
    Assignee: Microsoft Corporation
    Inventors: Hosam Khalil, Xiaoqin Sun, Hong Wang Sodoma, Warren Lam
  • Patent number: 8583428
    Abstract: Described is a multiple phase process/system that combines spatial filtering with regularization to separate sound from different sources such as the speech of two different speakers. In a first phase, frequency domain signals corresponding to the sensed sounds are processed into separated spatially filtered signals including by inputting the signals into a plurality of beamformers (which may include nullformers) followed by nonlinear spatial filters. In a regularization phase, the separated spatially filtered signals are input into an independent component analysis mechanism that is configured with multi-tap filters, followed by secondary nonlinear spatial filters. Separated audio signals are then provided via an inverse-transform.
    Type: Grant
    Filed: June 15, 2010
    Date of Patent: November 12, 2013
    Assignee: Microsoft Corporation
    Inventors: Ivan Tashev, Lae-Hoon Kim, Alejandro Acero, Jason Scott Flaks
  • Patent number: 8583427
    Abstract: A signal processing system which discriminates between voice signals and data signals modulated by a voiceband carrier. The signal processing system includes a voice exchange, a data exchange and a call discriminator. The voice exchange is capable of exchanging voice signals between a switched circuit network and a packet based network. The signal processing system also includes a data exchange capable of exchanging data signals modulated by a voiceband carrier on the switched circuit network with unmodulated data signal packets on the packet based network. The data exchange is performed by demodulating data signals from the switched circuit network for transmission on the packet based network, and modulating data signal packets from the packet based network for transmission on the switched circuit network. The call discriminator is used to selectively enable the voice exchange and data exchange.
    Type: Grant
    Filed: January 25, 2010
    Date of Patent: November 12, 2013
    Assignee: Broadcom Corporation
    Inventors: Onur Tackin, Scott Branden
  • Patent number: 8577674
    Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
    Type: Grant
    Filed: December 12, 2012
    Date of Patent: November 5, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Bing Chen, James H. James
  • Patent number: 8577675
    Abstract: In one aspect thereof the invention provides a method for noise suppression of a speech signal that includes, for a speech signal having a frequency domain representation dividable into a plurality of frequency bins, determining a value of a scaling gain for at least some of said frequency bins and calculating smoothed scaling gain values. Calculating smoothed scaling gain values includes, for the at least some of the frequency bins, combining a currently determined value of the scaling gain and a previously determined value of the smoothed scaling gain. In another aspect a method partitions the plurality of frequency bins into a first set of contiguous frequency bins and a second set of contiguous frequency bins having a boundary frequency there between, where the boundary frequency differentiates between noise suppression techniques, and changes a value of the boundary frequency as a function of the spectral content of the speech signal.
    Type: Grant
    Filed: December 22, 2004
    Date of Patent: November 5, 2013
    Assignee: Nokia Corporation
    Inventor: Milan Jelinek
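The gain smoothing step in the abstract above, which combines the currently determined gain with the previously smoothed gain for each frequency bin, can be sketched with first-order recursive smoothing. The fixed weight `alpha` is an assumption made for illustration; the abstract does not state how the two values are weighted.

```python
from typing import List, Sequence


def smooth_gains(
    current: Sequence[float], previous_smoothed: Sequence[float], alpha: float = 0.9
) -> List[float]:
    """Per-bin smoothing: alpha * previous smoothed gain + (1 - alpha) * current gain."""
    if len(current) != len(previous_smoothed):
        raise ValueError("gain vectors must cover the same frequency bins")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [
        alpha * prev + (1.0 - alpha) * gain
        for gain, prev in zip(current, previous_smoothed)
    ]
```

A larger `alpha` makes the smoothed gain follow changes more slowly, trading responsiveness for fewer frame-to-frame gain fluctuations.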
  • Patent number: 8560301
    Abstract: A language expression apparatus and a method based on context and intent awareness are provided. The apparatus and method may recognize a context and an intent of a user and may generate a language expression based on the recognized context and the recognized intent, thereby providing an interpretation/translation service and/or providing an education service for learning a language.
    Type: Grant
    Filed: March 2, 2010
    Date of Patent: October 15, 2013
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Yeo Jin Kim
  • Patent number: 8554560
    Abstract: Discrimination between two classes comprises receiving a set of frames including an input signal and determining at least two different feature vectors for each of the frames. Discrimination between two classes further comprises classifying the two different feature vectors using sets of preclassifiers trained for at least two classes of events and, from that classification, determining values for at least one weighting factor. Discrimination between two classes still further comprises calculating a combined feature vector for each of the received frames by applying the weighting factor to the feature vectors and classifying the combined feature vector for each of the frames by using a set of classifiers trained for at least two classes of events.
    Type: Grant
    Filed: September 4, 2012
    Date of Patent: October 8, 2013
    Assignee: International Business Machines Corporation
    Inventor: Zica Valsan
  • Patent number: 8554547
    Abstract: A voice activity detection method and apparatus, and an electronic device are provided. The method includes: obtaining a time domain parameter and a frequency domain parameter from an audio frame; obtaining a first distance between the time domain parameter and a long-term-sliding mean of the time domain parameter in a history background noise frame, and obtaining a second distance between the frequency domain parameter and a long-term-sliding mean of the frequency domain parameter in the history background noise frame; and judging whether the audio frame is a foreground voice frame or a background noise frame according to the first distance, the second distance and a set of decision inequalities based on the first distance and the second distance. The above technical solutions enable the judgment criterion to have an adaptive adjustment capability, thus improving the performance of the voice activity detection.
    Type: Grant
    Filed: July 11, 2012
    Date of Patent: October 8, 2013
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Zhe Wang
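The distance-based decision in the abstract above can be sketched as follows. This is a simplified illustration, not the patented decision rule: the exponential sliding mean, the two fixed thresholds, and the "either distance exceeds its threshold" inequality are assumptions, as is the choice to update the background means only on frames judged to be noise.

```python
class DistanceVad:
    """Toy VAD comparing frame parameters with sliding means of background noise."""

    def __init__(self, time_mean: float, freq_mean: float,
                 time_thresh: float, freq_thresh: float, alpha: float = 0.95):
        self.time_mean = time_mean      # sliding mean of the time domain parameter
        self.freq_mean = freq_mean      # sliding mean of the frequency domain parameter
        self.time_thresh = time_thresh  # hypothetical bound on the first distance
        self.freq_thresh = freq_thresh  # hypothetical bound on the second distance
        self.alpha = alpha              # weight given to history in the sliding means

    def is_voice(self, time_param: float, freq_param: float) -> bool:
        """Return True for a foreground voice frame, False for background noise."""
        first_distance = abs(time_param - self.time_mean)
        second_distance = abs(freq_param - self.freq_mean)
        voice = first_distance > self.time_thresh or second_distance > self.freq_thresh
        if not voice:
            # Only background noise frames update the sliding means.
            self.time_mean = self.alpha * self.time_mean + (1 - self.alpha) * time_param
            self.freq_mean = self.alpha * self.freq_mean + (1 - self.alpha) * freq_param
        return voice
```

Because the means track the recent noise frames, the same thresholds are applied relative to a moving baseline, which gives the decision a simple form of the adaptive behavior the abstract describes.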
  • Patent number: 8554564
    Abstract: A rule-based end-pointer isolates spoken utterances contained within an audio stream from background noise and non-speech transients. The rule-based end-pointer includes a plurality of rules to determine the beginning and/or end of a spoken utterance based on various speech characteristics. The rules may analyze an audio stream or a portion of an audio stream based upon an event, a combination of events, the duration of an event, or a duration relative to an event. The rules may be manually or dynamically customized depending upon factors that may include characteristics of the audio stream itself, an expected response contained within the audio stream, or environmental conditions.
    Type: Grant
    Filed: April 25, 2012
    Date of Patent: October 8, 2013
    Assignee: QNX Software Systems Limited
    Inventors: Phil Hetherington, Alex Escott
  • Patent number: 8543391
    Abstract: Disclosed is a method of improving sound quality, including: receiving a transmission signal of a first user equipment; removing noise in the transmission signal using noise information of the first user equipment side; performing speech reinforcement with respect to the noise-removed transmission signal using noise information of a second user equipment side; and transmitting the speech-reinforced transmission signal to the second user equipment.
    Type: Grant
    Filed: September 12, 2011
    Date of Patent: September 24, 2013
    Assignee: Industry-Academic Cooperation Foundation, Yonsei University
    Inventors: Hong Goo Kang, Min Seok Choi, Ho Seon Shin
  • Patent number: 8532986
    Abstract: A speech signal evaluation apparatus includes: an acquisition unit that acquires, as a first frame, a speech signal of a specified length from speech signals; a first detection unit that detects, on the basis of a speech condition, whether the first frame is voiced or unvoiced; a variation calculation unit that, when the first frame is unvoiced, calculates a variation in a spectrum associated with the first frame on the basis of a spectrum of the first frame and a spectrum of a second frame that is unvoiced and precedes the first frame in time; and a second detection unit that detects, on the basis of a non-stationary condition based on the variation in spectrum, whether the variation of the first frame satisfies the non-stationary condition.
    Type: Grant
    Filed: March 24, 2010
    Date of Patent: September 10, 2013
    Assignee: Fujitsu Limited
    Inventor: Chikako Matsumoto
  • Patent number: 8532989
    Abstract: A command recognition device includes: an utterance understanding unit that determines or selects word sequence information from speech information; a speech confidence degree calculating unit that calculates a degree of speech confidence based on the speech information and the word sequence information; a phrase confidence degree calculating unit that calculates a degree of phrase confidence based on image information and phrase information included in the word sequence information; and a motion control instructing unit that determines whether a command of the word sequence information should be executed based on the degree of speech confidence and the degree of phrase confidence.
    Type: Grant
    Filed: September 2, 2010
    Date of Patent: September 10, 2013
    Assignee: Honda Motor Co., Ltd.
    Inventors: Kotaro Funakoshi, Mikio Nakano, Xiang Zuo, Naoto Iwahashi, Ryo Taguchi
  • Patent number: 8504358
    Abstract: In voice recording equipment and an associated method, voice data from a speaker is received using a microphone. Threshold values T1 and T2 of the surrounding environment of the voice recording equipment are determined. If an intensity of the voice data is less than or equal to the threshold value T2, the voice recording is stopped and the speaker is informed that the voice data is not useful. If the intensity of the voice data is greater than the threshold values, the voice data is stored into a storage unit.
    Type: Grant
    Filed: October 28, 2010
    Date of Patent: August 6, 2013
    Assignees: Ambit Microsystems (Shanghai) Ltd., Hon Hai Precision Industry Co., Ltd.
    Inventors: Hong Kang, Guo-Zhi Ding, Chi-Ming Lu
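Illustrative sketch for the abstract above (not taken from the patent): the two-threshold rule can be written as a small decision function. Measuring intensity as root-mean-square amplitude, the function names `rms` and `decide`, and the `"unspecified"` result for intensities the abstract does not cover are assumptions; the abstract does not say how T1 and T2 are derived from the environment, so they are passed in as plain numbers.

```python
import math


def rms(samples):
    """Root-mean-square amplitude, used here as an assumed intensity measure."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def decide(intensity, t1, t2):
    """Apply the two-threshold rule described in the abstract."""
    if intensity <= t2:
        # Stop recording and tell the speaker the data is not useful.
        return "stop"
    if intensity > max(t1, t2):
        # Intensity exceeds both thresholds: store the voice data.
        return "store"
    # The abstract does not state what happens between the thresholds.
    return "unspecified"
```

For example, `decide(rms(frame), t1, t2)` could be called once per captured frame, with the caller acting on the returned label.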
  • Patent number: 8478595
    Abstract: A fundamental frequency pattern generation apparatus includes a first storage unit including representative vectors each corresponding to a prosodic control unit and having a section for changing the number of phonemes, a second storage unit including a rule to select a vector corresponding to an input context, a selection unit configured to select a vector from the representative vectors by applying the rule to the context and output the selected vector, a calculation unit configured to calculate an expansion/contraction ratio of the section of the selected vector in a time-axis direction based on a designated value for a specific feature amount related to a length of a fundamental frequency pattern to be generated, the designated value of the feature amount being required of the fundamental frequency pattern to be generated, and an expansion/contraction unit configured to expand/contract the selected vector based on the expansion/contraction ratio to generate the fundamental frequency pattern.
    Type: Grant
    Filed: September 5, 2008
    Date of Patent: July 2, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Nobuaki Mizutani
  • Patent number: 8463600
    Abstract: A system and method for automatically adjusting floor controls based on conversational characteristics is provided. Audio streams are received, which each originate from an audio source. Floor controls for a current configuration including at least a portion of the audio streams are maintained. Conversational characteristics shared by two or more of the audio sources are determined. Possible configurations for the audio streams are identified based on the conversational characteristics. An analysis of the current configuration and the possible configurations is performed. A change threshold comprising a minimum number of timeslices for at least one of the current configuration and one of the possible configurations is applied to the analysis. When the analysis satisfies the change threshold, the floor controls are automatically adjusted. The audio streams are mixed into one or more outputs based on the adjusted floor controls.
    Type: Grant
    Filed: February 27, 2012
    Date of Patent: June 11, 2013
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Paul Masami Aoki, Margaret H. Szymanski, James D. Thornton, Daniel H. Wilson, Allison Gyle Woodruff
  • Patent number: 8452591
    Abstract: A device comprising an audio information processor to receive at least one audio stream encoded according to a first protocol by a remote network processing device, the audio stream having associated comfort noise information to indicate a level of background noise available for presentation during silence periods associated with the audio stream, the audio information processor to decode the received audio stream according to the first protocol and to encode the decoded audio stream according to a second protocol, and a background noise translator to convert the comfort noise information received with the audio stream into a format compatible with the second protocol.
    Type: Grant
    Filed: April 11, 2008
    Date of Patent: May 28, 2013
    Assignee: Cisco Technology, Inc.
    Inventors: Herbert Wildfeuer, Robert Simon
  • Patent number: 8438016
    Abstract: A client for silence-based adaptive real-time voice and video (SAVV) transmission methods and systems detects the activity of a voice stream of conversational speech, aggressively transmits the corresponding video frames if silence in the sending or receiving voice stream has been detected, and adaptively generates and transmits key frames of the video stream according to characteristics of the conversational speech. In one aspect, a coordination management module generates video frames, segmentation and transmission strategies according to feedback from a voice encoder of the SAVV client and the user's instructions. In another aspect, the coordination management module generates video frames, segmentation and transmission strategies according to feedback from a voice decoder of the SAVV client and the user's instructions. In one example, the coordination management module adaptively generates a key video frame when silence is detected in the receiving voice stream.
    Type: Grant
    Filed: April 10, 2008
    Date of Patent: May 7, 2013
    Assignee: City University of Hong Kong
    Inventors: Weijia Jia, Lizhuo Zhang, Huan Li, Wenyan Lu
  • Patent number: 8432935
    Abstract: Techniques are presented herein to provide tandem-free operation between two wireless terminals through two otherwise incompatible wireless networks. Specifically, embodiments provide tandem-free operation between a wireless terminal communicating through a continuous transmission (CTX) wireless channel to a wireless terminal communicating through a discontinuous transmission (DTX) wireless channel. In a first aspect, inactive speech frames are translated between DTX and CTX formats. In a second aspect, each wireless terminal includes an active speech decoder that is compatible with the active speech encoder on the opposite end of the mobile-to-mobile connection.
    Type: Grant
    Filed: July 29, 2008
    Date of Patent: April 30, 2013
    Assignee: QUALCOMM Incorporated
    Inventors: Khaled El-Maleh, Ananthapadmanabhan A. Kandhadai, Sharath Manjunath
  • Patent number: 8417524
    Abstract: Analyzing an audio interaction is provided. At least one change in an emotion of a speaker in an audio interaction and at least one aspect of the audio interaction are identified. The at least one change in an emotion is analyzed in conjunction with the at least one aspect to determine a relationship between the at least one change in an emotion and the at least one aspect, and a result of the analysis is provided.
    Type: Grant
    Filed: February 11, 2010
    Date of Patent: April 9, 2013
    Assignee: International Business Machines Corporation
    Inventors: Om D. Deshmukh, Chitra Dorai, Shailesh Joshi, Maureen E. Rzasa, Ashish Verma, Karthik Visweswariah, Gary J. Wright, Sai Zeng
  • Patent number: 8380495
    Abstract: The embodiments of a transcoding method, a transcoding device, and a communication apparatus are provided. The embodiment of a method includes: receiving a bit stream input from a sending end; determining an attribute of discontinuous transmission (DTX) used by a receiving end and a frame type of the input bit stream; and transcoding the input bit stream in a corresponding processing manner according to a determination result. Thereby, a corresponding transcoding operation is performed on the input bit stream according to the attribute of DTX used by the receiving end and the frame type of the input bit stream. In such a manner, input bit streams of various types can be processed, and the input bit streams can be correspondingly transcoded according to the requirements of the receiving end. Therefore, the average computational complexity and peak computational complexity can be effectively decreased without decreasing the quality of the synthesized speech.
    Type: Grant
    Filed: January 21, 2010
    Date of Patent: February 19, 2013
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Changchun Bao, Hao Xu, Fanrong Tang, Xiangyu Hu
  • Patent number: 8380494
    Abstract: The method and system disclosed herein reduce the total bandwidth requirement for communication in a voice over Internet protocol application. Sample [101] and convert [102] the analog input audio signal into digital signals and derive sampled frames [103]. Compute spacings of order statistics [104]. Measure the entropy for each of the sampled frames [105]. Set a threshold for entropy [106]. Mark the audio frames as active speech frames or inactive speech frames [107]. Mark an audio frame as an inactive speech frame when the entropy is greater than the threshold, and mark the audio frame as an active speech frame when the entropy is less than the threshold [107]. Transmit only the active speech frames [108].
    Type: Grant
    Filed: January 24, 2007
    Date of Patent: February 19, 2013
    Assignee: P.E.S. Institute of Technology
    Inventors: Muralishankar Rangarao, Vijay Satyanarayana Rao, Venkatesha Prasad Rangarao, Shankar Hebbale Narasimhiah
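Illustrative sketch for the abstract above (not taken from the patent): the abstract names spacings of order statistics and a per-frame entropy threshold but does not give the estimator. The code below swaps in Vasicek's m-spacing estimator of differential entropy, a well-known estimator built on spacings of order statistics; the function names, the default window `m`, the `eps` guard, and the threshold values are assumptions.

```python
import math


def spacing_entropy(frame, m=None, eps=1e-12):
    """Vasicek-style m-spacing estimate of a frame's differential entropy."""
    n = len(frame)
    if m is None:
        m = max(1, round(math.sqrt(n)))  # common rule of thumb (assumed)
    xs = sorted(frame)                   # order statistics of the frame
    total = 0.0
    for i in range(n):
        upper = xs[min(i + m, n - 1)]    # clamp indices at the frame edges
        lower = xs[max(i - m, 0)]
        # eps guards against log(0) when samples repeat.
        total += math.log(max((n / (2 * m)) * (upper - lower), eps))
    return total / n


def mark_active(frames, threshold):
    """True marks an active frame: entropy below the threshold, per the abstract."""
    return [spacing_entropy(f) < threshold for f in frames]


def frames_to_transmit(frames, threshold):
    """Keep only the frames marked active."""
    return [f for f, active in zip(frames, mark_active(frames, threshold)) if active]
```

Differential entropy shifts by the logarithm of any gain applied to the samples, so a practical detector would probably normalize frame level or adapt the threshold; this sketch leaves that choice to the caller.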
  • Patent number: 8374852
    Abstract: Disclosed is a code conversion method to convert a first code sequence conforming to a first speech coding scheme into a second code sequence conforming to a second speech coding scheme. The method includes the following steps. The first step discriminates whether the first code sequence corresponds to a speech part or to a non-speech part, and generates a numerical value that indicates the discrimination result as a control flag. The second step converts the first code sequence into the second code sequence and outputs said second code sequence, when the value of the control flag corresponds to the speech part. The third step outputs the second code sequence that corresponds to the value of the control flag, when the value of the control flag corresponds to the non-speech part.
    Type: Grant
    Filed: March 16, 2006
    Date of Patent: February 12, 2013
    Assignee: NEC Corporation
    Inventor: Atsushi Murashima
  • Patent number: 8374860
    Abstract: Encoding audio signals by selecting an encoding mode for encoding the signal, categorizing the signal into active segments having voice activity and non-active segments having substantially no voice activity by using categorization parameters depending on the selected encoding mode, and encoding at least the active segments using the selected encoding mode.
    Type: Grant
    Filed: September 29, 2011
    Date of Patent: February 12, 2013
    Assignee: Core Wireless Licensing S.A.R.L.
    Inventors: Kari Jarvinen, Pasi Ojala, Ari Lakaniemi
  • Patent number: 8370132
    Abstract: Apparatus and methods are provided for measuring perceptual quality of a signal transmitted over a communication network, such as a circuit-switching network, packet-switching network, or a combination thereof. In accordance with one embodiment, a distributed apparatus is provided for measuring perceptual quality of a signal transmitted over a communication network. The distributed apparatus includes communication ports located at various locations in the network. The distributed apparatus may also include a signal processor including a processor for providing non-intrusive measurement of the perceptual quality of the signal. The distributed apparatus may further include recorders operatively connected to the communication ports and to the signal processor, wherein at least one of the recorders processes the signal at one of the communication ports and the recorder sends the signal to the signal processor to measure the perceptual quality of the signal.
    Type: Grant
    Filed: November 21, 2005
    Date of Patent: February 5, 2013
    Assignee: Verizon Services Corp.
    Inventor: Adrian E. Conway