Voiced Or Unvoiced Patents (Class 704/214)

Automated detection and filtering of audio advertisements

Patent number: 9183177

Abstract: Methods, apparatuses, and media for filtering a data stream are provided. The data stream is partitioned into a plurality of data stream segments. An acoustic parameter of each of the data stream segments is measured, and it is determined whether the acoustic parameter of each of the data stream segments satisfies a predetermined condition. Extraneous segments of the data stream segments are identified in which the predetermined condition is satisfied, and it is determined whether the extraneous segments have a predetermined relationship in the data stream. The extraneous segments are deleted from the data stream to produce a filtered data stream in response to the extraneous segments having the predetermined relationship.

Type: Grant

Filed: April 22, 2013

Date of Patent: November 10, 2015

Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.

Inventors: Yeon-Jun Kim, I. Dan Melamed, Steven Neil Tischer, Bernard S. Renger
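
The abstract of patent 9183177 above describes a partition, measure, test, and delete pipeline. A minimal Python sketch of that flow follows. It assumes the acoustic parameter is the mean absolute amplitude, the predetermined condition is "above a threshold", and the predetermined relationship is "at least `min_run` consecutive flagged segments". These choices and all names are illustrative assumptions, not details taken from the patent.

```python
def filter_stream(samples, seg_len, threshold, min_run):
    """Drop runs of >= min_run consecutive segments whose mean |amplitude| exceeds threshold."""
    # Partition the stream into fixed-length segments.
    segments = [samples[i:i + seg_len] for i in range(0, len(samples), seg_len)]
    # Measure the (assumed) acoustic parameter and test the (assumed) condition.
    flagged = [sum(abs(s) for s in seg) / len(seg) > threshold for seg in segments]

    kept = []
    i = 0
    while i < len(segments):
        # Find the end of the current run of equally flagged segments.
        j = i
        while j < len(segments) and flagged[j] == flagged[i]:
            j += 1
        # Delete a flagged run only if it satisfies the (assumed) relationship.
        if not (flagged[i] and j - i >= min_run):
            for seg in segments[i:j]:
                kept.extend(seg)
        i = j
    return kept
```

In this sketch an isolated loud segment is kept, because a single flagged segment does not satisfy the assumed run-length relationship.
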
Voice enabled remote control for a set-top box

Patent number: 9135809

Abstract: A remote control device includes a digital audio storage device, a talk button, and an optical distance measurer. The digital audio storage device is configured to continually record an audio input for a specific amount of time. The talk button is coupled to the digital audio storage device and is configured to initiate a transmission of the audio input to a set-top box device. The optical distance measurer is coupled to the talk button and is configured to automatically measure a distance to a user in response to the talk button being pressed.

Type: Grant

Filed: June 20, 2008

Date of Patent: September 15, 2015

Assignee: AT&T Intellectual Property I, LP

Inventors: Hisao M. Chang, Iker Arizmendi
Voice activity detection system, method, and program product

Patent number: 9070375

Abstract: A voice activity detection method in a low SNR environment. The voice activity detection is performed by extracting a long-term spectrum variation component and a harmonic structure as feature vectors from a speech signal and increasing difference in feature vectors between speech and non-speech (i) using the long-term spectrum variation component feature or (ii) using a long-term spectrum variation component extraction and a harmonic structure feature extraction. A correct rate and an accuracy rate of the voice activity detection are improved over conventional methods by using a long-term spectrum variation component having a window length over an average phoneme duration of an utterance in the speech signal. The voice activity detection system and method provides speech processing, automatic speech recognition, and speech output capable of very accurate voice activity detection.

Type: Grant

Filed: February 27, 2009

Date of Patent: June 30, 2015

Assignee: International Business Machines Corporation

Inventors: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
Voice activity detection/silence suppression system

Patent number: 9009034

Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.

Type: Grant

Filed: November 12, 2014

Date of Patent: April 14, 2015

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Bing Chen, James H. James
Program endpoint time detection apparatus and method, and program information retrieval system

Patent number: 9009054

Abstract: This invention relates to retrieval for multimedia content, and provides a program endpoint time detection apparatus for detecting an endpoint time of a program by performing processing on audio signals of said program, comprising an audio classification unit for classifying said audio signals into a speech signal portion and a non-speech signal portion; a keyword retrieval unit for retrieving, as a candidate endpoint keyword, an endpoint keyword indicating start or end of the program from said speech signal portion; a content analysis unit for performing content analysis on context of the candidate endpoint keyword retrieved by the keyword retrieval unit to determine whether the candidate endpoint keyword is a valid endpoint keyword; and a program endpoint time determination unit for performing statistics analysis based on the retrieval result of said keyword retrieval unit and the determination result of said content analysis unit, and determining the endpoint time of the program.

Type: Grant

Filed: October 28, 2010

Date of Patent: April 14, 2015

Assignees: Sony Corporation, Institute of Acoustics, Chinese Academy of Sciences

Inventors: Kun Liu, Weiguo Wu, Li Lu, Qingwei Zhao, Yonghong Yan, Hongbin Suo
Methods and systems for synchronizing media

Patent number: 8996380

Abstract: Systems and methods of synchronizing media are provided. A client device may be used to capture a sample of a media stream being rendered by a media rendering source. The client device sends the sample to a position identification module to determine a time offset indicating a position in the media stream corresponding to the sampling time of the sample, and optionally a timescale ratio indicating a speed at which the media stream is being rendered by the media rendering source based on a reference speed of the media stream. The client device calculates a real-time offset using a present time, a timestamp of the media sample, the time offset, and optionally the timescale ratio. The client device then renders a second media stream at a position corresponding to the real-time offset to be in synchrony to the media stream being rendered by the media rendering source.

Type: Grant

Filed: May 4, 2011

Date of Patent: March 31, 2015

Assignee: Shazam Entertainment Ltd.

Inventors: Avery Li-Chun Wang, Rahul Powar, William Michael Mills, Christopher Jacques Penrose Barton, Philip Georges Inghelbrecht, Dheeraj Shankar Mukherjee
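
The abstract of patent 8996380 above lists the inputs to the real-time offset (present time, sample timestamp, time offset, and optional timescale ratio) without stating how they combine. A minimal Python sketch, assuming a linear extrapolation from the sampled position; the function and parameter names are illustrative.

```python
def real_time_offset(now, sample_timestamp, time_offset, timescale_ratio=1.0):
    """Estimate the current position in the media stream, in seconds.

    Assumed model: the stream has advanced by the wall-clock time elapsed
    since the sample was captured, scaled by the playback speed ratio.
    """
    elapsed = now - sample_timestamp
    return time_offset + elapsed * timescale_ratio
```

For example, a sample captured at wall-clock time 100.0 s that maps to position 30.0 s in the stream gives a position of 35.0 s at wall-clock time 105.0 s when the ratio is 1.0.
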
Artifact reduction in time compression

Patent number: 8996389

Abstract: Various techniques are disclosed for reducing artifacts generated by time compression by adapting the time compression based on the state of the received audio. The amount of time compression may be bounded based on audio characteristics. Another feature provides a way of determining the most correlated portions of segments of audio. Voiced speech may be distinguished from unvoiced speech. Another feature provides a way of distinguishing between silence, voiced speech, and unvoiced speech. Time compression may be adapted during periods of lengthy silence. Another feature allows for reducing time compression during sensitive portions of the received audio. One or more of these features may be present in different embodiments.

Type: Grant

Filed: June 14, 2011

Date of Patent: March 31, 2015

Assignee: Polycom, Inc.

Inventor: Eric David Elias
Automatic calibration of command-detection thresholds

Patent number: 8990079

Abstract: When a voice-activated device or application is first started, the signal levels corresponding to spoken commands are initially unknown, making it difficult to set detection thresholds. The inventive method provides an initial command-detection threshold based on the noise level alone. The first command is detected using this initial threshold. Then the threshold is revised according to the first command sound, and a second command is detected using the revised threshold. After detecting each command, the detection threshold is further refined according to the current noise and command sounds. Methods are also disclosed for optimizing the thresholds, adjusting parameters according to sound, and detecting voiced and unvoiced sounds separately. These capabilities enable many emerging voice-activated products and applications.

Type: Grant

Filed: September 17, 2014

Date of Patent: March 24, 2015

Assignee: Zanavox

Inventor: David Edward Newman
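
The threshold bootstrapping described in the abstract of patent 8990079 above can be sketched as follows. This Python sketch assumes the initial threshold is a fixed multiple of the noise level, that the revised threshold sits midway between the noise level and the last command level, and that noise is tracked with a slow exponential average. None of these rules or names come from the patent; they are placeholders for illustration.

```python
class CommandDetector:
    def __init__(self, noise_level, initial_factor=3.0):
        self.noise_level = noise_level
        # Initial threshold derived from the noise level alone.
        self.threshold = initial_factor * noise_level

    def detect(self, level):
        """Return True if `level` counts as a command, refining the threshold afterwards."""
        if level > self.threshold:
            # Assumed refinement: midway between current noise and command level.
            self.threshold = (self.noise_level + level) / 2.0
            return True
        # Assumed noise tracking: slow exponential average of non-command levels.
        self.noise_level = 0.9 * self.noise_level + 0.1 * level
        return False
```

With a noise level of 1.0, the first command at level 11.0 raises the threshold from 3.0 to 6.0, so a later sound at level 5.0 is no longer treated as a command.
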
Coding and decoding a transient frame

Patent number: 8990094

Abstract: An electronic device for coding a transient frame is described. The electronic device includes a processor and executable instructions stored in memory that is in electronic communication with the processor. The electronic device obtains a current transient frame. The electronic device also obtains a residual signal based on the current transient frame. Additionally, the electronic device determines a set of peak locations based on the residual signal. The electronic device further determines whether to use a first coding mode or a second coding mode for coding the current transient frame based on at least the set of peak locations. The electronic device also synthesizes an excitation based on the first coding mode if the first coding mode is determined. The electronic device also synthesizes an excitation based on the second coding mode if the second coding mode is determined.

Type: Grant

Filed: September 8, 2011

Date of Patent: March 24, 2015

Assignee: QUALCOMM Incorporated

Inventors: Venkatesh Krishnan, Ananthapadmanabhan Arasanipalai Kandhadai
Conferencing system, server, image display method, and computer program product

Patent number: 8984061

Abstract: The conferencing system is composed of computers, a moderator's computer, and a projector connected on a network. The moderator's computer receives image data from the computers, and generates synthesized image data therefrom, which is transmitted to the projector for display of the synthesized image. The moderator's computer has the capability to switch the image being projected by the projector from the synthesized image to an image handled by one of the computers or by the moderator's computer. With such an arrangement, utilizing existing hardware resources it will be possible to display in a single split-screen display the images handled by the terminals connected on the network. Additionally, it will be possible to switch smoothly between on-screen displays, and to reduce the burden on the on-screen display operator in a networked conferencing system.

Type: Grant

Filed: August 6, 2008

Date of Patent: March 17, 2015

Assignee: Seiko Epson Corporation

Inventor: Noboru Inoue
System for spectrum sensing of multi-carrier signals with equidistant sub-carriers

Patent number: 8982971

Abstract: A multi-carrier signal is typically comprised of many equidistant sub-carriers. This results in periodicity of spectrum within the bandwidth of such a multi-carrier signal. An unknown multi-carrier signal with equidistant sub-carriers can thus be sensed together with its sub-carrier spacing by finding a discernable local maximum in the cepstrum (Fourier transform of the log spectrum) of the multi-carrier signal.

Type: Grant

Filed: March 29, 2012

Date of Patent: March 17, 2015

Assignee: QRC, Inc.

Inventors: Sinisa Peric, Thomas F. Callahan, III
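
The cepstrum-based sensing described in the abstract of patent 8982971 above can be illustrated with a short Python sketch. It uses a naive DFT from the standard library, clamps the spectrum to a floor before taking the logarithm, and picks the first strong cepstral peak. The floor, the 0.9 peak criterion, and all names are assumptions made for this example only.

```python
import cmath
import math

def dft_magnitudes(values):
    """Naive O(N^2) DFT magnitudes; adequate for a short illustration."""
    n = len(values)
    return [abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(values)))
            for k in range(n)]

def estimate_spacing_bins(signal, floor=1e-6):
    """Estimate the sub-carrier spacing, in DFT bins, from the cepstrum."""
    n = len(signal)
    spectrum = dft_magnitudes(signal)
    peak = max(spectrum)
    # Clamp to a floor so empty bins do not produce log(0).
    log_spectrum = [math.log(max(m, floor * peak)) for m in spectrum]
    cepstrum = dft_magnitudes(log_spectrum)
    # Skip quefrency 0 (overall level) and search up to n/2.
    half = cepstrum[1:n // 2 + 1]
    strongest = max(half)
    # Take the first quefrency close to the maximum; later ones are multiples of it.
    quefrency = next(q for q, c in enumerate(half, start=1) if c >= 0.9 * strongest)
    # A spectrum that repeats every P bins peaks at quefrency n / P.
    return n / quefrency

# Synthetic multi-carrier signal: 64 samples, sub-carriers at bins 8, 16, and 24.
signal = [sum(math.cos(2 * math.pi * b * i / 64) for b in (8, 16, 24))
          for i in range(64)]
spacing = estimate_spacing_bins(signal)
```

The synthetic signal has sub-carriers every 8 bins, so the log spectrum repeats with that period and the first strong cepstral peak appears at quefrency 8.
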
Unvoiced/Voiced Decision for Speech Processing

Publication number: 20150073783

Abstract: In accordance with an embodiment of the present invention, a method for speech processing includes determining an unvoicing/voicing parameter reflecting a characteristic of unvoiced/voicing speech in a current frame of a speech signal comprising a plurality of frames. A smoothed unvoicing/voicing parameter is determined to include information of the unvoicing/voicing parameter in a frame prior to the current frame of the speech signal. A difference between the unvoicing/voicing parameter and the smoothed unvoicing/voicing parameter is computed. The method further includes generating an unvoiced/voiced decision point for determining whether the current frame comprises unvoiced speech or voiced speech using the computed difference as a decision parameter.

Type: Application

Filed: September 3, 2014

Publication date: March 12, 2015

Inventor: Yang Gao
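
The decision rule in the abstract of publication 20150073783 above compares a per-frame parameter with its smoothed history. A minimal Python sketch follows, assuming the parameter rises for unvoiced frames, that smoothing is a first-order exponential average, and that the decision switches only when the difference exceeds a margin. The smoothing constant, the margin, the initial state, and the names are all illustrative assumptions.

```python
def unvoiced_voiced_decisions(params, alpha=0.9, margin=0.05):
    """Label each frame 'voiced' or 'unvoiced' from a per-frame unvoicing parameter."""
    if not params:
        return []
    smoothed = params[0]
    state = "voiced"  # assumed initial state
    decisions = []
    for p in params:
        # The smoothed parameter carries information from earlier frames.
        smoothed = alpha * smoothed + (1.0 - alpha) * p
        # The difference between current and smoothed values drives the decision.
        diff = p - smoothed
        if diff > margin:
            state = "unvoiced"
        elif diff < -margin:
            state = "voiced"
        decisions.append(state)
    return decisions
```

Using the difference rather than the raw parameter makes the decision relative to recent history, so a fixed absolute threshold is not needed in this sketch.
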
Method for spectrum sensing of multi-carrier signals with equidistant sub-carriers

Patent number: 8976906

Abstract: A multi-carrier signal is typically comprised of many equidistant sub-carriers. This results in periodicity of spectrum within the bandwidth of such a multi-carrier signal. An unknown multi-carrier signal with equidistant sub-carriers can thus be sensed together with its sub-carrier spacing by finding a discernible local maximum in the cepstrum (Fourier transform of the log spectrum) of the multi-carrier signal.

Type: Grant

Filed: March 29, 2012

Date of Patent: March 10, 2015

Assignee: QRC, Inc.

Inventors: Sinisa Peric, Thomas F. Callahan, III
Coding with noise shaping in a hierarchical coder

Patent number: 8965773

Abstract: A method is provided for hierarchical coding of a digital audio signal comprising, for a current frame of the input signal: a core coding, delivering a scalar quantization index for each sample of the current frame and at least one enhancement coding delivering indices of scalar quantization for each coded sample of an enhancement signal. The enhancement coding comprises a step of obtaining a filter for shaping the coding noise used to determine a target signal and in that the indices of scalar quantization of said enhancement signal are determined by minimizing the error between a set of possible values of scalar quantization and said target signal. The coding method can also comprise a shaping of the coding noise for the core bitrate coding. A coder implementing the coding method is also provided.

Type: Grant

Filed: November 17, 2009

Date of Patent: February 24, 2015

Assignee: Orange

Inventors: Balazs Kovesi, Stéphane Ragot, Alain Le Guyader
System and method for automatic identification of speech coding scheme

Patent number: 8959025

Abstract: Methods and systems for extracting speech from such packet streams. The methods and systems analyze the encoded speech in a given packet stream, and automatically identify the actual speech coding scheme that was used to produce it. These techniques may be used, for example, in interception systems where the identity of the actual speech coding scheme is sometimes unavailable or inaccessible. For instance, the identity of the actual speech coding scheme may be sent in a separate signaling stream that is not intercepted. As another example, the identity of the actual speech coding scheme may be sent in the same packet stream as the encoded speech, but in encrypted form.

Type: Grant

Filed: April 28, 2011

Date of Patent: February 17, 2015

Assignee: Verint Systems Ltd.

Inventor: Genady Malinsky
Noise suppression in a Mel-filtered spectral domain

Patent number: 8942975

Abstract: Techniques are described herein that suppress noise in a Mel-filtered spectral domain. For example, a window may be applied to a representation of a speech signal in a time domain. The windowed representation in the time domain may be converted to a subsequent representation of the speech signal in the Mel-filtered spectral domain. A noise suppression operation may be performed with respect to the subsequent representation to provide noise-suppressed Mel coefficients.

Type: Grant

Filed: March 22, 2011

Date of Patent: January 27, 2015

Assignee: Broadcom Corporation

Inventor: Jonas Borgstrom
Voice conversion method and system

Patent number: 8930183

Abstract: A method of converting speech from the characteristics of a first voice to the characteristics of a second voice, the method comprising: receiving a speech input from a first voice, dividing said speech input into a plurality of frames; mapping the speech from the first voice to a second voice; and outputting the speech in the second voice, wherein mapping the speech from the first voice to the second voice comprises, deriving kernels demonstrating the similarity between speech features derived from the frames of the speech input from the first voice and stored frames of training data for said first voice, the training data corresponding to different text to that of the speech input and wherein the mapping step uses a plurality of kernels derived for each frame of input speech with a plurality of stored frames of training data of the first voice.

Type: Grant

Filed: August 25, 2011

Date of Patent: January 6, 2015

Assignee: Kabushiki Kaisha Toshiba

Inventors: Byung Ha Chun, Mark John Francis Gales
Signal bandwidth extending apparatus

Patent number: 8930184

Abstract: A signal bandwidth extending apparatus including: a bandwidth extending section configured to extend a frequency bandwidth of a target signal, the target signal included in an input signal; a calculating section configured to calculate a degree of the target signal included in the input signal; and a controller configured to change a method of extending the frequency bandwidth by the bandwidth extending section according to a result of the calculating section.

Type: Grant

Filed: September 14, 2009

Date of Patent: January 6, 2015

Assignee: Kabushiki Kaisha Toshiba

Inventors: Takashi Sudo, Masataka Osada
Audio signal bandwidth extension in CELP-based speech coder

Patent number: 8924200

Abstract: A method for decoding an audio signal in a decoder having a CELP-based decoder element including a fixed codebook component, at least one pitch period value, and a first decoder output, wherein a bandwidth of the audio signal extends beyond a bandwidth of the CELP-based decoder element. The method includes obtaining an up-sampled fixed codebook signal by up-sampling the fixed codebook component to a higher sample rate, obtaining an up-sampled excitation signal based on the up-sampled fixed codebook signal and an up-sampled pitch period value, and obtaining a composite output signal based on the up-sampled excitation signal and an output signal of the CELP-based decoder element, wherein the composite output signal includes a bandwidth portion that extends beyond a bandwidth of the CELP-based decoder element.

Type: Grant

Filed: September 28, 2011

Date of Patent: December 30, 2014

Assignee: Motorola Mobility LLC

Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
Identifying spoken commands by templates of ordered voiced and unvoiced sound intervals

Patent number: 8924209

Abstract: A method is disclosed for identifying a spoken command by detecting intervals of voiced and unvoiced sound, and then comparing the order of voiced and unvoiced sounds to a set of templates. Each template represents one of the predetermined acceptable commands of the application, and is associated with a predetermined action. When the order of voiced and unvoiced intervals in the spoken command matches the order in one of the templates, the associated action is thus selected. Silent intervals in the command may also be included for enhanced recognition. Efficient protocols are disclosed for discriminating voiced and unvoiced sounds, and for detecting the beginning and ending of each sound interval in the command, and for comparing the command sequence to the templates. In a sparse-command application, this method provides fast and robust recognition, and can be implemented with low-cost hardware and extremely minimal software.

Type: Grant

Filed: September 12, 2012

Date of Patent: December 30, 2014

Assignee: Zanavox

Inventor: David Edward Newman
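
The template comparison described in the abstract of patent 8924209 above can be sketched in Python as below. The sketch assumes each frame has already been labelled `V` (voiced), `U` (unvoiced), or `S` (silent), and that silent intervals are ignored when matching. The label scheme, the template patterns, and the action names are hypothetical.

```python
from itertools import groupby

def sound_pattern(frame_labels):
    """Collapse per-frame labels into the order of sound intervals, e.g. 'VUV'."""
    # groupby merges consecutive identical labels into one interval;
    # silent intervals are then dropped (a simplifying assumption).
    return "".join(label for label, _ in groupby(frame_labels) if label != "S")

def identify_command(frame_labels, templates):
    """Return the action whose template matches the interval order, or None."""
    return templates.get(sound_pattern(frame_labels))
```

A command whose frames read `SSVVVUUVVS` collapses to `VUV` and selects whichever action is registered under that pattern.
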
Method and apparatus for wind noise detection and suppression using multiple microphones

Patent number: 8924204

Abstract: Unlike sound based pressure waves that go everywhere, air turbulence caused by wind is usually a fairly local event. Therefore, in a system that utilizes two or more spatially separated microphones to pick up sound signals (e.g., speech), wind noise picked up by one of the microphones often will not be picked up (or at least not to the same extent) by the other microphone(s). Embodiments of methods and apparatuses that utilize this fact and others to effectively detect and suppress wind noise using multiple microphones that are spatially separated are described.

Type: Grant

Filed: September 30, 2011

Date of Patent: December 30, 2014

Assignee: Broadcom Corporation

Inventors: Juin-Hwey Chen, Jes Thyssen, Xianxian Zhang, Huaiyu Zeng
Voice activity detection/silence suppression system

Patent number: 8909519

Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.

Type: Grant

Filed: March 10, 2014

Date of Patent: December 9, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Bing Chen, James H. James
Systems, methods, and apparatus for voice activity detection

Patent number: 8898058

Abstract: Systems, methods, apparatus, and machine-readable media for voice activity detection in a single-channel or multichannel audio signal are disclosed.

Type: Grant

Filed: October 24, 2011

Date of Patent: November 25, 2014

Assignee: QUALCOMM Incorporated

Inventors: Jongwon Shin, Erik Visser, Ian Ernan Liu
Speech recognition system to evaluate speech signals, method thereof, and storage medium storing the program for speech recognition to evaluate speech signals

Patent number: 8886527

Abstract: A purpose is to suppress recognition process delay generated due to load in signal processing. Included is a speech input means 10 that inputs a speech signal, an output evaluation means 20 that evaluates whether or not the speech signal input by the speech input means 10 is the speech signal in a sound section, which is a speech section assuming that a speaker is speaking, and outputs the speech signal as a speech signal to be processed only when evaluated as the speech signal in the sound section, a signal processing means 30 that performs signal processing to the speech signal, which is output by the output evaluation means 20 as the speech signal to be processed, and a speech recognition processing means 40 that performs a speech recognition process to the speech signal which is signal-processed by the signal processing means 30.

Type: Grant

Filed: April 16, 2009

Date of Patent: November 11, 2014

Assignee: NEC Corporation

Inventor: Toru Iwasawa
Method and apparatus to evaluate quality of audio signal

Patent number: 8879762

Abstract: A method and apparatus to evaluate a quality of an audio signal, in which the number of effective channels is determined for each of a reference signal of a current frame and a test signal indicative of the reference signal that has passed through an audio codec, and an audio quality evaluation score of the current frame is calculated by evaluating an audio quality of the current frame based on the determined number of effective channels for each of the reference signal and the test signal by means of a predetermined evaluator.

Type: Grant

Filed: January 28, 2010

Date of Patent: November 4, 2014

Assignee: Samsung Electronics Co., Ltd.

Inventor: In-Yong Choi
System and method for automatic temporal alignment between music audio signal and lyrics

Patent number: 8880409

Abstract: A system provided herein may perform automatic temporal alignment between music audio signal and lyrics with higher accuracy than ever. A non-fricative section extracting portion 4 extracts non-fricative sound sections, where no fricative sounds exist, from the music audio signal. An alignment portion 17 includes a phone model 15 for singing voice capable of estimating phonemes corresponding to temporal-alignment features. The alignment portion 17 performs an alignment operation using as inputs temporal-alignment features obtained from a temporal-alignment feature extracting portion 11, information on vocal and non-vocal sections obtained from a vocal section estimating portion 9, and a phoneme network SN on conditions that no phonemes exist at least in non-vocal sections and that no fricative phonemes exist in non-fricative sound sections.

Type: Grant

Filed: February 5, 2009

Date of Patent: November 4, 2014

Assignee: National Institute of Advanced Industrial Science and Technology

Inventors: Hiromasa Fujihara, Masataka Goto
Audio signal bandwidth extension in CELP-based speech coder

Patent number: 8868432

Abstract: A method for decoding an audio signal having a bandwidth that extends beyond a bandwidth of a CELP excitation signal in an audio decoder including a CELP-based decoder element. The method includes obtaining a second excitation signal having an audio bandwidth extending beyond the audio bandwidth of the CELP excitation signal, obtaining a set of signals by filtering the second excitation signal with a set of bandpass filters, scaling the set of signals using a set of energy-based parameters, and obtaining a composite output signal by combining the scaled set of signals with a signal based on the audio signal decoded by the CELP-based decoder element.

Type: Grant

Filed: September 28, 2011

Date of Patent: October 21, 2014

Assignee: Motorola Mobility LLC

Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
Determining pitch cycle energy and scaling an excitation signal

Patent number: 8862465

Abstract: An electronic device for determining a set of pitch cycle energy parameters is described. The electronic device includes a processor and executable instructions stored in memory. The electronic device obtains a frame, a set of filter coefficients and a residual signal based on the frame and the set of filter coefficients. The electronic device determines a set of peak locations based on the residual signal and segments the residual signal such that each segment includes one peak. The electronic device determines a first set of pitch cycle energy parameters based on a frame region between two consecutive peak locations and maps regions between peaks in the residual signal to regions between peaks in a synthesized excitation signal to produce a mapping. The electronic device determines a second set of pitch cycle energy parameters based on the first set of pitch cycle energy parameters and the mapping.

Type: Grant

Filed: September 8, 2011

Date of Patent: October 14, 2014

Assignee: QUALCOMM Incorporated

Inventors: Venkatesh Krishnan, Stephane Pierre Villette
Method and system for managing time-sensitive packetized data streams at a receiver

Patent number: 8842534

Abstract: According to one embodiment of the invention, a method for managing time-sensitive packetized data streams at a receiver includes receiving a time-sensitive packet of a data stream, analyzing an energy level of a payload signal of the packet, and determining whether to drop the packet based on the energy level of the payload signal.

Type: Grant

Filed: January 23, 2012

Date of Patent: September 23, 2014

Assignee: Cisco Technology, Inc.

Inventors: Paul S. Hahn, Michael E. Knappe, Richard A. Dunlap, Luke K. Surazski
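
The energy-based drop decision in the abstract of patent 8842534 above can be sketched in Python. The abstract says only that the decision depends on the payload energy level; this sketch assumes that low-energy (near-silent) packets are the ones dropped and that the payload is available as decoded PCM samples. The threshold and names are illustrative.

```python
def should_drop(payload, energy_threshold):
    """Decide whether to drop a packet from the mean energy of its payload samples."""
    if not payload:
        return True  # assumed: an empty payload carries nothing worth playing
    energy = sum(s * s for s in payload) / len(payload)
    return energy < energy_threshold
```
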
Real time generation of audio content summaries

Patent number: 8825478

Abstract: Audio content is converted to text using speech recognition software. The text is then associated with a distinct voice or a generic placeholder label if no distinction can be made. From the text and voice information, a word cloud is generated based on key words and key speakers. A visualization of the cloud displays as it is being created. Words grow in size in relation to their dominance. When it is determined that the predominant words or speakers have changed, the word cloud is complete. That word cloud continues to be displayed statically and a new word cloud display begins based upon a new set of predominant words or a new predominant speaker or set of speakers. This process may continue until the meeting is concluded. At the end of the meeting, the completed visualization may be saved to a storage device, sent to selected individuals, removed, or any combination of the preceding.

Type: Grant

Filed: January 10, 2011

Date of Patent: September 2, 2014

Assignee: Nuance Communications, Inc.

Inventors: Susan Marie Cox, Janani Janakiraman, Fang Lu, Loulwa F Salem
Accurate fast forward rate when performing trick play with variable distance between frames

Patent number: 8792777

Abstract: The present invention is directed to system(s), method(s), and apparatus for accurate fast forward rate when performing trick play with variable distance between frames. In one embodiment, there is presented a circuit for providing a fast forward video sequence. The circuit comprises a system time clock for providing a time reference, said time reference incremented at a predetermined fast forward rate; a comparator for comparing the time reference with timing information associated with a picture; and a controller for determining whether to display the picture based at least in part on the comparison between the timing information and the time reference.

Type: Grant

Filed: January 10, 2007

Date of Patent: July 29, 2014

Assignee: Broadcom Corporation

Inventor: Tim Ross
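
The clock, comparator, and controller arrangement in the abstract of patent 8792777 above can be modelled in a few lines of Python. The sketch assumes the time reference starts at zero, advances by `rate * tick` on every display tick, and that the newest picture whose timestamp has been reached is the one displayed. The names and the selection rule are assumptions for illustration.

```python
def fast_forward(picture_times, rate, tick, num_ticks):
    """Return the timestamps of the pictures displayed during fast forward.

    picture_times: sorted presentation times in seconds, possibly unevenly spaced.
    """
    shown = []
    stc = 0.0  # system time clock, incremented at the fast forward rate
    index = 0
    for _ in range(num_ticks):
        stc += rate * tick
        latest = None
        # Comparator: consume every picture whose time the clock has reached.
        while index < len(picture_times) and picture_times[index] <= stc:
            latest = picture_times[index]
            index += 1
        # Controller: display only the newest such picture, skipping the rest.
        if latest is not None:
            shown.append(latest)
    return shown
```

Because the clock, not the frame count, drives the selection, the playback rate stays the same whether pictures are closely or widely spaced.
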
Voiced interval command interpretation

Patent number: 8781821

Abstract: A method is disclosed for controlling a voice-activated device by interpreting a spoken command as a series of voiced and non-voiced intervals. A responsive action is then performed according to the number of voiced intervals in the command. The method is well-suited to applications having a small number of specific voice-activated response functions. Applications using the inventive method offer numerous advantages over traditional speech recognition systems including speaker universality, language independence, no training or calibration needed, implementation with simple microcontrollers, and extremely low cost. For time-critical applications such as pulsers and measurement devices, where fast reaction is crucial to catch a transient event, the method provides near-instantaneous command response, yet versatile voice control.

Type: Grant

Filed: April 30, 2012

Date of Patent: July 15, 2014

Assignee: Zanavox

Inventor: David Edward Newman
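
The interval-counting scheme in the abstract of patent 8781821 above can be sketched in Python. The sketch assumes a per-frame voiced/non-voiced flag is already available and that actions are keyed by the number of voiced intervals; the action table is hypothetical.

```python
def count_voiced_intervals(frame_is_voiced):
    """Count runs of consecutive voiced frames."""
    count = 0
    previous = False
    for voiced in frame_is_voiced:
        # A new interval starts on each non-voiced to voiced transition.
        if voiced and not previous:
            count += 1
        previous = voiced
    return count

def respond(frame_is_voiced, actions):
    """Look up the action for the number of voiced intervals, or None."""
    return actions.get(count_voiced_intervals(frame_is_voiced))
```
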
Yule walker based low-complexity voice activity detector in noise suppression systems

Patent number: 8775168

Abstract: A Yule-Walker based, low-complexity voice activity detector (VAD) is disclosed. An input signal is typically noisy speech (i.e., corrupted with, for example, babble noise). In one embodiment, a first initialization stage of the VAD computes an occurrence of a silent period within the input signal and the AR parameters. The VAD could accordingly compute a tentative adaptive threshold and output hypothesis H1 (which means speech is present) during this stage. During the second initialization stage, the VAD generally builds a database of associated values and computes the adaptive threshold accordingly. The second initialization stage could also output tentative VAD decisions based on the tentative threshold computed in the first initialization stage. Finally, the VAD periodically retrains or updates AR parameters, threshold values and/or the database and outputs VAD decisions accordingly.

Type: Grant

Filed: August 3, 2007

Date of Patent: July 8, 2014

Assignee: STMicroelectronics Asia Pacific PTE, Ltd.

Inventors: Karthik Muralidhar, Anoop Kumar Krishna
Decoding method and decoding apparatus therefor

Patent number: 8762158

Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.

Type: Grant

Filed: August 5, 2011

Date of Patent: June 24, 2014

Assignee: Samsung Electronics Co., Ltd.

Inventors: Hyun-wook Kim, Han-gil Moon, Sang-hoon Lee
Noise suppression device

Patent number: 8762139

Abstract: A noise suppression device includes: a power spectrum calculator converting an input signal of time domain into power spectra of frequency domain; a voice/noise determination unit determining whether the power spectra indicate voice or noise; a noise spectrum estimation unit estimating noise spectra of the power spectra; a period component estimation unit analyzing a harmonic structure constituting the power spectra and estimating periodical information about the power spectra; a weighting coefficient calculator calculating a weighting coefficient for weighting the power spectra; a suppression coefficient calculator calculating a suppression coefficient for suppressing noise included in the power spectra; a spectrum suppression unit suppressing amplitude of the power spectra in accordance with the suppression coefficient; and an inverse Fourier transformer converting the power spectra output by the spectrum suppression unit into a signal of time domain to generate a noise-suppressed signal.

Type: Grant

Filed: September 21, 2010

Date of Patent: June 24, 2014

Assignee: Mitsubishi Electric Corporation

Inventors: Satoru Furuta, Hirohisa Tasaki
Method and apparatus for voice activity detection

Patent number: 8762144

Abstract: A method and apparatus for detecting voice activity are disclosed. The method of detecting voice activity includes: extracting a feature parameter from a frame signal; determining whether the frame signal is a voice signal or a noise signal by comparing the feature parameter with model parameters of a plurality of comparison signals, respectively; and outputting the frame signal when the frame signal is determined to be a voice signal. The apparatus includes a classifier module which extracts a feature parameter from a frame signal and generates labeling information with respect to the frame signal by comparing the feature parameter with model parameters of a plurality of comparison signals; and a voice detection unit which determines whether the frame signal is a noise signal or a voice signal with reference to the labeling information, and outputs the frame signal when the frame signal is determined to be a voice signal.

Type: Grant

Filed: May 3, 2011

Date of Patent: June 24, 2014

Assignee: Samsung Electronics Co., Ltd.

Inventors: Nam-gook Cho, Eun-kyoung Kim
Recognition processing of a plurality of streaming voice signals for determination of a responsive action thereto

Patent number: 8751222

Abstract: Streaming voice signals, such as might be received at a contact center or similar operation, are analyzed to detect the occurrence of one or more unprompted, predetermined utterances. The predetermined utterances preferably constitute a vocabulary of words and/or phrases having particular meaning within the context in which they are uttered. Detection of one or more of the predetermined utterances during a call causes a determination of response-determinative significance of the detected utterance(s). Based on the response-determinative significance of the detected utterance(s), a responsive action may be further determined. Additionally, long term storage of the call corresponding to the detected utterance may also be initiated. Conversely, calls in which no predetermined utterances are detected may be deleted from short term storage.

Type: Grant

Filed: May 22, 2009

Date of Patent: June 10, 2014

Assignee: Accenture Global Services Limited Dublin

Inventors: Thomas J. Ryan, Biji K. Janan
Recognition of target words using designated characteristic values

Patent number: 8744839

Abstract: Target word recognition includes: obtaining a candidate word set and corresponding characteristic computation data, the candidate word set comprising text data, and characteristic computation data being associated with the candidate word set; performing segmentation of the characteristic computation data to generate a plurality of text segments; combining the plurality of text segments to form a text data combination set; determining an intersection of the candidate word set and the text data combination set, the intersection comprising a plurality of text data combinations; determining a plurality of designated characteristic values for the plurality of text data combinations; based at least in part on the plurality of designated characteristic values and according to at least a criterion, recognizing among the plurality of text data combinations target words whose characteristic values fulfill the criterion.

Type: Grant

Filed: September 22, 2011

Date of Patent: June 3, 2014

Assignee: Alibaba Group Holding Limited

Inventors: Haibo Sun, Yang Yang, Yining Chen
Speech signal processing device

Patent number: 8738367

Abstract: A speech signal processing device is equipped with a power acquisition unit, a probability distribution acquisition unit, and a correspondence degree determination unit. The power acquisition unit accepts an inputted speech signal and, based on the accepted speech signal, acquires power representing the intensity of a speech sound represented by the speech signal. The probability distribution acquisition unit acquires a probability distribution using the intensity of the power acquired by the power acquisition unit as a random variable. The correspondence degree determination unit determines whether a correspondence degree representing a degree that power acquired by the power acquisition unit in a case that a predetermined reference speech signal is inputted into the power acquisition unit corresponds with predetermined reference power is higher than a predetermined reference correspondence degree, based on the probability distribution acquired by the probability distribution acquisition unit.

Type: Grant

Filed: February 18, 2010

Date of Patent: May 27, 2014

Assignee: NEC Corporation

Inventor: Tadashi Emori
System and method for winding audio content using a voice activity detection algorithm

Patent number: 8731914

Abstract: A system and method for locating a preferable playback start location after a winding or rewinding action in an audio playing device. In response to an adjustment of the playing location for audio content to a desired playing position, the system determines whether at least one non-speech or silent period of at least a predetermined duration exists within the vicinity of the desired playing position. If at least one such non-speech or silent period exists within the vicinity of the desired playing position, the system adjusts the playing position to fall within one of the at least one non-speech period or silent period.

Type: Grant

Filed: November 15, 2005

Date of Patent: May 20, 2014

Assignee: Nokia Corporation

Inventors: Janne Vainio, Hannu J. Mikkola, Jari M. Makinen
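
The playback-position adjustment described in the abstract above can be illustrated as follows. This is a sketch under stated assumptions: the input is a per-frame speech/non-speech flag list from some voice activity detector, and `min_silence` and `vicinity` (both in frames) are hypothetical parameter names.

```python
from typing import List, Optional, Tuple


def snap_to_silence(is_speech: List[bool], target: int,
                    min_silence: int, vicinity: int) -> int:
    """Move `target` into a nearby silent run, or return it unchanged.

    A silent run qualifies if it lasts at least `min_silence` frames and its
    closest frame is within `vicinity` frames of `target`.
    """
    runs: List[Tuple[int, int]] = []  # inclusive (first, last) frame of each run
    start: Optional[int] = None
    # The appended True acts as a sentinel that closes a trailing silent run.
    for i, speech in enumerate(is_speech + [True]):
        if not speech and start is None:
            start = i
        elif speech and start is not None:
            if i - start >= min_silence:
                runs.append((start, i - 1))
            start = None

    best = target
    best_dist: Optional[int] = None
    for first, last in runs:
        nearest = min(max(target, first), last)  # closest frame inside the run
        dist = abs(nearest - target)
        if dist <= vicinity and (best_dist is None or dist < best_dist):
            best, best_dist = nearest, dist
    return best
```

Starting playback inside a pause avoids resuming in the middle of a word, which is the stated purpose of the adjustment.
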
Device and method for generating a multi-channel signal including speech signal processing

Patent number: 8731209

Abstract: In order to generate a multi-channel signal having a number of output channels greater than a number of input channels, a mixer is used for upmixing the input signal to form at least a direct channel signal and at least an ambience channel signal. A speech detector is provided for detecting a section of the input signal, the direct channel signal or the ambience channel signal in which speech portions occur. Based on this detection, a signal modifier modifies the input signal or the ambience channel signal in order to attenuate speech portions in the ambience channel signal, whereas such speech portions in the direct channel signal are attenuated to a lesser extent or not at all. A loudspeaker signal outputter then maps the direct channel signals and the ambience channel signals to loudspeaker signals which are associated to a defined reproduction scheme, such as, for example, a 5.1 scheme.

Type: Grant

Filed: October 1, 2008

Date of Patent: May 20, 2014

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Christian Uhle, Oliver Hellmuth, Juergen Herre, Harald Popp, Thorsten Kastner
Apparatus and method for computing control information for an echo suppression filter and apparatus and method for computing a delay value

Patent number: 8731207

Abstract: An embodiment of an apparatus for computing control information for a suppression filter for filtering a second audio signal to suppress an echo based on a first audio signal includes a computer having a value determiner for determining at least one energy-related value for a band-pass signal of at least two temporally successive data blocks of at least one signal of a group of signals. The computer further includes a mean value determiner for determining at least one mean value of the at least one determined energy-related value for the band-pass signal. The computer further includes a modifier for modifying the at least one energy-related value for the band-pass signal on the basis of the determined mean value for the band-pass signal. The computer further includes a control information computer for computing the control information for the suppression filter on the basis of the at least one modified energy-related value.

Type: Grant

Filed: January 12, 2009

Date of Patent: May 20, 2014

Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.

Inventors: Fabian Kuech, Markus Kallinger, Christof Faller, Alexis Favrot
Method and apparatus for element identification in a signal

Patent number: 8725508

Abstract: A computer-implemented method and apparatus for searching for an element sequence, the method comprising: receiving a signal; determining an initial segment of the signal; inputting the initial segment into an element extraction engine to obtain a first element sequence; determining one or more second segments, each of the second segments at least partially overlapping with the initial segment; inputting the second segments into the element extraction engine to obtain at least one second element sequence; and searching for an element subsequence common to at least a predetermined number of sequences of the first element sequence and the second element sequences.

Type: Grant

Filed: March 27, 2012

Date of Patent: May 13, 2014

Assignee: Novospeech

Inventor: Yossef Ben-Ezra
Mobile speech recognition with explicit tone features

Patent number: 8725498

Abstract: A computer-implemented method for digital speech processing, including (1) receiving, at a server computer, digital speech data from a computing device, the digital speech data comprising data points sampled at respective time points; (2) computing, by the server computer, a tonal feature of the digital speech data, the tonal feature comprising information encoding fundamental frequencies at the respective time points; (3) computing, by the server computer, a logarithm of the tonal feature at the respective time points; and (4) processing, by the server computer, the logarithm of the tonal feature based on a characterization of the digital speech data at the respective time points.

Type: Grant

Filed: July 24, 2012

Date of Patent: May 13, 2014

Assignee: Google Inc.

Inventors: Yun-hsuan Sung, Meihong Wang, Xin Lei
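
As a rough illustration of the log-domain tonal feature in the abstract above, the sketch below takes a per-frame fundamental-frequency track and returns its logarithm. The abstract does not say how the feature is processed per frame; filling unvoiced frames (marked here by an F0 of 0) by linear interpolation in the log domain is one common choice and is an assumption of this example, as is the function name.

```python
import math
from typing import List


def interpolated_log_f0(f0_hz: List[float]) -> List[float]:
    """Return log-F0 per frame, interpolating across unvoiced frames (f0 <= 0).

    Frames before the first or after the last voiced frame hold the nearest
    voiced value. Raises ValueError if no frame is voiced.
    """
    voiced = [(i, math.log(f)) for i, f in enumerate(f0_hz) if f > 0]
    if not voiced:
        raise ValueError("no voiced frames in the F0 track")

    out: List[float] = []
    j = 0  # index of the first voiced frame at or after the current frame
    for i in range(len(f0_hz)):
        while j < len(voiced) and voiced[j][0] < i:
            j += 1
        if j == len(voiced):            # past the last voiced frame
            out.append(voiced[-1][1])
        elif voiced[j][0] == i:         # voiced frame: use its own value
            out.append(voiced[j][1])
        elif j == 0:                    # before the first voiced frame
            out.append(voiced[0][1])
        else:                           # between two voiced frames
            (i0, v0), (i1, v1) = voiced[j - 1], voiced[j]
            w = (i - i0) / (i1 - i0)
            out.append(v0 + w * (v1 - v0))
    return out
```
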
Operating method for voice activity detection/silence suppression system

Patent number: 8700390

Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.

Type: Grant

Filed: October 7, 2013

Date of Patent: April 15, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Bing Chen, James H. James
System, method and program for voice detection

Patent number: 8694308

Abstract: A system for voice detection includes a feature value calculation unit that calculates a feature value from an input signal sliced on a per frame basis, a provisional voice/non-voice decision unit that provisionally decides a voiced interval and a non-voiced interval from the feature value calculated on a per frame basis, and a voice/non-voice decision unit that determines a voiced interval duration threshold value or a non-voiced interval duration threshold value, using a ratio of the feature value found on a per frame basis to a threshold value for the feature value and that re-decides the voiced interval and the non-voiced interval, using the voiced interval duration threshold value determined and the non-voiced interval duration threshold value determined.

Type: Grant

Filed: November 26, 2008

Date of Patent: April 8, 2014

Assignee: NEC Corporation

Inventors: Takayuki Arakawa, Masanori Tsujikawa
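
The two-stage decision in the abstract above (a provisional per-frame decision, then a re-decision using interval duration thresholds) can be sketched as below. The abstract derives the duration thresholds from the ratio of the feature value to its threshold; this simplified example uses fixed thresholds instead, and all function and parameter names are hypothetical.

```python
from typing import List


def provisional_decision(features: List[float], threshold: float) -> List[bool]:
    """Stage 1: label each frame voiced (True) if its feature exceeds the threshold."""
    return [f > threshold for f in features]


def _flip_short_runs(labels: List[bool], value: bool, min_len: int) -> List[bool]:
    """Flip interior runs of `value` shorter than `min_len` frames.

    Runs touching the start or end of the signal are left unchanged.
    """
    out = list(labels)
    n = len(labels)
    i = 0
    while i < n:
        j = i
        while j < n and labels[j] == labels[i]:
            j += 1
        interior = i > 0 and j < n
        if labels[i] == value and (j - i) < min_len and interior:
            for k in range(i, j):
                out[k] = not value
        i = j
    return out


def redecide(labels: List[bool], min_voiced: int, min_nonvoiced: int) -> List[bool]:
    """Stage 2: bridge short non-voiced gaps, then drop short voiced bursts."""
    bridged = _flip_short_runs(labels, False, min_nonvoiced)
    return _flip_short_runs(bridged, True, min_voiced)
```

Bridging gaps before removing bursts keeps a word with a brief dip in energy from being split and then discarded piece by piece.
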
Communication terminal and communication method

Patent number: 8694326

Abstract: A communication terminal includes a decoder which decodes an input bitstream received from another communication terminal, to generate an output audio signal and outputs the generated output audio signal to a speaker; and an echo canceller which obtains an input audio signal representing sound captured by a microphone placed in a space to which the speaker outputs the sound, and removes, for respective subbands, an echo component included in the obtained input audio signal and corresponding to the output audio signal, to generate an audio signal for transmission. An encoder codes the audio signal for transmission to generate an output bitstream and transmits the generated output bitstream to another communication terminal; and a control unit controls, for the respective subbands, echo cancellation processing according to a reproduction band of at least one of the output audio signal and the audio signal for transmission.

Type: Grant

Filed: August 21, 2012

Date of Patent: April 8, 2014

Assignee: Panasonic Corporation

Inventors: Shuji Miyasaka, Kosuke Nishio, Ichiro Kawashima
Optimized energy and data transfer in hearing implant systems

Patent number: 8687831

Abstract: An external device for a hearing implant system and a hearing implant system having an external device is described. An external transmitter generates a radio-frequency inductive link signal to an implanted receiver including a sequence of data word segments which communicate data to the implanted receiver, and a sequence of data word pause segments between each data word segment which communicate energy without data to the implanted receiver. A data word pause controller controls the inductive link signal during the data word pause segments according to an energy management rule.

Type: Grant

Filed: October 26, 2012

Date of Patent: April 1, 2014

Assignee: Med-El Elektromedizinische Geraete GmbH

Inventors: Martin Stoffaneller, Peter Schleich, Thomas Schwarzenbeck
Generating speech and voice from extracted signal attributes using a speech-locked loop (SLL)

Patent number: 8688438

Abstract: A speech processing system includes a plurality of signal analyzers that extract salient signal attributes of an input voice signal. A difference module computes the differences in the salient signal attributes. One or more control modules control a plurality of speech generators using an output signal from the difference module in a speech-locked loop (SLL); the speech generators use the output signal to generate a voice signal.

Type: Grant

Filed: February 9, 2010

Date of Patent: April 1, 2014

Assignee: Massachusetts Institute of Technology

Inventors: Keng Hoong Wee, Lorenzo Turicchia, Rahul Sarpeshkar
SYSTEM AND METHOD FOR SPEECH SYNTHESIS

Publication number: 20140088958

Abstract: The present invention is a method and system to convert a speech signal into a parametric representation in terms of timbre vectors, and to recover the speech signal therefrom. The speech signal is first segmented into non-overlapping frames using the glottal closure instant information; each frame is converted into an amplitude spectrum using a Fourier analyzer, and Laguerre functions are then used to generate a set of coefficients which constitute a timbre vector. A sequence of timbre vectors can be subject to a variety of manipulations. The new timbre vectors are converted back into voice signals by first transforming into amplitude spectra using Laguerre functions, then generating phase spectra from the amplitude spectra using Kramers-Kronig relations. A Fourier transformer converts the amplitude spectra and phase spectra into elementary acoustic waves, which are then superposed to become the output voice. The method and system can be used for voice transformation, speech synthesis, and automatic speech recognition.

Type: Application

Filed: December 3, 2012

Publication date: March 27, 2014

Inventor: Chengjun Julian Chen
