Voiced Or Unvoiced Patents (Class 704/214)
  • Patent number: 9183177
    Abstract: Methods, apparatuses, and media for filtering a data stream are provided. The data stream is partitioned into a plurality of data stream segments. An acoustic parameter of each of the data stream segments is measured, and it is determined whether the acoustic parameter of each of the data stream segments satisfies a predetermined condition. Extraneous segments of the data stream segments are identified in which the predetermined condition is satisfied, and it is determined whether the extraneous segments have a predetermined relationship in the data stream. The extraneous segments are deleted from the data stream to produce a filtered data stream in response to the extraneous segments having the predetermined relationship.
    Type: Grant
    Filed: April 22, 2013
    Date of Patent: November 10, 2015
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Yeon-Jun Kim, I. Dan Melamed, Steven Neil Tischer, Bernard S. Renger
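The filtering steps in the abstract above (partition, measure, test a condition, check a relationship, delete) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the acoustic parameter is taken to be mean absolute amplitude, the predetermined condition is "below a threshold", and the predetermined relationship is "at least `min_run` consecutive segments". None of these choices, nor the function and parameter names, come from the patent.

```python
from itertools import groupby


def filter_stream(samples, segment_len=4, threshold=1.0, min_run=2):
    """Remove runs of low-amplitude segments from a list of samples."""
    # Partition the stream into fixed-length segments.
    segments = [samples[i:i + segment_len]
                for i in range(0, len(samples), segment_len)]

    def is_extraneous(segment):
        # Assumed acoustic parameter: mean absolute amplitude.
        return sum(abs(s) for s in segment) / len(segment) < threshold

    kept = []
    for extraneous, run in groupby(segments, key=is_extraneous):
        run = list(run)
        # Assumed relationship: delete only runs of min_run or more segments.
        if extraneous and len(run) >= min_run:
            continue
        for segment in run:
            kept.extend(segment)
    return kept


stream = [5] * 4 + [0] * 8 + [5] * 4 + [0] * 4 + [5] * 4
filtered = filter_stream(stream)
```

Here the eight-sample quiet stretch is deleted, while the isolated four-sample quiet segment is kept because it does not meet the assumed run-length relationship.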
  • Patent number: 9135809
    Abstract: A remote control device includes a digital audio storage device, a talk button, and an optical distance measurer. The digital audio storage device is configured to continually record an audio input for a specific amount of time. The talk button is coupled to the digital audio storage device and is configured to initiate a transmission of the audio input to a set-top box device. The optical distance measurer is coupled to the talk button and is configured to automatically measure a distance to a user in response to the talk button being pressed.
    Type: Grant
    Filed: June 20, 2008
    Date of Patent: September 15, 2015
    Assignee: AT&T Intellectual Property I, LP
    Inventors: Hisao M. Chang, Iker Arizmendi
  • Patent number: 9070375
    Abstract: A voice activity detection method in a low SNR environment. The voice activity detection is performed by extracting a long-term spectrum variation component and a harmonic structure as feature vectors from a speech signal and increasing difference in feature vectors between speech and non-speech (i) using the long-term spectrum variation component feature or (ii) using a long-term spectrum variation component extraction and a harmonic structure feature extraction. A correct rate and an accuracy rate of the voice activity detection are improved over conventional methods by using a long-term spectrum variation component having a window length over an average phoneme duration of an utterance in the speech signal. The voice activity detection system and method provides speech processing, automatic speech recognition, and speech output capable of very accurate voice activity detection.
    Type: Grant
    Filed: February 27, 2009
    Date of Patent: June 30, 2015
    Assignee: International Business Machines Corporation
    Inventors: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
  • Patent number: 9009034
    Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
    Type: Grant
    Filed: November 12, 2014
    Date of Patent: April 14, 2015
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Bing Chen, James H. James
  • Patent number: 9009054
    Abstract: This invention relates to retrieval for multimedia content, and provides a program endpoint time detection apparatus for detecting an endpoint time of a program by performing processing on audio signals of said program, comprising an audio classification unit for classifying said audio signals into a speech signal portion and a non-speech signal portion; a keyword retrieval unit for retrieving, as a candidate endpoint keyword, an endpoint keyword indicating start or end of the program from said speech signal portion; a content analysis unit for performing content analysis on context of the candidate endpoint keyword retrieved by the keyword retrieval unit to determine whether the candidate endpoint keyword is a valid endpoint keyword; and a program endpoint time determination unit for performing statistics analysis based on the retrieval result of said keyword retrieval unit and the determination result of said content analysis unit, and determining the endpoint time of the program.
    Type: Grant
    Filed: October 28, 2010
    Date of Patent: April 14, 2015
    Assignees: Sony Corporation, Institute of Acoustics, Chinese Academy of Sciences
    Inventors: Kun Liu, Weiguo Wu, Li Lu, Qingwei Zhao, Yonghong Yan, Hongbin Suo
  • Patent number: 8996380
    Abstract: Systems and methods of synchronizing media are provided. A client device may be used to capture a sample of a media stream being rendered by a media rendering source. The client device sends the sample to a position identification module to determine a time offset indicating a position in the media stream corresponding to the sampling time of the sample, and optionally a timescale ratio indicating a speed at which the media stream is being rendered by the media rendering source based on a reference speed of the media stream. The client device calculates a real-time offset using a present time, a timestamp of the media sample, the time offset, and optionally the timescale ratio. The client device then renders a second media stream at a position corresponding to the real-time offset to be in synchrony to the media stream being rendered by the media rendering source.
    Type: Grant
    Filed: May 4, 2011
    Date of Patent: March 31, 2015
    Assignee: Shazam Entertainment Ltd.
    Inventors: Avery Li-Chun Wang, Rahul Powar, William Michael Mills, Christopher Jacques Penrose Barton, Philip Georges Inghelbrecht, Dheeraj Shankar Mukherjee
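The real-time offset calculation named in the abstract above can be sketched as follows. The abstract lists the inputs (present time, sample timestamp, time offset, optional timescale ratio) but not the formula, so the linear form used here is an assumption, and the function and parameter names are hypothetical.

```python
def real_time_offset(present_time, sample_timestamp, time_offset,
                     timescale_ratio=1.0):
    """Estimate the current position (in seconds) in the media stream.

    Assumed form: the position identified for the sample, plus the wall-clock
    time elapsed since the sample was taken, scaled by the rendering speed
    relative to the reference speed.
    """
    elapsed = present_time - sample_timestamp
    return time_offset + elapsed * timescale_ratio


# A sample captured at t = 100.0 s matched position 42.0 s in the stream.
position = real_time_offset(present_time=103.0, sample_timestamp=100.0,
                            time_offset=42.0)
fast_position = real_time_offset(103.0, 100.0, 42.0, timescale_ratio=1.5)
```

Under this assumption, a second media stream would be rendered starting at `position` to stay in step with the first.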
  • Patent number: 8996389
    Abstract: Various techniques are disclosed for reducing artifacts generated by time compression by adapting the time compression based on the state of the received audio. The amount of time compression may be bounded based on audio characteristics. Another feature provides a way of determining the most correlated portions of segments of audio. Voiced speech may be distinguished from unvoiced speech. Another feature provides a way of distinguishing between silence, voiced speech, and unvoiced speech. Time compression may be adapted during periods of lengthy silence. Another feature allows for reducing time compression during sensitive portions of the received audio. One or more of these features may be present in different embodiments.
    Type: Grant
    Filed: June 14, 2011
    Date of Patent: March 31, 2015
    Assignee: Polycom, Inc.
    Inventor: Eric David Elias
  • Patent number: 8990079
    Abstract: When a voice-activated device or application is first started, the signal levels corresponding to spoken commands are initially unknown, making it difficult to set detection thresholds. The inventive method provides an initial command-detection threshold based on the noise level alone. The first command is detected using this initial threshold. Then the threshold is revised according to the first command sound, and a second command is detected using the revised threshold. After detecting each command, the detection threshold is further refined according to the current noise and command sounds. Methods are also disclosed for optimizing the thresholds, adjusting parameters according to sound, and detecting voiced and unvoiced sounds separately. These capabilities enable many emerging voice-activated products and applications.
    Type: Grant
    Filed: September 17, 2014
    Date of Patent: March 24, 2015
    Assignee: Zanavox
    Inventor: David Edward Newman
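The threshold bootstrapping described in the abstract above (start from the noise level alone, then revise after each detected command) can be sketched as follows. The class name, the multiplier applied to the noise level, and the rule that places the revised threshold partway between the noise floor and the command level are all assumptions made for illustration. The sketch also keeps the noise level fixed, whereas the abstract refers to the current noise.

```python
class CommandDetector:
    """Hypothetical detector whose threshold adapts after each command."""

    def __init__(self, noise_level, initial_factor=3.0, fraction=0.5):
        self.noise_level = noise_level
        self.fraction = fraction
        # Before any command is heard, only the noise level is known.
        self.threshold = noise_level * initial_factor

    def process(self, sound_level):
        """Return True if sound_level counts as a command, refining the threshold."""
        if sound_level <= self.threshold:
            return False
        # Assumed rule: place the revised threshold between noise and command.
        self.threshold = self.noise_level + self.fraction * (
            sound_level - self.noise_level)
        return True


detector = CommandDetector(noise_level=1.0)
first = detector.process(10.0)   # detected with the noise-only threshold
revised = detector.threshold     # threshold after the first command
second = detector.process(4.0)   # compared against the revised threshold
```

In this run the first sound passes the noise-only threshold, and the weaker second sound is rejected because the threshold has moved up toward the observed command level.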
  • Patent number: 8990094
    Abstract: An electronic device for coding a transient frame is described. The electronic device includes a processor and executable instructions stored in memory that is in electronic communication with the processor. The electronic device obtains a current transient frame. The electronic device also obtains a residual signal based on the current transient frame. Additionally, the electronic device determines a set of peak locations based on the residual signal. The electronic device further determines whether to use a first coding mode or a second coding mode for coding the current transient frame based on at least the set of peak locations. The electronic device also synthesizes an excitation based on the first coding mode if the first coding mode is determined. The electronic device also synthesizes an excitation based on the second coding mode if the second coding mode is determined.
    Type: Grant
    Filed: September 8, 2011
    Date of Patent: March 24, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Venkatesh Krishnan, Ananthapadmanabhan Arasanipalai Kandhadai
  • Patent number: 8984061
    Abstract: The conferencing system is composed of computers, a moderator's computer, and a projector connected on a network. The moderator's computer receives image data from the computers, and generates synthesized image data therefrom, which is transmitted to the projector for display of the synthesized image. The moderator's computer has the capability to switch the image being projected by the projector from the synthesized image to an image handled by one of the computers or by the moderator's computer. With such an arrangement, utilizing existing hardware resources it will be possible to display in a single split-screen display the images handled by the terminals connected on the network. Additionally, it will be possible to switch smoothly between on-screen displays, and to reduce the burden on the on-screen display operator in a networked conferencing system.
    Type: Grant
    Filed: August 6, 2008
    Date of Patent: March 17, 2015
    Assignee: Seiko Epson Corporation
    Inventor: Noboru Inoue
  • Patent number: 8982971
    Abstract: A multi-carrier signal is typically comprised of many equidistant sub-carriers. This results in periodicity of spectrum within the bandwidth of such a multi-carrier signal. An unknown multi-carrier signal with equidistant sub-carriers can thus be sensed together with its sub-carrier spacing by finding a discernable local maximum in the cepstrum (Fourier transform of the log spectrum) of the multi-carrier signal.
    Type: Grant
    Filed: March 29, 2012
    Date of Patent: March 17, 2015
    Assignee: QRC, Inc.
    Inventors: Sinisa Peric, Thomas F. Callahan, III
  • Publication number: 20150073783
    Abstract: In accordance with an embodiment of the present invention, a method for speech processing includes determining an unvoicing/voicing parameter reflecting a characteristic of unvoiced/voicing speech in a current frame of a speech signal comprising a plurality of frames. A smoothed unvoicing/voicing parameter is determined to include information of the unvoicing/voicing parameter in a frame prior to the current frame of the speech signal. A difference between the unvoicing/voicing parameter and the smoothed unvoicing/voicing parameter is computed. The method further includes generating an unvoiced/voiced decision point for determining whether the current frame comprises unvoiced speech or voiced speech using the computed difference as a decision parameter.
    Type: Application
    Filed: September 3, 2014
    Publication date: March 12, 2015
    Inventor: Yang Gao
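The decision scheme in the abstract above (a per-frame parameter, a smoothed version of it, and their difference used as the decision parameter) can be sketched as follows. This is a minimal sketch, assuming an exponential moving average for the smoothing and a fixed margin on the difference; the abstract specifies neither, and the names `alpha` and `margin` are hypothetical.

```python
def voicing_decisions(params, alpha=0.9, margin=0.05):
    """Label each frame 'voiced' or 'unvoiced'.

    params: one voicing parameter per frame (assumed larger for voiced speech).
    alpha:  smoothing factor of the running average (assumed value).
    margin: how far the parameter must exceed its smoothed value (assumed).
    """
    decisions = []
    smoothed = params[0] if params else 0.0
    for p in params:
        # The smoothed value carries information from earlier frames.
        smoothed = alpha * smoothed + (1.0 - alpha) * p
        diff = p - smoothed  # the decision parameter
        decisions.append("voiced" if diff > margin else "unvoiced")
    return decisions


labels = voicing_decisions([0.1, 0.1, 0.9, 0.9, 0.1])
```

Because the smoothed value eventually catches up with a sustained parameter, a practical detector would likely combine this difference with other cues; the sketch shows only the difference-based step.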
  • Patent number: 8976906
    Abstract: A multi-carrier signal is typically comprised of many equidistant sub-carriers. This results in periodicity of spectrum within the bandwidth of such a multi-carrier signal. An unknown multi-carrier signal with equidistant sub-carriers can thus be sensed together with its sub-carrier spacing by finding a discernible local maximum in the cepstrum (Fourier transform of the log spectrum) of the multi-carrier signal.
    Type: Grant
    Filed: March 29, 2012
    Date of Patent: March 10, 2015
    Assignee: QRC, Inc.
    Inventors: Sinisa Peric, Thomas F. Callahan, III
  • Patent number: 8965773
    Abstract: A method is provided for hierarchical coding of a digital audio signal comprising, for a current frame of the input signal: a core coding, delivering a scalar quantization index for each sample of the current frame and at least one enhancement coding delivering indices of scalar quantization for each coded sample of an enhancement signal. The enhancement coding comprises a step of obtaining a filter for shaping the coding noise used to determine a target signal and in that the indices of scalar quantization of said enhancement signal are determined by minimizing the error between a set of possible values of scalar quantization and said target signal. The coding method can also comprise a shaping of the coding noise for the core bitrate coding. A coder implementing the coding method is also provided.
    Type: Grant
    Filed: November 17, 2009
    Date of Patent: February 24, 2015
    Assignee: Orange
    Inventors: Balazs Kovesi, Stéphane Ragot, Alain Le Guyader
  • Patent number: 8959025
    Abstract: Methods and systems for extracting speech from packet streams. The methods and systems analyze the encoded speech in a given packet stream, and automatically identify the actual speech coding scheme that was used to produce it. These techniques may be used, for example, in interception systems where the identity of the actual speech coding scheme is sometimes unavailable or inaccessible. For instance, the identity of the actual speech coding scheme may be sent in a separate signaling stream that is not intercepted. As another example, the identity of the actual speech coding scheme may be sent in the same packet stream as the encoded speech, but in encrypted form.
    Type: Grant
    Filed: April 28, 2011
    Date of Patent: February 17, 2015
    Assignee: Verint Systems Ltd.
    Inventor: Genady Malinsky
  • Patent number: 8942975
    Abstract: Techniques are described herein that suppress noise in a Mel-filtered spectral domain. For example, a window may be applied to a representation of a speech signal in a time domain. The windowed representation in the time domain may be converted to a subsequent representation of the speech signal in the Mel-filtered spectral domain. A noise suppression operation may be performed with respect to the subsequent representation to provide noise-suppressed Mel coefficients.
    Type: Grant
    Filed: March 22, 2011
    Date of Patent: January 27, 2015
    Assignee: Broadcom Corporation
    Inventor: Jonas Borgstrom
  • Patent number: 8930183
    Abstract: A method of converting speech from the characteristics of a first voice to the characteristics of a second voice, the method comprising: receiving a speech input from a first voice, dividing said speech input into a plurality of frames; mapping the speech from the first voice to a second voice; and outputting the speech in the second voice, wherein mapping the speech from the first voice to the second voice comprises, deriving kernels demonstrating the similarity between speech features derived from the frames of the speech input from the first voice and stored frames of training data for said first voice, the training data corresponding to different text to that of the speech input and wherein the mapping step uses a plurality of kernels derived for each frame of input speech with a plurality of stored frames of training data of the first voice.
    Type: Grant
    Filed: August 25, 2011
    Date of Patent: January 6, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Byung Ha Chun, Mark John Francis Gales
  • Patent number: 8930184
    Abstract: A signal bandwidth extending apparatus including: a bandwidth extending section configured to extend a frequency bandwidth of a target signal, the target signal included in an input signal; a calculating section configured to calculate a degree of the target signal included in the input signal; and a controller configured to change a method of extending the frequency bandwidth by the bandwidth extending section according to a result of the calculating section.
    Type: Grant
    Filed: September 14, 2009
    Date of Patent: January 6, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Takashi Sudo, Masataka Osada
  • Patent number: 8924200
    Abstract: A method for decoding an audio signal in a decoder having a CELP-based decoder element including a fixed codebook component, at least one pitch period value, and a first decoder output, wherein a bandwidth of the audio signal extends beyond a bandwidth of the CELP-based decoder element. The method includes obtaining an up-sampled fixed codebook signal by up-sampling the fixed codebook component to a higher sample rate, obtaining an up-sampled excitation signal based on the up-sampled fixed codebook signal and an up-sampled pitch period value, and obtaining a composite output signal based on the up-sampled excitation signal and an output signal of the CELP-based decoder element, wherein the composite output signal includes a bandwidth portion that extends beyond a bandwidth of the CELP-based decoder element.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: December 30, 2014
    Assignee: Motorola Mobility LLC
    Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
  • Patent number: 8924209
    Abstract: A method is disclosed for identifying a spoken command by detecting intervals of voiced and unvoiced sound, and then comparing the order of voiced and unvoiced sounds to a set of templates. Each template represents one of the predetermined acceptable commands of the application, and is associated with a predetermined action. When the order of voiced and unvoiced intervals in the spoken command matches the order in one of the templates, the associated action is thus selected. Silent intervals in the command may also be included for enhanced recognition. Efficient protocols are disclosed for discriminating voiced and unvoiced sounds, and for detecting the beginning and ending of each sound interval in the command, and for comparing the command sequence to the templates. In a sparse-command application, this method provides fast and robust recognition, and can be implemented with low-cost hardware and extremely minimal software.
    Type: Grant
    Filed: September 12, 2012
    Date of Patent: December 30, 2014
    Assignee: Zanavox
    Inventor: David Edward Newman
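The template comparison described in the abstract above can be sketched as follows. The sketch assumes frames have already been labeled voiced ('V'), unvoiced ('U'), or silent ('S') by some earlier stage; the templates and action names are hypothetical placeholders, not commands from the patent.

```python
from itertools import groupby

# Hypothetical templates: order of sound intervals -> associated action.
TEMPLATES = {
    ("V",): "action_a",
    ("U", "V"): "action_b",
    ("V", "U", "V"): "action_c",
}


def interval_order(frame_labels):
    """Collapse per-frame labels into the order of intervals, dropping silence."""
    return tuple(label for label, _ in groupby(frame_labels) if label != "S")


def select_action(frame_labels):
    """Return the action whose template matches the interval order, or None."""
    return TEMPLATES.get(interval_order(frame_labels))


matched = select_action(list("SSUUVVVSS"))
other = select_action(list("VVUUVV"))
unmatched = select_action(list("UUU"))
```

A dictionary lookup keeps matching cheap, in line with the abstract's emphasis on low-cost hardware, though the patent's actual comparison protocol may differ.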
  • Patent number: 8924204
    Abstract: Unlike sound based pressure waves that go everywhere, air turbulence caused by wind is usually a fairly local event. Therefore, in a system that utilizes two or more spatially separated microphones to pick up sound signals (e.g., speech), wind noise picked up by one of the microphones often will not be picked up (or at least not to the same extent) by the other microphone(s). Embodiments of methods and apparatuses that utilize this fact and others to effectively detect and suppress wind noise using multiple microphones that are spatially separated are described.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: December 30, 2014
    Assignee: Broadcom Corporation
    Inventors: Juin-Hwey Chen, Jes Thyssen, Xianxian Zhang, Huaiyu Zeng
  • Patent number: 8909519
    Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
    Type: Grant
    Filed: March 10, 2014
    Date of Patent: December 9, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Bing Chen, James H. James
  • Patent number: 8898058
    Abstract: Systems, methods, apparatus, and machine-readable media for voice activity detection in a single-channel or multichannel audio signal are disclosed.
    Type: Grant
    Filed: October 24, 2011
    Date of Patent: November 25, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Jongwon Shin, Erik Visser, Ian Ernan Liu
  • Patent number: 8886527
    Abstract: A purpose is to suppress recognition process delay generated due to load in signal processing. Included is a speech input means 10 that inputs a speech signal, an output evaluation means 20 that evaluates whether or not the speech signal input by the speech input means 10 is the speech signal in a sound section, which is a speech section assuming that a speaker is speaking, and outputs the speech signal as a speech signal to be processed only when evaluated as the speech signal in the sound section, a signal processing means 30 that performs signal processing to the speech signal, which is output by the output evaluation means 20 as the speech signal to be processed, and a speech recognition processing means 40 that performs a speech recognition process to the speech signal which is signal-processed by the signal processing means 30.
    Type: Grant
    Filed: April 16, 2009
    Date of Patent: November 11, 2014
    Assignee: NEC Corporation
    Inventor: Toru Iwasawa
  • Patent number: 8879762
    Abstract: A method and apparatus to evaluate a quality of an audio signal, in which the number of effective channels is determined for each of a reference signal of a current frame and a test signal indicative of the reference signal that has passed through an audio codec, and an audio quality evaluation score of the current frame is calculated by evaluating an audio quality of the current frame based on the determined number of effective channels for each of the reference signal and the test signal by means of a predetermined evaluator.
    Type: Grant
    Filed: January 28, 2010
    Date of Patent: November 4, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: In-Yong Choi
  • Patent number: 8880409
    Abstract: A system provided herein may perform automatic temporal alignment between music audio signal and lyrics with higher accuracy than ever. A non-fricative section extracting portion 4 extracts non-fricative sound sections, where no fricative sounds exist, from the music audio signal. An alignment portion 17 includes a phone model 15 for singing voice capable of estimating phonemes corresponding to temporal-alignment features. The alignment portion 17 performs an alignment operation using as inputs temporal-alignment features obtained from a temporal-alignment feature extracting portion 11, information on vocal and non-vocal sections obtained from a vocal section estimating portion 9, and a phoneme network SN on conditions that no phonemes exist at least in non-vocal sections and that no fricative phonemes exist in non-fricative sound sections.
    Type: Grant
    Filed: February 5, 2009
    Date of Patent: November 4, 2014
    Assignee: National Institute of Advanced Industrial Science and Technology
    Inventors: Hiromasa Fujihara, Masataka Goto
  • Patent number: 8868432
    Abstract: A method for decoding an audio signal having a bandwidth that extends beyond a bandwidth of a CELP excitation signal in an audio decoder including a CELP-based decoder element. The method includes obtaining a second excitation signal having an audio bandwidth extending beyond the audio bandwidth of the CELP excitation signal, obtaining a set of signals by filtering the second excitation signal with a set of bandpass filters, scaling the set of signals using a set of energy-based parameters, and obtaining a composite output signal by combining the scaled set of signals with a signal based on the audio signal decoded by the CELP-based decoder element.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: October 21, 2014
    Assignee: Motorola Mobility LLC
    Inventors: Jonathan A. Gibbs, James P. Ashley, Udar Mittal
  • Patent number: 8862465
    Abstract: An electronic device for determining a set of pitch cycle energy parameters is described. The electronic device includes a processor and executable instructions stored in memory. The electronic device obtains a frame, a set of filter coefficients and a residual signal based on the frame and the set of filter coefficients. The electronic device determines a set of peak locations based on the residual signal and segments the residual signal such that each segment includes one peak. The electronic device determines a first set of pitch cycle energy parameters based on a frame region between two consecutive peak locations and maps regions between peaks in the residual signal to regions between peaks in a synthesized excitation signal to produce a mapping. The electronic device determines a second set of pitch cycle energy parameters based on the first set of pitch cycle energy parameters and the mapping.
    Type: Grant
    Filed: September 8, 2011
    Date of Patent: October 14, 2014
    Assignee: QUALCOMM Incorporated
    Inventors: Venkatesh Krishnan, Stephane Pierre Villette
  • Patent number: 8842534
    Abstract: According to one embodiment of the invention, a method for managing time-sensitive packetized data streams at a receiver includes receiving a time-sensitive packet of a data stream, analyzing an energy level of a payload signal of the packet, and determining whether to drop the packet based on the energy level of the payload signal.
    Type: Grant
    Filed: January 23, 2012
    Date of Patent: September 23, 2014
    Assignee: Cisco Technology, Inc.
    Inventors: Paul S. Hahn, Michael E. Knappe, Richard A. Dunlap, Luke K. Surazski
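The energy-based drop decision in the abstract above can be sketched as follows. The abstract does not say how energy is measured or which packets are dropped, so the mean-squared-amplitude measure, the threshold value, and the policy of dropping low-energy packets are assumptions for illustration; the function names are hypothetical.

```python
def payload_energy(samples):
    """Mean squared amplitude of the decoded payload samples."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def should_drop(samples, energy_threshold=100.0):
    """Assumed policy: near-silent packets are candidates for dropping."""
    return payload_energy(samples) < energy_threshold


quiet_packet = [1, -2, 1, 0]
loud_packet = [100, -120, 90, -80]
drop_quiet = should_drop(quiet_packet)
drop_loud = should_drop(loud_packet)
```

A receiver under load could apply such a test to each incoming packet so that packets carrying audible content are preserved first.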
  • Patent number: 8825478
    Abstract: Audio content is converted to text using speech recognition software. The text is then associated with a distinct voice or a generic placeholder label if no distinction can be made. From the text and voice information, a word cloud is generated based on key words and key speakers. A visualization of the cloud displays as it is being created. Words grow in size in relation to their dominance. When it is determined that the predominant words or speakers have changed, the word cloud is complete. That word cloud continues to be displayed statically and a new word cloud display begins based upon a new set of predominant words or a new predominant speaker or set of speakers. This process may continue until the meeting is concluded. At the end of the meeting, the completed visualization may be saved to a storage device, sent to selected individuals, removed, or any combination of the preceding.
    Type: Grant
    Filed: January 10, 2011
    Date of Patent: September 2, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Susan Marie Cox, Janani Janakiraman, Fang Lu, Loulwa F Salem
  • Patent number: 8792777
    Abstract: The present invention is directed to system(s), method(s), and apparatus for accurate fast forward rate when performing trick play with variable distance between frames. In one embodiment, there is presented a circuit for providing a fast forward video sequence. The circuit comprises a system time clock for providing a time reference, said time reference incremented at a predetermined fast forward rate; a comparator for comparing the time reference with timing information associated with a picture; and a controller for determining whether to display the picture based at least in part on the comparison between the timing information and the time reference.
    Type: Grant
    Filed: January 10, 2007
    Date of Patent: July 29, 2014
    Assignee: Broadcom Corporation
    Inventor: Tim Ross
  • Patent number: 8781821
    Abstract: A method is disclosed for controlling a voice-activated device by interpreting a spoken command as a series of voiced and non-voiced intervals. A responsive action is then performed according to the number of voiced intervals in the command. The method is well-suited to applications having a small number of specific voice-activated response functions. Applications using the inventive method offer numerous advantages over traditional speech recognition systems including speaker universality, language independence, no training or calibration needed, implementation with simple microcontrollers, and extremely low cost. For time-critical applications such as pulsers and measurement devices, where fast reaction is crucial to catch a transient event, the method provides near-instantaneous command response, yet versatile voice control.
    Type: Grant
    Filed: April 30, 2012
    Date of Patent: July 15, 2014
    Assignee: Zanavox
    Inventor: David Edward Newman
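The counting scheme in the abstract above, where the number of voiced intervals selects the response, can be sketched as follows. The sketch assumes each frame has already been classified as voiced or not; the mapping from counts to action names is a hypothetical example.

```python
from itertools import groupby

# Hypothetical mapping from number of voiced intervals to a response function.
ACTIONS = {1: "start", 2: "stop", 3: "reset"}


def count_voiced_intervals(frame_is_voiced):
    """Count runs of consecutive voiced frames."""
    return sum(1 for voiced, _ in groupby(frame_is_voiced) if voiced)


def action_for(frame_is_voiced):
    """Return the action for this command, or None if the count is unmapped."""
    return ACTIONS.get(count_voiced_intervals(frame_is_voiced))


command = [False, True, True, False, True, False]
intervals = count_voiced_intervals(command)
action = action_for(command)
```

Since only the count matters, the same command works regardless of the words spoken, which is consistent with the speaker and language independence the abstract claims.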
  • Patent number: 8775168
    Abstract: A Yule-Walker based, low-complexity voice activity detector (VAD) is disclosed. An input signal is typically noisy speech (i.e., corrupted with, for example, babble noise). In one embodiment, a first initialization stage of the VAD computes an occurrence of a silent period within the input signal and the AR parameters. The VAD could accordingly compute a tentative adaptive threshold and output hypothesis H1 (which means speech is present) during this stage. During the second initialization stage, the VAD generally builds a database of associated values and computes the adaptive threshold accordingly. The second initialization stage could also output tentative VAD decisions based on the tentative threshold computed in the first initialization stage. Finally, the VAD periodically retrains or updates AR parameters, threshold values and/or the database and outputs VAD decisions accordingly.
    Type: Grant
    Filed: August 3, 2007
    Date of Patent: July 8, 2014
    Assignee: STMicroelectronics Asia Pacific PTE, Ltd.
    Inventors: Karthik Muralidhar, Anoop Kumar Krishna
  • Patent number: 8762158
    Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.
    Type: Grant
    Filed: August 5, 2011
    Date of Patent: June 24, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyun-wook Kim, Han-gil Moon, Sang-hoon Lee
  • Patent number: 8762139
    Abstract: A noise suppression device includes: a power spectrum calculator converting an input signal of time domain into power spectra of frequency domain; a voice/noise determination unit determining whether the power spectra indicate voice or noise; a noise spectrum estimation unit estimating noise spectra of the power spectra; a period component estimation unit analyzing a harmonic structure constituting the power spectra and estimating periodical information about the power spectra; a weighting coefficient calculator calculating a weighting coefficient for weighting the power spectra; a suppression coefficient calculator calculating a suppression coefficient for suppressing noise included in the power spectra; a spectrum suppression unit suppressing amplitude of the power spectra in accordance with the suppression coefficient; and an inverse Fourier transformer converting the power spectra output by the spectrum suppression unit into a signal of time domain to generate a noise-suppressed signal.
    Type: Grant
    Filed: September 21, 2010
    Date of Patent: June 24, 2014
    Assignee: Mitsubishi Electric Corporation
    Inventors: Satoru Furuta, Hirohisa Tasaki
  • Patent number: 8762144
    Abstract: A method and apparatus for detecting voice activity are disclosed. The method of detecting voice activity includes: extracting a feature parameter from a frame signal; determining whether the frame signal is a voice signal or a noise signal by comparing the feature parameter with model parameters of a plurality of comparison signals, respectively; and outputting the frame signal when the frame signal is determined to be a voice signal. The apparatus includes a classifier module which extracts a feature parameter from a frame signal, and generates labeling information with respect to the frame signal by comparing the feature parameter with model parameters of a plurality of comparison signals; and a voice detection unit which determines whether the frame signal is a noise signal or a voice signal with reference to the labeling information, and outputs the frame signal when the frame signal is determined to be a voice signal.
    Type: Grant
    Filed: May 3, 2011
    Date of Patent: June 24, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Nam-gook Cho, Eun-kyoung Kim
  • Patent number: 8751222
    Abstract: Streaming voice signals, such as might be received at a contact center or similar operation, are analyzed to detect the occurrence of one or more unprompted, predetermined utterances. The predetermined utterances preferably constitute a vocabulary of words and/or phrases having particular meaning within the context in which they are uttered. Detection of one or more of the predetermined utterances during a call causes a determination of response-determinative significance of the detected utterance(s). Based on the response-determinative significance of the detected utterance(s), a responsive action may be further determined. Additionally, long term storage of the call corresponding to the detected utterance may also be initiated. Conversely, calls in which no predetermined utterances are detected may be deleted from short term storage.
    Type: Grant
    Filed: May 22, 2009
    Date of Patent: June 10, 2014
    Assignee: Accenture Global Services Limited Dublin
    Inventors: Thomas J. Ryan, Biji K. Janan
  • Patent number: 8744839
    Abstract: Target word recognition includes: obtaining a candidate word set and corresponding characteristic computation data, the candidate word set comprising text data, and characteristic computation data being associated with the candidate word set; performing segmentation of the characteristic computation data to generate a plurality of text segments; combining the plurality of text segments to form a text data combination set; determining an intersection of the candidate word set and the text data combination set, the intersection comprising a plurality of text data combinations; determining a plurality of designated characteristic values for the plurality of text data combinations; based at least in part on the plurality of designated characteristic values and according to at least a criterion, recognizing among the plurality of text data combinations target words whose characteristic values fulfill the criterion.
    Type: Grant
    Filed: September 22, 2011
    Date of Patent: June 3, 2014
    Assignee: Alibaba Group Holding Limited
    Inventors: Haibo Sun, Yang Yang, Yining Chen
  • Patent number: 8738367
    Abstract: A speech signal processing device is equipped with a power acquisition unit, a probability distribution acquisition unit, and a correspondence degree determination unit. The power acquisition unit accepts an inputted speech signal and, based on the accepted speech signal, acquires power representing the intensity of a speech sound represented by the speech signal. The probability distribution acquisition unit acquires a probability distribution using the intensity of the power acquired by the power acquisition unit as a random variable. The correspondence degree determination unit determines whether a correspondence degree representing a degree that power acquired by the power acquisition unit in a case that a predetermined reference speech signal is inputted into the power acquisition unit corresponds with predetermined reference power is higher than a predetermined reference correspondence degree, based on the probability distribution acquired by the probability distribution acquisition unit.
    Type: Grant
    Filed: February 18, 2010
    Date of Patent: May 27, 2014
    Assignee: NEC Corporation
    Inventor: Tadashi Emori
  • Patent number: 8731914
    Abstract: A system and method for locating a preferable playback start location after a winding or rewinding action in an audio playing device. In response to an adjustment of the playing location for audio content to a desired playing position, the system determines whether at least one non-speech or silent period of at least a predetermined duration exists within the vicinity of the desired playing position. If at least one such non-speech or silent period exists within the vicinity of the desired playing position, the system adjusts the playing position to fall within one of the at least one non-speech period or silent period.
    Type: Grant
    Filed: November 15, 2005
    Date of Patent: May 20, 2014
    Assignee: Nokia Corporation
    Inventors: Janne Vainio, Hannu J. Mikkola, Jari M. Makinen
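    The abstract above describes moving a seek position into a nearby non-speech or silent period. A minimal sketch of that idea follows, assuming the pauses have already been detected and are given as (start, end) pairs in seconds. The function name `snap_to_pause` and the parameters `min_duration` and `vicinity` are hypothetical and not taken from the patent.

```python
from typing import List, Optional, Tuple


def snap_to_pause(target: float,
                  pauses: List[Tuple[float, float]],
                  min_duration: float = 0.3,
                  vicinity: float = 5.0) -> float:
    """Return a playback position inside a nearby pause, or target unchanged.

    pauses: (start, end) times in seconds of detected non-speech periods
            (assumed to be supplied by some upstream detector).
    min_duration: shortest pause that qualifies (assumed default).
    vicinity: how far from target a pause may lie (assumed default).
    """
    best: Optional[float] = None
    for start, end in pauses:
        if end - start < min_duration:
            continue  # pause too short to qualify
        # Point of this pause that lies closest to the requested position.
        nearest = min(max(target, start), end)
        if abs(nearest - target) > vicinity:
            continue  # pause is outside the allowed vicinity
        if best is None or abs(nearest - target) < abs(best - target):
            best = nearest
    # With no qualifying pause nearby, keep the requested position.
    return target if best is None else best
```

    Picking the closest qualifying pause is one possible design choice; the abstract only requires that the adjusted position fall within one such period.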
  • Patent number: 8731209
    Abstract: In order to generate a multi-channel signal having a number of output channels greater than a number of input channels, a mixer is used for upmixing the input signal to form at least a direct channel signal and at least an ambience channel signal. A speech detector is provided for detecting a section of the input signal, the direct channel signal or the ambience channel signal in which speech portions occur. Based on this detection, a signal modifier modifies the input signal or the ambience channel signal in order to attenuate speech portions in the ambience channel signal, whereas such speech portions in the direct channel signal are attenuated to a lesser extent or not at all. A loudspeaker signal outputter then maps the direct channel signals and the ambience channel signals to loudspeaker signals which are associated to a defined reproduction scheme, such as, for example, a 5.1 scheme.
    Type: Grant
    Filed: October 1, 2008
    Date of Patent: May 20, 2014
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Christian Uhle, Oliver Hellmuth, Juergen Herre, Harald Popp, Thorsten Kastner
  • Patent number: 8731207
    Abstract: An embodiment of an apparatus for computing control information for a suppression filter for filtering a second audio signal to suppress an echo based on a first audio signal includes a computer having a value determiner for determining at least one energy-related value for a band-pass signal of at least two temporally successive data blocks of at least one signal of a group of signals. The computer further includes a mean value determiner for determining at least one mean value of the at least one determined energy-related value for the band-pass signal. The computer further includes a modifier for modifying the at least one energy-related value for the band-pass signal on the basis of the determined mean value for the band-pass signal. The computer further includes a control information computer for computing the control information for the suppression filter on the basis of the at least one modified energy-related value.
    Type: Grant
    Filed: January 12, 2009
    Date of Patent: May 20, 2014
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.
    Inventors: Fabian Kuech, Markus Kallinger, Christof Faller, Alexis Favrot
  • Patent number: 8725508
    Abstract: A computer-implemented method and apparatus for searching for an element sequence, the method comprising: receiving a signal; determining an initial segment of the signal; inputting the initial segment into an element extraction engine to obtain a first element sequence; determining one or more second segments, each of the second segments at least partially overlapping with the initial segment; inputting the second segments into the element extraction engine to obtain at least one second element sequence; and searching for an element subsequence common to at least a predetermined number of sequences of the first element sequence and the second element sequences.
    Type: Grant
    Filed: March 27, 2012
    Date of Patent: May 13, 2014
    Assignee: Novospeech
    Inventor: Yossef Ben-Ezra
  • Patent number: 8725498
    Abstract: A computer-implemented method for digital speech processing, including (1) receiving, at a server computer, digital speech data from a computing device, the digital speech data comprising data points sampled at respective time points; (2) computing, by the server computer, a tonal feature of the digital speech data, the tonal feature comprising information encoding fundamental frequencies at the respective time points; (3) computing, by the server computer, a logarithm of the tonal feature at the respective time points; and (4) processing, by the server computer, the logarithm of the tonal feature based on a characterization of the digital speech data at the respective time points.
    Type: Grant
    Filed: July 24, 2012
    Date of Patent: May 13, 2014
    Assignee: Google Inc.
    Inventors: Yun-hsuan Sung, Meihong Wang, Xin Lei
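    As a rough illustration of steps (2) to (4) in the abstract above, the sketch below takes the logarithm of a fundamental-frequency track and then processes it according to a per-frame characterization. It assumes the characterization is a voiced/unvoiced flag and that unvoiced frames are filled by linear interpolation; both are assumptions made for this example, and `log_f0_track` is a hypothetical name.

```python
import math
from typing import List, Optional, Sequence


def log_f0_track(f0_hz: Sequence[float], voiced: Sequence[bool]) -> List[float]:
    """Log-F0 per frame; unvoiced frames are linearly interpolated (assumption)."""
    # Logarithm only where the frame is voiced and F0 is positive.
    logs: List[Optional[float]] = [
        math.log(f) if v and f > 0 else None for f, v in zip(f0_hz, voiced)
    ]
    known = [i for i, x in enumerate(logs) if x is not None]
    if not known:
        return [0.0] * len(logs)  # no voiced frame: return a flat placeholder track

    out: List[float] = []
    for i, x in enumerate(logs):
        if x is not None:
            out.append(x)
            continue
        prev = max((k for k in known if k < i), default=None)
        nxt = min((k for k in known if k > i), default=None)
        if prev is None:
            out.append(logs[nxt])   # before the first voiced frame: hold next value
        elif nxt is None:
            out.append(logs[prev])  # after the last voiced frame: hold previous value
        else:
            w = (i - prev) / (nxt - prev)
            out.append((1 - w) * logs[prev] + w * logs[nxt])
    return out
```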
  • Patent number: 8700390
    Abstract: A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
    Type: Grant
    Filed: October 7, 2013
    Date of Patent: April 15, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Bing Chen, James H. James
  • Patent number: 8694308
    Abstract: A system for voice detection includes a feature value calculation unit that calculates a feature value from an input signal sliced on a per frame basis, a provisional voice/non-voice decision unit that provisionally decides a voiced interval and a non-voiced interval from the feature value calculated on a per frame basis, and a voice/non-voice decision unit that determines a voiced interval duration threshold value or a non-voiced interval duration threshold value, using a ratio of the feature value found on a per frame basis to a threshold value for the feature value and that re-decides the voiced interval and the non-voiced interval, using the voiced interval duration threshold value determined and the non-voiced interval duration threshold value determined.
    Type: Grant
    Filed: November 26, 2008
    Date of Patent: April 8, 2014
    Assignee: NEC Corporation
    Inventors: Takayuki Arakawa, Masanori Tsujikawa
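    The two-stage decision in the abstract above (a provisional per-frame label, then a re-decision using interval-duration thresholds) can be sketched as run-length smoothing. This example uses fixed thresholds; the patent instead derives them from the ratio of the feature value to its threshold, which is not reproduced here. The names `redecide`, `min_voiced`, and `min_unvoiced` are hypothetical.

```python
from itertools import groupby
from typing import List, Sequence


def _flip_short_runs(labels: Sequence[bool], value: bool, min_len: int) -> List[bool]:
    """Flip interior runs of `value` that are shorter than `min_len` frames."""
    runs = [(k, len(list(g))) for k, g in groupby(labels)]
    out: List[bool] = []
    for idx, (k, n) in enumerate(runs):
        interior = 0 < idx < len(runs) - 1  # runs at either edge are left alone
        if k == value and n < min_len and interior:
            out.extend([not value] * n)
        else:
            out.extend([k] * n)
    return out


def redecide(provisional: Sequence[bool],
             min_voiced: int = 2,
             min_unvoiced: int = 2) -> List[bool]:
    """Re-decide voiced (True) / non-voiced (False) frames with duration thresholds."""
    # First bridge short non-voiced gaps, then drop short voiced bursts.
    filled = _flip_short_runs(provisional, False, min_unvoiced)
    return _flip_short_runs(filled, True, min_voiced)
```

    Bridging gaps before removing bursts is a choice made for this sketch; reversing the order can give a different labeling.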
  • Patent number: 8694326
    Abstract: A communication terminal includes a decoder which decodes an input bitstream received from another communication terminal, to generate an output audio signal and outputs the generated output audio signal to a speaker; and an echo canceller which obtains an input audio signal representing sound captured by a microphone placed in a space to which the speaker outputs the sound, and removes, for respective subbands, an echo component included in the obtained input audio signal and corresponding to the output audio signal, to generate an audio signal for transmission. An encoder codes the audio signal for transmission to generate an output bitstream and transmits the generated output bitstream to another communication terminal; and a control unit controls, for the respective subbands, echo cancellation processing according to a reproduction band of at least one of the output audio signal and the audio signal for transmission.
    Type: Grant
    Filed: August 21, 2012
    Date of Patent: April 8, 2014
    Assignee: Panasonic Corporation
    Inventors: Shuji Miyasaka, Kosuke Nishio, Ichiro Kawashima
  • Patent number: 8687831
    Abstract: An external device for a hearing implant system and a hearing implant system having an external device is described. An external transmitter generates a radio-frequency inductive link signal to an implanted receiver including a sequence of data word segments which communicate data to the implanted receiver, and a sequence of data word pause segments between each data word segment which communicate energy without data to the implanted receiver. A data word pause controller controls the inductive link signal during the data word pause segments according to an energy management rule.
    Type: Grant
    Filed: October 26, 2012
    Date of Patent: April 1, 2014
    Assignee: Med-El Elektromedizinische Geraete GmbH
    Inventors: Martin Stoffaneller, Peter Schleich, Thomas Schwarzenbeck
  • Patent number: 8688438
    Abstract: A speech processing system includes a plurality of signal analyzers that extract salient signal attributes of an input voice signal. A difference module computes the differences in the salient signal attributes. One or more control modules control a plurality of speech generators using an output signal from the difference module in a speech-locked loop (SLL); the speech generators use the output signal to generate a voice signal.
    Type: Grant
    Filed: February 9, 2010
    Date of Patent: April 1, 2014
    Assignee: Massachusetts Institute of Technology
    Inventors: Keng Hoong Wee, Lorenzo Turicchia, Rahul Sarpeshkar
  • Publication number: 20140088958
    Abstract: The present invention is a method and system to convert a speech signal into a parametric representation in terms of timbre vectors, and to recover the speech signal therefrom. The speech signal is first segmented into non-overlapping frames using the glottal closure instant information, each frame is converted into an amplitude spectrum using a Fourier analyzer, and Laguerre functions are then used to generate a set of coefficients which constitute a timbre vector. A sequence of timbre vectors can be subject to a variety of manipulations. The new timbre vectors are converted back into voice signals by first transforming them into amplitude spectra using Laguerre functions, then generating phase spectra from the amplitude spectra using Kramers-Kronig relations. A Fourier transformer converts the amplitude spectra and phase spectra into elementary acoustic waves, which are then superposed to become the output voice. The method and system can be used for voice transformation, speech synthesis, and automatic speech recognition.
    Type: Application
    Filed: December 3, 2012
    Publication date: March 27, 2014
    Inventor: Chengjun Julian Chen