Voiced Or Unvoiced Patents (Class 704/214)
  • Patent number: 8407044
    Abstract: A method for discriminating a telephony content signal into a first category or a second category is described. The method comprises a filtering procedure for obtaining from the telephony content signal a band signal set comprising one or more band signals, each band signal being associated with a respective frequency band, at least one of said band signals being a sub-band signal (n) associated with a sub-band of an overall frequency band of the telephony content signal. Furthermore, a determination procedure is provided for determining a band signal variation value (LLn) and a band signal strength value (TLn) for each band signal (n) of said band signal set. Finally, a discrimination procedure discriminates whether the telephony content signal is of the first category or of the second category.
    Type: Grant
    Filed: October 30, 2008
    Date of Patent: March 26, 2013
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventor: Arto Juhani Mahkonen
  • Patent number: 8392179
    Abstract: The invention relates to the coding of audio signals that may include both speech-like and non-speech-like signal components. It describes methods and apparatus for code excited linear prediction (CELP) audio encoding and decoding that employ linear predictive coding (LPC) synthesis filters controlled by LPC parameters, a plurality of codebooks each having codevectors, at least one codebook providing an excitation more appropriate for non-speech-like signals and at least one codebook providing an excitation more appropriate for speech-like signals, and a plurality of gain factors, each associated with a codebook. The encoding methods and apparatus select from the codebooks codevectors and/or associated gain factors by minimizing a measure of the difference between the audio signal and a reconstruction of the audio signal derived from the codebook excitations. The decoding methods and apparatus generate a reconstructed output signal from the LPC parameters, codevectors, and gain factors.
    Type: Grant
    Filed: March 12, 2009
    Date of Patent: March 5, 2013
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Rongshan Yu, Regunathan Radhakrishnan, Robert Andersen, Grant Davidson
  • Patent number: 8374852
    Abstract: Disclosed is a code conversion method to convert a first code sequence conforming to a first speech coding scheme into a second code sequence conforming to a second speech coding scheme. The method includes the following steps. The first step discriminates whether the first code sequence corresponds to a speech part or to a non-speech part, and generates a numerical value that indicates the discrimination result as a control flag. The second step converts the first code sequence into the second code sequence and outputs said second code sequence, when the value of the control flag corresponds to the speech part. The third step outputs the second code sequence that corresponds to the value of the control flag, when the value of the control flag corresponds to the non-speech part.
    Type: Grant
    Filed: March 16, 2006
    Date of Patent: February 12, 2013
    Assignee: NEC Corporation
    Inventor: Atsushi Murashima
  • Patent number: 8370138
    Abstract: A scalable encoding device is capable of improving quality of a decoded signal without increasing an encoding amount and compensating data with a sufficient quality upon data loss. An extension layer bit distribution calculator calculates a bit distribution of quality improving encoding data and compensation encoding data in the extension layer according to an audio mode of the input signal. An extension layer encoder generates quality improving encoding data according to the specified number of bits. A compensation information encoder extracts a part of core layer encoding data and uses it as compensation encoding data for the core layer. An extension layer encoded data generator multiplexes the extension layer bit distribution information, the compensation encoding data, and the quality improving encoding data so as to obtain extension layer encoding data.
    Type: Grant
    Filed: March 15, 2007
    Date of Patent: February 5, 2013
    Assignee: Panasonic Corporation
    Inventors: Takuya Kawashima, Hiroyuki Ehara, Koji Yoshida
  • Patent number: 8370132
    Abstract: Apparatus and methods are provided for measuring perceptual quality of a signal transmitted over a communication network, such as a circuit-switching network, packet-switching network, or a combination thereof. In accordance with one embodiment, a distributed apparatus is provided for measuring perceptual quality of a signal transmitted over a communication network. The distributed apparatus includes communication ports located at various locations in the network. The distributed apparatus may also include a signal processor including a processor for providing non-intrusive measurement of the perceptual quality of the signal. The distributed apparatus may further include recorders operatively connected to the communication ports and to the signal processor, wherein at least one of the recorders processes the signal at one of the communication ports and the recorder sends the signal to the signal processor to measure the perceptual quality of the signal.
    Type: Grant
    Filed: November 21, 2005
    Date of Patent: February 5, 2013
    Assignee: Verizon Services Corp.
    Inventor: Adrian E. Conway
  • Patent number: 8364492
    Abstract: Disclosed is an apparatus including an unvoiced speech input device, a decision unit and an alarm unit. The unvoiced speech input device receives the unvoiced speech, and the decision unit determines whether or not a signal received from the unvoiced speech input device is an ordinary speech. The alarm unit receives a result of the decision from the decision unit to give an alarm when the result of decision indicates the ordinary speech. The alarm is given to a wearer of the apparatus if he/she has made ordinary speech.
    Type: Grant
    Filed: July 6, 2007
    Date of Patent: January 29, 2013
    Assignee: NEC Corporation
    Inventor: Reishi Kondou
  • Patent number: 8346543
    Abstract: A VAD/SS system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
    Type: Grant
    Filed: March 17, 2011
    Date of Patent: January 1, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Bing Chen, James H. James
  • Patent number: 8326612
    Abstract: A non-speech section detecting device generating a plurality of frames having a given time length on the basis of sound data obtained by sampling sound, and detecting a non-speech section having a frame not containing voice data based on speech uttered by a person, the device including: a calculating part calculating a bias of a spectrum obtained by converting sound data of each frame into components on a frequency axis; a judging part judging whether the bias is greater than or equal to a given threshold or alternatively smaller than or equal to a given threshold; a counting part counting the number of consecutive frames judged as having a bias greater than or equal to the threshold or alternatively smaller than or equal to the threshold; a count judging part judging whether the obtained number of consecutive frames is greater than or equal to a given value.
    Type: Grant
    Filed: April 5, 2010
    Date of Patent: December 4, 2012
    Assignee: Fujitsu Limited
    Inventors: Nobuyuki Washio, Shoji Hayakawa
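The judging, counting, and count-judging parts of the abstract above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the per-frame spectrum bias has already been computed (the abstract does not say how), uses only the "greater than or equal to" variant of the comparison, and the names `find_non_speech_sections`, `threshold`, and `min_frames` are hypothetical.

```python
from typing import List, Tuple


def find_non_speech_sections(
    biases: List[float], threshold: float, min_frames: int
) -> List[Tuple[int, int]]:
    """Return (start, end) frame index pairs, end exclusive.

    A section is reported when at least `min_frames` consecutive frames
    have a bias greater than or equal to `threshold`. How the bias is
    computed from the spectrum is outside this sketch (assumption).
    """
    sections = []
    run_start = None  # index of the first frame in the current run
    for i, bias in enumerate(biases):
        if bias >= threshold:
            if run_start is None:
                run_start = i
        else:
            # The run ended; keep it only if it is long enough.
            if run_start is not None and i - run_start >= min_frames:
                sections.append((run_start, i))
            run_start = None
    # Handle a run that extends to the last frame.
    if run_start is not None and len(biases) - run_start >= min_frames:
        sections.append((run_start, len(biases)))
    return sections
```

Requiring a minimum run length keeps a single outlier frame from being reported as a section on its own.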
  • Patent number: 8326613
    Abstract: The present invention relates to a method of synthesizing a signal comprising the steps of determining required pitch bell locations, mapping the required pitch bell locations onto the signal to provide first pitch bell locations, randomizing the first pitch bell locations to provide second pitch bell locations, windowing the signal on the second pitch bell locations to provide a pitch bell, repeating the aforementioned steps for all required pitch bell locations and performing an overlap and add operation with respect to the pitch bells in order to synthesize the signal.
    Type: Grant
    Filed: August 25, 2010
    Date of Patent: December 4, 2012
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Ercan Ferit Gigi
  • Patent number: 8326611
    Abstract: Acoustic Voice Activity Detection (AVAD) methods and systems are described. The AVAD methods and systems, including corresponding algorithms or programs, use microphones to generate virtual directional microphones which have very similar noise responses and very dissimilar speech responses. The ratio of the energies of the virtual microphones is then calculated over a given window size and the ratio can then be used with a variety of methods to generate a VAD signal. The virtual microphones can be constructed using either an adaptive or a fixed filter.
    Type: Grant
    Filed: October 26, 2009
    Date of Patent: December 4, 2012
    Assignee: AliphCom, Inc.
    Inventors: Nicolas Petit, Gregory Burnett, Zhinian Jing
  • Patent number: 8321213
    Abstract: Acoustic Voice Activity Detection (AVAD) methods and systems are described. The AVAD methods and systems, including corresponding algorithms or programs, use microphones to generate virtual directional microphones which have very similar noise responses and very dissimilar speech responses. The ratio of the energies of the virtual microphones is then calculated over a given window size and the ratio can then be used with a variety of methods to generate a VAD signal. The virtual microphones can be constructed using either an adaptive or a fixed filter.
    Type: Grant
    Filed: October 26, 2009
    Date of Patent: November 27, 2012
    Assignee: AliphCom, Inc.
    Inventors: Nicolas Petit, Gregory Burnett, Zhinian Jing
  • Patent number: 8315862
    Abstract: An audio signal quality enhancement apparatus and method. The apparatus includes a pitch calculating unit to extract a pitch period of an audio signal, a frequency domain transforming unit to transform the audio signal to a frequency domain, a frequency band dividing unit to classify the transformed audio signal into audio signals for each of the plurality of frequency bands based on the extracted pitch period, and a pitch enhancement unit to determine a gain based on a volume of the transformed audio signal, and to generate an output signal by multiplying each of the classified audio signals with respect to each of the plurality of frequency bands by the gain, thereby enhancing quality of the audio signal.
    Type: Grant
    Filed: June 5, 2009
    Date of Patent: November 20, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jung Hoe Kim, Ho Chong Park, Eun Mi Oh
  • Patent number: 8315865
    Abstract: A conversation detector and detection method is based on voice band energy detection. The detector is formed of a signal preconditioner, a comparator and an analysis unit. The comparator generates signal pulses reduced in resolution and sample rate (e.g., single bit data) and indicative of energy level and/or duration of activity detected in subject audio signals. The analysis unit determines from the generated signal pulses whether a conversation exists in the subject audio signal. The detector is also able to adapt to environmental noise change, automatically calibrate and operate in low power consumption mode.
    Type: Grant
    Filed: May 4, 2004
    Date of Patent: November 20, 2012
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Benjamin Kuris
  • Patent number: 8315858
    Abstract: A method for digitally recording an analog audio signal with automatic indexing, having the following steps: (a) an analog audio signal containing audio information and signal pauses is read in, (b) the analog audio signal is converted into digital audio data comprising audio information data and signal pause duration data, (c) the audio information data are stored as information data blocks and the signal pause duration data are stored as signal pause data blocks in a memory, (d) the stored data blocks are read sequentially and an index table is produced, any succession of information data blocks which is not interrupted by a signal pause with a predetermined duration being detected as one cohesive audio information data sequence whose start and end are stored in the index table.
    Type: Grant
    Filed: July 13, 2000
    Date of Patent: November 20, 2012
    Inventors: Christian Legl, Michael Hermann
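Step (d) of the abstract above, building the index table from the stored blocks, can be sketched as follows. This is a hedged illustration under stated assumptions: blocks are modeled as `(kind, duration)` tuples, the "predetermined duration" is the hypothetical parameter `min_pause`, and a sequence's start and end are recorded as block indices. None of these representations come from the patent.

```python
from typing import List, Tuple

Block = Tuple[str, float]  # ("info", 0.0) or ("pause", duration in seconds)


def build_index_table(blocks: List[Block], min_pause: float) -> List[Tuple[int, int]]:
    """Return (start, end) block indices of cohesive audio sequences.

    Information blocks separated only by pauses shorter than `min_pause`
    belong to the same sequence; a pause of at least `min_pause` closes
    the current sequence.
    """
    table = []
    start = None      # index of the first info block in the open sequence
    last_info = None  # index of the most recent info block
    for i, (kind, duration) in enumerate(blocks):
        if kind == "info":
            if start is None:
                start = i
            last_info = i
        elif kind == "pause" and duration >= min_pause:
            if start is not None:
                table.append((start, last_info))
                start = None
    # Close a sequence that runs to the end of the recording.
    if start is not None:
        table.append((start, last_info))
    return table
```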
  • Patent number: 8311814
    Abstract: The present invention is directed to a voice activity detector that uses the periodicity of amplitude peaks and valleys to identify signals of substantially fixed power or having periodicity.
    Type: Grant
    Filed: September 19, 2006
    Date of Patent: November 13, 2012
    Assignee: Avaya Inc.
    Inventors: Mei-Sing Ong, Luke A. Tucker
  • Patent number: 8311813
    Abstract: Discrimination between at least two classes of events in an input signal is carried out in the following way. A set of frames containing an input signal is received, and at least two different feature vectors are determined for each of said frames. Said at least two different feature vectors are classified using respective sets of preclassifiers trained for said at least two classes of events. Values for at least one weighting factor are determined based on outputs of said preclassifiers for each of said frames. A combined feature vector is calculated for each of said frames by applying said at least one weighting factor to said at least two different feature vectors. Said combined feature vector is classified using a set of classifiers trained for said at least two classes of events.
    Type: Grant
    Filed: October 26, 2007
    Date of Patent: November 13, 2012
    Assignee: International Business Machines Corporation
    Inventor: Zica Valsan
  • Patent number: 8306823
    Abstract: A speech receiving unit receives a user ID, a speech obtained at a terminal, and an utterance duration, from the terminal. A proximity determining unit calculates a correlation value expressing a correlation between speeches received from plural terminals, compares the correlation value with a first threshold value, and determines that the plural terminals that receive the speeches whose correlation value is calculated are close to each other, when the correlation value is larger than the first threshold value. A dialog detecting unit determines whether a relationship between the utterance durations received from the plural terminals that are determined to be close to each other within an arbitrary target period during which a dialog is to be detected fits a rule. When the relationship is determined to fit the rule, the dialog detecting unit detects dialog information containing the target period and the user ID.
    Type: Grant
    Filed: March 11, 2008
    Date of Patent: November 6, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masayuki Okamoto, Naoki Iketani, Hideo Umeki, Sogo Tsuboi, Kenta Cho, Keisuke Nishimura, Masanori Hattori
  • Patent number: 8296136
    Abstract: A system improves the speech intelligibility and the speech quality of a speech segment. The system includes a dynamic controller that detects a background noise from an input by modeling a signal. A variable gain amplifier adjusts the variable gain of the amplifier in response to an output of the dynamic controller. A shaping filter adjusts a speech signal by tilting portions of the speech signal of the dynamic controller.
    Type: Grant
    Filed: November 15, 2007
    Date of Patent: October 23, 2012
    Assignee: QNX Software Systems Limited
    Inventor: Rajeev Nongpiur
  • Patent number: 8296154
    Abstract: A sound processor including a microphone (1), a pre-amplifier (2), a bank of N parallel filters (3), means for detecting short-duration transitions in the envelope signal of each filter channel, and means for applying gain to the outputs of these filter channels in which the gain is related to a function of the second-order derivative of the slow-varying envelope signal in each filter channel, to assist in perception of low-intensity short-duration speech features in said signal.
    Type: Grant
    Filed: October 28, 2008
    Date of Patent: October 23, 2012
    Assignee: Hearworks Pty Limited
    Inventors: Andrew E. Vandali, Graeme M. Clark
  • Publication number: 20120253796
    Abstract: A sound is picked up by a microphone. A speech waveform signal is generated based on the picked up sound. A speech segment or a non-speech segment is detected based on the speech waveform signal. The speech segment corresponds to a voice input period during which a voice is input. The non-speech segment corresponds to a non-voice input period during which no voice is input. A determination signal is generated that indicates whether the picked up sound is the speech segment or the non-speech segment. A detected state of the speech segment is indicated based on the determination signal.
    Type: Application
    Filed: March 29, 2012
    Publication date: October 4, 2012
    Applicant: JVC KENWOOD Corporation a corporation of Japan
    Inventor: Taichi MAJIMA
  • Patent number: 8280731
    Abstract: A speech enhancement method operative for devices having limited available memory is described. The method is appropriate for very noisy environments and is capable of estimating the relative strengths of speech and noise components during both the presence as well as the absence of speech.
    Type: Grant
    Filed: March 14, 2008
    Date of Patent: October 2, 2012
    Assignee: Dolby Laboratories Licensing Corporation
    Inventor: Rongshan Yu
  • Patent number: 8275609
    Abstract: A voice activity detection (VAD) device and method provide for a VAD threshold that is adaptive to background noise variation.
    Type: Grant
    Filed: December 4, 2009
    Date of Patent: September 25, 2012
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Zhe Wang
  • Patent number: 8254617
    Abstract: Microphone arrays (MAs) are described that position and vent microphones so that performance of a noise suppression system coupled to the microphone array is enhanced. The MA includes at least two physical microphones to receive acoustic signals. The physical microphones make use of a common rear vent (actual or virtual) that samples a common pressure source. The MA includes a physical directional microphone configuration and a virtual directional microphone configuration. By making the input to the rear vents of the microphones (actual or virtual) as similar as possible, the real-world filter to be modeled becomes much simpler to model using an adaptive filter.
    Type: Grant
    Filed: June 27, 2008
    Date of Patent: August 28, 2012
    Assignee: AliphCom, Inc.
    Inventor: Gregory C. Burnett
  • Patent number: 8244528
    Abstract: In accordance with an example embodiment of the invention, there is provided an apparatus for detecting voice activity in an audio signal. The apparatus comprises a first voice activity detector for making a first voice activity detection decision based at least in part on the voice activity of a first audio signal received from a first microphone. The apparatus also comprises a second voice activity detector for making a second voice activity detection decision based at least in part on an estimate of a direction of the first audio signal and an estimate of a direction of a second audio signal received from a second microphone. The apparatus further comprises a classifier for making a third voice activity detection decision based at least in part on the first and second voice activity detection decisions.
    Type: Grant
    Filed: April 25, 2008
    Date of Patent: August 14, 2012
    Assignee: Nokia Corporation
    Inventors: Riitta Elina Niemistö, Päivi Marianna Valve
  • Patent number: 8219391
    Abstract: Presented herein are systems and methods for processing sound signals for use with electronic speech systems. Sound signals are temporally parsed into frames, and the speech system includes a speech codebook having entries corresponding to frame sequences. The system identifies speech sounds in an audio signal using the speech codebook.
    Type: Grant
    Filed: November 6, 2006
    Date of Patent: July 10, 2012
    Assignee: Raytheon BBN Technologies Corp.
    Inventors: Robert David Preuss, Darren Ross Fabbri, Daniel Ramsay Cruthirds
  • Patent number: 8219390
    Abstract: A system and method are disclosed for modifying an audio signal. A pitch associated with the audio signal is detected. A portion of the audio signal that is associated with the detected pitch is modified. Controlling the modification of a primary audio signal is disclosed. The level of a secondary audio signal is monitored. Modification of the primary audio signal is enabled if the level of the secondary audio signal rises above a first prescribed threshold at a time when the primary audio signal is not being modified. Modification of the primary audio signal is disabled if the level of the secondary audio signal drops below a second prescribed threshold at a time when the primary audio signal is being modified.
    Type: Grant
    Filed: September 16, 2003
    Date of Patent: July 10, 2012
    Assignee: Creative Technology Ltd
    Inventor: Jean Laroche
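The enable/disable rule in the abstract above is a two-threshold (hysteresis) control, which can be sketched as follows. This is only an illustration: the function and parameter names are hypothetical, and the level is assumed to be a plain number already measured from the secondary audio signal.

```python
def update_modification_state(
    is_modifying: bool,
    secondary_level: float,
    enable_threshold: float,
    disable_threshold: float,
) -> bool:
    """Return the new on/off state for modifying the primary signal."""
    # Enable only when currently off and the level rises above the first threshold.
    if not is_modifying and secondary_level > enable_threshold:
        return True
    # Disable only when currently on and the level drops below the second threshold.
    if is_modifying and secondary_level < disable_threshold:
        return False
    # Otherwise keep the current state.
    return is_modifying
```

With the disable threshold set below the enable threshold, levels that hover between the two leave the state unchanged, which avoids rapid toggling.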
  • Patent number: 8214222
    Abstract: A method for identifying a frame type is disclosed. The method includes receiving current frame type information, obtaining previously received previous frame type information, generating frame identification information of a current frame using the current frame type information and the previous frame type information, and identifying the current frame using the frame identification information. A further method for identifying a frame type is also disclosed. This method includes receiving a backward type bit corresponding to current frame type information, obtaining a forward type bit corresponding to previous frame type information, and generating frame identification information of a current frame by placing the backward type bit at a first position and placing the forward type bit at a second position.
    Type: Grant
    Filed: May 8, 2009
    Date of Patent: July 3, 2012
    Assignee: LG Electronics Inc.
    Inventors: Sang Bae Chon, Lae Hoon Kim, Koeng Mo Sung
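The second method in the abstract above combines two type bits into one identifier, which can be sketched as follows. The bit layout here is an assumption: the abstract only says "first position" and "second position", so this sketch arbitrarily treats the first position as the more significant bit. The function names and the `initial_previous` parameter are hypothetical.

```python
from typing import List


def frame_identification(backward_bit: int, forward_bit: int) -> int:
    """Pack the two type bits into a 2-bit identifier.

    Assumption: the backward type bit (current frame) occupies the more
    significant position and the forward type bit (previous frame) the
    less significant one.
    """
    return ((backward_bit & 1) << 1) | (forward_bit & 1)


def identify_frames(type_bits: List[int], initial_previous: int = 0) -> List[int]:
    """Identify each frame from its own type bit and its predecessor's."""
    ids = []
    previous = initial_previous
    for current in type_bits:
        ids.append(frame_identification(current, previous))
        previous = current  # the current bit becomes the next frame's forward bit
    return ids
```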
  • Patent number: 8214200
    Abstract: Methods and apparatus are disclosed for approximating an MDCT coefficient of a block of windowed sinusoid having a defined frequency, the block being multiplied by a window sequence and having a block length and a block index. A finite trigonometric series is employed to approximate the window sequence. A window summation table is pre-computed using the finite trigonometric series and the defined frequency of the sinusoid. A block phase is computed for each block with the defined frequency, the block length and the block index. An MDCT coefficient is approximated by the dot product of a phase vector computed using the block phase with a corresponding row of the window summation table.
    Type: Grant
    Filed: March 14, 2007
    Date of Patent: July 3, 2012
    Assignee: XFRM, Inc.
    Inventors: Richard C. Cabot, Matthew S. Ashman
  • Patent number: 8209167
    Abstract: The mobile radio terminal includes a speech input unit which inputs a speech signal obtained from speech of a speaking person, an estimating unit which estimates a speech style of the speaking person from the speech signal, and a converting unit which converts the speech signal into a converted speech signal in accordance with the estimated speech style.
    Type: Grant
    Filed: September 17, 2008
    Date of Patent: June 26, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Kazunori Imoto
  • Patent number: 8204743
    Abstract: An apparatus and method for concealing frame erasure and a voice decoding apparatus and method using the same. The frame erasure concealment apparatus includes: a parameter extraction unit determining whether there is an erased frame in a voice packet, and extracting an excitement signal parameter and a line spectrum pair parameter of a previous good frame; and an erasure frame concealment unit, if there is an erased frame, restoring the excitement signal and line spectrum pair parameter of the erased frame by using a regression analysis from the excitement signal and line spectrum pair parameter of the previous good frame. According to the method and apparatus, by predicting and restoring the parameter of the erased frame through the regression analysis, the quality of the restored voice signal can be enhanced and the algorithm can be simplified.
    Type: Grant
    Filed: May 4, 2006
    Date of Patent: June 19, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hosang Sung, Kangeun Lee, Seungho Choi
  • Patent number: 8195449
    Abstract: A non-intrusive signal quality assessment apparatus includes a feature vector calculator that determines parameters representing frames of a signal and extracts a collection of per-frame feature vectors representing structural information of the signal from the parameters. A frame selector preferably selects only frames with a feature vector lying within a predetermined multi-dimensional window. Means determine a global feature set over the collection of feature vectors from statistical moments of selected feature vector components. A quality predictor predicts a signal quality measure from the global feature set.
    Type: Grant
    Filed: January 30, 2007
    Date of Patent: June 5, 2012
    Assignee: Telefonaktiebolaget L M Ericsson (Publ)
    Inventors: Stefan Bruhn, Volodya Grancharov, Willem Bastiaan Kleijn
  • Patent number: 8195451
    Abstract: In an information detecting apparatus (1), a speech kind discrimination unit (11) discriminates and classifies an audio signal at an information source into kinds (categories) such as music or speech on a predetermined time basis, and a memory unit/recording medium (13) records discrimination information thereof. A discrimination frequency calculating unit (15) calculates, on a predetermined time basis, the discrimination frequency of every kind over a predetermined time period longer than the time unit.
    Type: Grant
    Filed: February 10, 2004
    Date of Patent: June 5, 2012
    Assignee: Sony Corporation
    Inventor: Yasuhiro Toguri
  • Patent number: 8175868
    Abstract: A voice judging system including feature value extraction means that analyzes a sound signal input from a sound signal input device and extracts a time series of the feature values, sub-word boundary score calculating means that calculates a time series of sub-word boundary scores by having reference to sound models of voice stored in a voice model storage unit, temporal regularity analyzing means that analyzes temporal regularity of the sub-word boundary scores, and voice judgment means that judges whether the input sound signal is voice or non-voice using the temporal regularity of the sub-word boundary scores.
    Type: Grant
    Filed: October 10, 2006
    Date of Patent: May 8, 2012
    Assignee: NEC Corporation
    Inventor: Makoto Terao
  • Publication number: 20120084082
    Abstract: Encoding audio signals by selecting an encoding mode for encoding the signal, categorizing the signal into active segments having voice activity and non-active segments having substantially no voice activity by using categorization parameters depending on the selected encoding mode, and encoding at least the active segments using the selected encoding mode.
    Type: Application
    Filed: September 29, 2011
    Publication date: April 5, 2012
    Applicant: Nokia Corporation
    Inventors: Kari Järvinen, Pasi Ojala, Ari Lakaniemi
  • Patent number: 8145982
    Abstract: Aspects of a method and system for redundancy-based decoding of voice content in a wireless local area network (WLAN) system are provided. A WLAN receiver may determine whether a decoded portion of a received packet comprises voice content and may select a redundancy-based decoder to decode a remaining portion of the packet when voice content is detected. The redundancy-based decoder may be a Viterbi decoder. The redundancy-based decoder may be selected to decode a determined number of subsequent packets or to decode subsequent packets for a determined amount of time. After decoding the remaining portion of the packet and any subsequent packets, the WLAN receiver may select a standard Viterbi decoder to decode additional received packets. The WLAN receiver may generate at least one signal to select the redundancy-based decoder and the standard Viterbi decoder.
    Type: Grant
    Filed: January 12, 2011
    Date of Patent: March 27, 2012
    Assignee: Broadcom Corporation
    Inventors: Jeyhan Karaoguz, Hooman Honary, Nambirajan Seshadri, Jason A. Trachewski, Arie Heiman
  • Patent number: 8126705
    Abstract: A system and method for automatically adjusting floor controls for a conversation is provided. Audio streams are received, which each originate from an audio source. Floor controls for a current configuration including at least a portion of the audio streams are maintained. Conversational characteristics shared by two or more of the audio sources are determined. Possible configurations for the audio streams are identified based on the conversational characteristics. An analysis of the current configuration and the possible configurations is performed. A change threshold is applied to the analysis. When the analysis satisfies the change threshold, the floor controls are automatically adjusted. The audio streams are mixed into one or more outputs based on the adjusted floor controls.
    Type: Grant
    Filed: November 9, 2009
    Date of Patent: February 28, 2012
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Paul Masami Aoki, Margaret H. Szymanski, James D. Thornton, Daniel H. Wilson, Allison Gyle Woodruff
  • Patent number: 8108211
    Abstract: A fast accurate multi-channel frequency dependent scheme for analyzing noise in a signal processing system is described herein. Noise is decomposed within each channel into frequency bands and sub-band noise is propagated. To avoid the computational complexity of a convolution, traditional methods either assume the noise to be white, at any point in the signal processing pipeline, or they just ignore spatial operations. By assuming the noise to be white within each frequency band, it is possible to propagate any type of noise (white, colored, Gaussian, non-Gaussian and others) across a spatial transformation in a very fast and accurate manner. To demonstrate the efficacy of this technique, noise propagation is considered across various spatial operations in an image processing pipeline. Furthermore, the computational complexity is a very small fraction of the computational cost of propagating an image through a signal processing system.
    Type: Grant
    Filed: March 29, 2007
    Date of Patent: January 31, 2012
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Farhan A. Baqai, Akira Matsui, Kenichi Nishio
  • Patent number: 8102766
    Abstract: According to one embodiment of the invention, a method for managing time-sensitive packetized data streams at a receiver includes receiving a time-sensitive packet of a data stream, analyzing an energy level of a payload signal of the packet, and determining whether to drop the packet based on the energy level of the payload signal.
    Type: Grant
    Filed: November 2, 2006
    Date of Patent: January 24, 2012
    Assignee: Cisco Technology, Inc.
    Inventors: Paul S. Hahn, Michael E. Knappe, Richard A. Dunlap, Luke K. Surazski
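
The energy-based drop decision in the abstract above can be illustrated with a minimal sketch. The abstract does not give the payload format, the energy measure, or the threshold, so everything here is an assumption: the payload is taken to be already-decoded 16-bit little-endian linear PCM, energy is the mean square in dB relative to full scale, and `payload_energy_db`, `should_drop`, and the -50 dB default are hypothetical names and values chosen for illustration only.

```python
import math
import struct


def payload_energy_db(payload: bytes) -> float:
    """Mean-square energy of 16-bit little-endian PCM samples, in dB re full scale.

    Assumption: the payload is linear PCM; a real receiver would usually
    have to decode the payload (or inspect codec parameters) first.
    """
    n = len(payload) // 2
    if n == 0:
        return float("-inf")
    samples = struct.unpack("<%dh" % n, payload[: 2 * n])
    mean_square = sum(s * s for s in samples) / n
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / (32768.0 ** 2))


def should_drop(payload: bytes, threshold_db: float = -50.0) -> bool:
    """Return True when the payload energy falls below the (assumed) threshold."""
    return payload_energy_db(payload) < threshold_db


# Usage: a silent payload is a drop candidate, a loud one is kept.
silent = struct.pack("<4h", 0, 0, 0, 0)
loud = struct.pack("<4h", 16000, -16000, 16000, -16000)
print(should_drop(silent), should_drop(loud))
```

In this sketch only low-energy packets are candidates for dropping; whether the patented method also weighs buffer state or timing is not stated in the abstract.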
  • Patent number: 8068619
    Abstract: A small array microphone system includes an array microphone having a plurality of microphones and operative to provide a plurality of received signals, each microphone providing one received signal. A first voice activity detector (VAD) provides a first voice detection signal generated using the plurality of received signals to indicate the presence or absence of in-beam desired speech. A second VAD provides a second voice detection signal generated using the plurality of received signals to indicate the presence or absence of out-of-beam noise when in-beam desired speech is absent. A reference signal generator provides a reference signal based on the first voice detection signal, the plurality of received signals, and a beamformed signal, wherein the reference signal has the desired speech suppressed. A beamformer provides the beamformed signal based on the second voice detection signal, the reference signal, and the plurality of received signals, wherein the beamformed signal has noise suppressed.
    Type: Grant
    Filed: January 5, 2007
    Date of Patent: November 29, 2011
    Assignee: Fortemedia, Inc.
    Inventors: Ming Zhang, Xiaoyan Lu
  • Patent number: 8069039
    Abstract: In a sound signal processing apparatus, a frame information generation section generates frame information of each frame of a sound signal. A storage stores the frame information generated by the frame information generation section. A first interval determination section determines a first utterance interval in the sound signal. A second interval determination section determines a second utterance interval based on the frame information of the first utterance interval stored in the storage such that the second utterance interval is made shorter than the first utterance interval and confined within the first utterance interval by trimming frames from either of a start point or an end point of the first utterance interval.
    Type: Grant
    Filed: December 21, 2007
    Date of Patent: November 29, 2011
    Assignee: Yamaha Corporation
    Inventor: Yasuo Yoshioka
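
As a rough illustration of the two-stage interval determination described above, the sketch below trims a first utterance interval down to a second one using stored per-frame information. The abstract does not say what the frame information contains or what the trimming criterion is; here it is assumed to be a single level value per frame compared against a fixed minimum, and `trim_interval`, `frame_levels`, and `min_level` are hypothetical names.

```python
from typing import List, Tuple


def trim_interval(
    frame_levels: List[float],
    first: Tuple[int, int],
    min_level: float,
) -> Tuple[int, int]:
    """Trim low-level frames from both ends of an inclusive frame interval.

    The result always lies within `first`; it is strictly shorter only if
    at least one boundary frame is below `min_level` (an assumed criterion).
    """
    start, end = first
    # Advance the start point past frames below the assumed level threshold.
    while start < end and frame_levels[start] < min_level:
        start += 1
    # Pull the end point back in the same way.
    while end > start and frame_levels[end] < min_level:
        end -= 1
    return start, end


# Usage: frames 0-1 and 5-6 are trimmed from the first interval (0, 6).
levels = [0.1, 0.2, 0.9, 0.8, 0.7, 0.1, 0.05]
print(trim_interval(levels, (0, 6), min_level=0.5))
```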
  • Patent number: 8069034
    Abstract: A method for supporting an encoding of an audio signal is shown, wherein at least a first and a second coder mode are available for encoding a section of the audio signal. The first coder mode enables a coding based on two different coding models. A selection of a coding model is enabled by a selection rule which is based on signal characteristics which have been determined for a certain analysis window. In order to avoid a misclassification of a section after a switch to the first coder mode, it is proposed that the selection rule is activated only when sufficient sections for the analysis window have been received. The invention relates equally to a module in which this method is implemented, to a device and a system comprising such a module and to a software program product including a software code for realizing the proposed method.
    Type: Grant
    Filed: May 6, 2005
    Date of Patent: November 29, 2011
    Assignee: Nokia Corporation
    Inventors: Jari Mäkinen, Ari Lakaniemi, Pasi Ojala
  • Patent number: 8063809
    Abstract: A transient signal encoding method and device, decoding method and device, and processing system, where the transient signal encoding method includes: obtaining a reference sub-frame where a maximal time envelope having a maximal amplitude value is located from time envelopes of all sub-frames of an input transient signal; adjusting an amplitude value of the time envelope of each sub-frame before the reference sub-frame in such a way that a first difference is greater than a preset first threshold, in which the first difference is a difference between the amplitude value of the time envelope of each sub-frame before the reference sub-frame and the amplitude value of the maximal time envelope; and writing the adjusted time envelope into bitstream.
    Type: Grant
    Filed: June 29, 2011
    Date of Patent: November 22, 2011
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Zexin Liu, Longyin Chen, Lei Miao, Chen Hu, Wei Xiao, Herve Marcel Taddei, Qing Zhang
  • Patent number: 8065141
    Abstract: A signal processing apparatus includes a decoding unit, an analyzing unit, a synthesizing unit, and a selecting unit. The decoding unit decodes an input encoded audio signal and outputs a playback audio signal. When loss of the encoded audio signal occurs, the analyzing unit analyzes the playback audio signal output before the loss occurs and generates a linear predictive residual signal. The synthesizing unit synthesizes a synthesized audio signal on the basis of the linear predictive residual signal. The selecting unit selects one of the synthesized audio signal and the playback audio signal and outputs the selected audio signal as a continuous output audio signal.
    Type: Grant
    Filed: August 24, 2007
    Date of Patent: November 22, 2011
    Assignee: Sony Corporation
    Inventor: Yuuji Maeda
  • Patent number: 8065139
    Abstract: There is described a method of encoding an input signal (20) to generate a corresponding encoded output signal (30), and also encoders (10) arranged to implement the method. The method comprises steps of: (a) distributing the input signal to sub-encoders (300, 310, 320) of the encoder (10); (b) processing the distributed input signal (20) at the sub-encoders (300, 310, 320) to generate corresponding representative parameter outputs (200, 210, 220) from the sub-encoders (300, 310, 320); and (c) combining the parameter outputs (200, 210, 220) of the sub-encoders (300, 310, 320) to generate the encoded output signal (30). Processing of the input signal (20) in the sub-encoders (300, 310, 320) involves segmenting the input signal (20) for analysis, such segments having associated temporal durations which are dynamically variable at least partially in response to information content present in the input signal (20).
    Type: Grant
    Filed: June 14, 2005
    Date of Patent: November 22, 2011
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Valery Stephanovich Kot
  • Patent number: 8065137
    Abstract: A system and apparatus for establishing whether a received signal frame is an audio signal frame is disclosed. In one embodiment, the system includes a predetermined position in an audio signal frame containing a piece of secondary information for an audio characteristic of the audio data, with a selection device for selecting a succession of bits which is arranged at the predetermined position in the received signal frame. A decision-making device flags the received signal frame as an audio signal frame if the succession of bits represents the piece of secondary information.
    Type: Grant
    Filed: February 9, 2007
    Date of Patent: November 22, 2011
    Assignee: Infineon Technologies AG
    Inventors: Norbert Metz, Johann Steger, Thomas Hauser, Martin Krueger
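
The selection and decision steps in the abstract above amount to reading a run of bits at a fixed position and checking whether it encodes a plausible piece of secondary information. A minimal sketch follows, with several assumptions: bits are numbered from the most significant bit of the first byte, the set of valid codes is supplied by the caller, and `extract_bits`, `is_audio_frame`, and `valid_codes` are hypothetical names that do not come from the patent.

```python
from typing import Set


def extract_bits(frame: bytes, bit_offset: int, bit_count: int) -> int:
    """Return `bit_count` bits starting at `bit_offset` (MSB-first numbering)."""
    total_bits = len(frame) * 8
    if bit_offset < 0 or bit_count <= 0 or bit_offset + bit_count > total_bits:
        raise ValueError("bit range lies outside the frame")
    value = int.from_bytes(frame, "big")
    shift = total_bits - bit_offset - bit_count
    return (value >> shift) & ((1 << bit_count) - 1)


def is_audio_frame(
    frame: bytes, bit_offset: int, bit_count: int, valid_codes: Set[int]
) -> bool:
    """Flag the frame as audio if the selected bits match an assumed valid code."""
    try:
        bits = extract_bits(frame, bit_offset, bit_count)
    except ValueError:
        # A frame too short to hold the field cannot carry the information.
        return False
    return bits in valid_codes


# Usage: the first four bits of this frame are 1011, an assumed valid code.
frame = bytes([0b10110000, 0x00])
print(is_audio_frame(frame, 0, 4, {0b1011}))
```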
  • Patent number: 8050911
    Abstract: A system, method, apparatus, signal-bearing medium, and means for transmitting speech activity in a distributed voice recognition (VR) system. The distributed voice recognition system includes a local VR engine in a subscriber unit (102) and a server VR engine on a server (160). The local VR engine comprises a voice activity detection (VAD) module (106) that detects voice activity within a speech signal, and comprises an advanced feature extraction (AFE) module (104) that extracts features from a speech signal. The detected voice activity information is transmitted over a first wireless communication channel to the server (160). The feature extraction information is transmitted over a second wireless communication channel, separate from the first wireless communication channel, to the server (160). The server (160) processes the received information to determine a linguistic estimate of the electrical speech signal, and transmits the linguistic estimate to the subscriber unit (102).
    Type: Grant
    Filed: March 1, 2007
    Date of Patent: November 1, 2011
    Assignee: QUALCOMM Incorporated
    Inventor: Harinath Garudadri
  • Publication number: 20110257966
    Abstract: A method of providing voice updates is disclosed and may include receiving a voice update. The method may also include scheduling a voice update window. The voice update window may be a predetermined time window in which a voice update is broadcast.
    Type: Application
    Filed: April 19, 2010
    Publication date: October 20, 2011
    Inventor: Bohuslav Rychlik
  • Patent number: 8036884
    Abstract: The present invention provides a method, a computer-software-product and an apparatus for enabling a determination of speech related audio data within a record of digital audio data. The method comprises steps for extracting audio features from the record of digital audio data, for classifying one or more subsections of the record of digital audio data, and for marking at least a part of the record of digital audio data classified as speech. The classification of the digital audio data record is performed on the basis of the extracted audio features and with respect to at least one predetermined audio class.
    Type: Grant
    Filed: February 24, 2005
    Date of Patent: October 11, 2011
    Assignee: Sony Deutschland GmbH
    Inventors: Yin Hay Lam, Josep Maria Sola I Caros
  • Patent number: 8015000
    Abstract: An audio decoding system performs frame loss concealment (FLC) when portions of a bit stream representing an audio signal are lost within the context of a digital communication system. The audio decoding system employs two different FLC methods: one designed to perform well for music, and the other designed to perform well for speech. When a frame is deemed lost, the audio decoding system analyzes a previously-decoded audio signal corresponding to previously-decoded frames of an audio bit-stream. Based on the results of the analysis, the lost frame is classified as either speech or music. Using this classification, other signal analysis, and knowledge of the employed FLC methods, the audio decoding system selects the appropriate FLC method which then performs FLC on the lost frame.
    Type: Grant
    Filed: April 13, 2007
    Date of Patent: September 6, 2011
    Assignee: Broadcom Corporation
    Inventors: Robert W. Zopf, Juin-Hwey Chen, Jes Thyssen
  • Patent number: 8005666
    Abstract: An automatic system for temporal alignment between a music audio signal and lyrics is provided. The automatic system can prevent accuracy for temporal alignment from being lowered due to the influence of non-vocal sections. Alignment means of the system is provided with a phone model for singing voice that estimates phonemes corresponding to temporal-alignment features or features available for temporal alignment. The alignment means receives temporal-alignment features outputted from temporal-alignment feature extraction means, information on the vocal and non-vocal sections outputted from vocal section estimation means, and a phoneme network, and performs an alignment operation on condition that no phoneme exists at least in non-vocal sections.
    Type: Grant
    Filed: August 7, 2007
    Date of Patent: August 23, 2011
    Assignee: National Institute of Advanced Industrial Science and Technology
    Inventors: Masataka Goto, Hiromasa Fujihara, Hiroshi Okuno