Modification Of At Least One Characteristic Of Speech Waves (epo) Patents (Class 704/E21.001)

E Subclasses

Speech enhancement, e.g., noise reduction, echo cancellation, etc. (epo) (Class 704/E21.002)

Time compression or expansion (epo) (Class 704/E21.017)

Suppression or repetition of time signal segments (EPO) (Class 704/E21.018)

Transformation of speech into a nonaudible representation, e.g., speech visualization, speech processing for tactile aids, etc. (epo) (Class 704/E21.019)

Synchronization of speech with image or synthesis of the lips movement from speech, e.g., for "talking heads," etc. (EPO) (Class 704/E21.02)

SYSTEMS AND METHODS FOR RESPONDING TO NATURAL LANGUAGE SPEECH UTTERANCE

Publication number: 20090171664

Abstract: Systems and methods for receiving natural language queries and/or commands and executing the queries and/or commands. The systems and methods overcome the deficiencies of prior art speech query and response systems through the application of a complete speech-based information query, retrieval, presentation and command environment. This environment makes significant use of context, prior information, domain knowledge, and user specific profile data to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a complete speech-based natural language query and response environment can be created. The systems and methods create, store and use extensive personal profile information for each user, thereby improving the reliability of determining the context and presenting the expected results for a particular question or command.

Type: Application

Filed: February 4, 2009

Publication date: July 2, 2009

Inventors: Robert A. Kennewick, David Locke, Michael R. Kennewick, SR., Michael R. Kennewick, JR., Richard Kennewick, Tom Freeman
Systems and methods for altering speech during cellular phone use

Publication number: 20090171670

Abstract: The present invention includes systems and methods for altering a cellular phone user's speech so that the speech can be less bothersome to third parties in the surrounding area and so that the user has more privacy. Sound cancellation can be used to cancel, reduce, or modify the user's voice so third parties cannot hear the voice as easily or so that the user's voice cannot be understood. Furthermore, the user device can encourage the user to speak in a lower voice. The user device can accomplish this encouragement by indicating to the user their level of speech. In this manner, the user knows when he may lower his voice and yet still provide an adequate volume of speech for the cellular phone. Additionally, the user device can encourage the user to speak in a lower voice by audibly playing back the user's voice in real time.

Type: Application

Filed: March 28, 2008

Publication date: July 2, 2009

Applicant: Apple Inc.

Inventors: Robert Bailey, Lawrence Heyl, Stephan Schell
METHOD AND APPARATUS OF AUDIO MATRIX ENCODING/DECODING

Publication number: 20090164225

Abstract: A method of audio matrix encoding/decoding, which encodes and decodes audio signals of two or more channels into an audio signal of one or more channels while preserving the direction of a sound image, includes extracting pieces of sound image information from multi-channel audio signals, encoding the extracted sound image information and allocating it to an inaudible frequency domain outside the audible frequency domain, and adding the sound image information allocated to the inaudible frequency domain to matrix-encoded stereo signals of the audible frequency domain.

Type: Application

Filed: June 12, 2008

Publication date: June 25, 2009

Applicant: Samsung Electronics Co., Ltd.

Inventor: Sung-ho CHO
METHOD AND APPARATUS FOR ALIGNING PARALLEL SPOKEN LANGUAGE CORPORA

Publication number: 20090164208

Abstract: The method for aligning parallel spoken language corpora comprises obtaining a word alignment set, based on a statistics method and dictionaries, from the parallel spoken language corpora; aligning chunks of the parallel spoken language corpora by using that word alignment set to obtain a chunk alignment set; and aligning words within the aligned chunks to obtain a chunk alignment-based word alignment set. The chunk alignment set and word alignment set are thus obtained by aligning chunks of the parallel spoken language corpora in a corpus repository using the high-precision, statistics- and dictionary-based word alignment set and then aligning words within those chunks. Using these sets in speech-to-speech machine translation reduces the ambiguities of spoken language word alignment by exploiting the integrality of chunks.

Type: Application

Filed: December 16, 2008

Publication date: June 25, 2009

Inventors: Ren DENGJUN, Wu HUA, Wang HAIFENG
DEVICE WITH VOICE-ASSISTED SYSTEM

Publication number: 20090164215

Abstract: A device with a voice-assisted system is provided that uses a voice command to adjust operations. The voice-assisted system includes a voice recognition engine and a control unit. The voice recognition engine receives a voice command and outputs a voice signal based on the voice command to the control unit. The control unit adjusts the operations based on the voice signal. A user is only required to input the voice command. The voice recognition engine performs a series of actions to adjust the operations. Therefore, the voice-assisted system can enhance convenience of adjusting the operations of the device and reduce operation complexity for the user.

Type: Application

Filed: February 27, 2009

Publication date: June 25, 2009

Applicant: DELTA ELECTRONICS, INC.

Inventors: Yuan-Chia Lu, Liang-Sheng Huang, Jia-Lin Shen
Extraction and Matching of Characteristic Fingerprints from Audio Signals

Publication number: 20090157391

Abstract: An audio fingerprint is extracted from an audio sample, where the fingerprint contains information that is characteristic of the content in the sample. The fingerprint may be generated by computing an energy spectrum for the audio sample, resampling the energy spectrum logarithmically in the time dimension, transforming the resampled energy spectrum to produce a series of feature vectors, and computing the fingerprint using differential coding of the feature vectors. The generated fingerprint can be compared to a set of reference fingerprints in a database to identify the original audio content.

Type: Application

Filed: February 24, 2009

Publication date: June 18, 2009

Inventor: Sergiy Bilobrov
METHOD AND APPARATUS FOR TRAINING DIFFERENCE PROSODY ADAPTATION MODEL, METHOD AND APPARATUS FOR GENERATING DIFFERENCE PROSODY ADAPTATION MODEL, METHOD AND APPARATUS FOR PROSODY PREDICTION, METHOD AND APPARATUS FOR SPEECH SYNTHESIS

Publication number: 20090157409

Abstract: A method includes generating, for each parameter of the prosody vector, an initial parameter prediction model with a plurality of attributes related to difference prosody prediction and at least part of attribute combinations of the plurality of attributes, in which each of the plurality of attributes and the attribute combinations is included as an item, calculating importance of each item in the parameter prediction model, deleting the item having the lowest importance calculated, re-generating a parameter prediction model with the remaining items, determining whether the re-generated parameter prediction model is an optimal model, and repeating the step of calculating importance and the steps following the step of calculating importance with the re-generated parameter prediction model, if the re-generated parameter prediction model is determined as not an optimal model, wherein the difference prosody vector and all parameter prediction models of the difference prosody vector constitute the difference pros

Type: Application

Filed: December 4, 2008

Publication date: June 18, 2009

Inventors: Yi Lifu, Li Jian, Lou Xiaoyan, Hao Jie
MDCT domain post-filtering apparatus and method for quality enhancement of speech

Publication number: 20090150143

Abstract: A post-filtering apparatus and method for speech enhancement in a modified discrete cosine transform (MDCT) domain are disclosed. In the apparatus and method, previous and current MDCT coefficients are used for obtaining a speech spectrum coefficient similar to a real speech spectrum, and a convex function is used for transforming the speech spectrum coefficient and obtaining a post-filter coefficient so that difference can increase in the case where the speech spectrum coefficient is small but decrease in the case where the coefficient is large. Then, the post-filter coefficient is applied to the MDCT coefficient. With this configuration, both the current and previous MDCT values are used, so that it is possible to obtain a spectrum coefficient similar to the real speech spectrum and to obtain a more accurate filter coefficient. Further, the coefficient is adaptively transformed through the convex function, thereby enhancing speech quality.

Type: Application

Filed: June 5, 2008

Publication date: June 11, 2009

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventors: Hyun-woo Kim, Jong-mo Sung, Mi-suk Lee, Do-young Kim, Byung-sun Lee
Audio Coding System Using Temporal Shape of a Decoded Signal to Adapt Synthesized Spectral Components

Publication number: 20090144055

Abstract: A receiver in an audio coding system receives a signal conveying frequency subband signals representing an audio signal. The subband signals are examined to assess one or more characteristics of the audio signal including temporal shape. Spectral components are synthesized having the one or more assessed characteristics, integrated with the subband signals and passed through a synthesis filterbank to generate an output signal.

Type: Application

Filed: February 4, 2009

Publication date: June 4, 2009

Applicant: Dolby Laboratories Licensing Corporation

Inventors: Grant Allen Davidson, Michael Mead Truman, Matthew Conrad Fellers, Mark Stuart Vinton
POINTING APPARATUS CAPABLE OF PROVIDING HAPTIC FEEDBACK, AND HAPTIC INTERACTION SYSTEM AND METHOD USING THE SAME

Publication number: 20090135164

Abstract: Provided are a pointing apparatus capable of providing haptic feedback, and a haptic interaction system and method using the same. The pointing apparatus includes a wireless communication unit, a controller, and a haptic stimulator. The wireless communication unit receives an event including haptic output information through wireless communication with the outside. The controller generates a control signal for reproducing a haptic pattern corresponding to the haptic output information. The haptic stimulator reproduces the haptic pattern by means of the control signal. Thus, it is possible to increase the performance and usability of a user interface of a user terminal including a touch screen.

Type: Application

Filed: November 21, 2008

Publication date: May 28, 2009

Inventors: Ki Uk KYUNG, Jun Young LEE, Jun Seok PARK, Chang Seok BAE, Dong Won HAN, Jin Tae KIM
CONVERSION DEVICE

Publication number: 20090132243

Abstract: A plurality of pairs of segments to be weighted/added are selected non-linearly with respect to a time axis of audio data. A speed conversion is achieved by performing the weighting/addition on the selected pairs of segments. The non-linear selection is performed by (a) obtaining all possible pairs of segments constituting the audio data, (b) calculating a degree of similarity pertaining to each possible pair, (c) ranking the all possible pairs of segments according to the degrees of similarity, and (d) overlapping at least one of the all possible pairs of segments that holds the highest degree of similarity.
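
The selection steps (a) through (d) can be illustrated with a minimal Python sketch. It is not the applicant's implementation: it assumes fixed-length segments, considers only adjacent pairs, uses normalized cross-correlation as the similarity measure, and uses a linear cross-fade as the weighting/addition. The names `similarity`, `crossfade` and `shorten_once` are hypothetical.

```python
import math


def similarity(a, b):
    """Normalized cross-correlation of two equal-length segments (assumed measure)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0


def crossfade(a, b):
    """Weighted addition: fade segment a out while fading segment b in."""
    n = len(a)
    return [(1 - i / n) * x + (i / n) * y for i, (x, y) in enumerate(zip(a, b))]


def shorten_once(samples, seg_len):
    """Overlap the most similar adjacent segment pair, removing seg_len samples."""
    # (a) split the audio into segments; adjacent segments form the candidate pairs
    segments = [samples[i:i + seg_len]
                for i in range(0, len(samples) - seg_len + 1, seg_len)]
    if len(segments) < 2:
        return list(samples)
    tail = samples[len(segments) * seg_len:]
    # (b) + (c) score every candidate pair and pick the highest-ranked one
    best = max(range(len(segments) - 1),
               key=lambda i: similarity(segments[i], segments[i + 1]))
    # (d) overlap the winning pair by weighted addition
    merged = crossfade(segments[best], segments[best + 1])
    kept = segments[:best] + [merged] + segments[best + 2:]
    return [s for seg in kept for s in seg] + list(tail)
```

Calling `shorten_once` repeatedly removes one segment per call, so the speed-up is applied where the audio is most self-similar rather than uniformly along the time axis.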

Type: Application

Filed: January 23, 2007

Publication date: May 21, 2009

Inventor: Ryoji Suzuki
EFFICIENT METHOD FOR REUSING SCALE FACTORS TO IMPROVE THE EFFICIENCY OF AN AUDIO ENCODER

Publication number: 20090132238

Abstract: An audio encoding system that accepts an audio signal as an input to the system. The system includes a filter bank that splits the audio signal into a plurality of frames, and a bit allocation unit that assigns a number of bits for a current frame of the plurality of frames. The system further includes a scale factor unit that calculates a scale factor, identifies a block type of a first block of a current frame, identifies a block type of a second block consecutive to the first block, and reuses a scale factor of the first block for the second block, when the block type of the first block and the block type of the second block match. The system additionally includes a quantization and coding unit that quantizes and codes the signal, and a bit rate checker that verifies whether a bit rate requirement is satisfied.
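
The scale factor reuse rule can be sketched in a few lines of Python. This is an illustration under assumptions: the abstract does not say how a scale factor is calculated, so the peak magnitude of the block stands in for it, and the block type labels (`"long"`, `"short"`) are hypothetical.

```python
def compute_scale_factor(samples):
    """Stand-in scale factor: the block's peak magnitude (an assumption)."""
    return max(abs(s) for s in samples)


def scale_factors_for_frame(blocks):
    """Return (scale_factors, number_of_calculations) for one frame.

    blocks is a list of (block_type, samples) tuples in encoding order.
    """
    factors = []
    calculations = 0
    prev_type = None
    prev_factor = None
    for block_type, samples in blocks:
        if block_type == prev_type:
            # Consecutive blocks of the same type: reuse the previous factor.
            factor = prev_factor
        else:
            factor = compute_scale_factor(samples)
            calculations += 1
        factors.append(factor)
        prev_type, prev_factor = block_type, factor
    return factors, calculations
```

The second return value shows the saving: a frame of four blocks with two type changes needs only two scale factor calculations.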

Type: Application

Filed: October 31, 2008

Publication date: May 21, 2009

Inventor: B. SUDHAKAR
Audio Signal De-Identification

Publication number: 20090132239

Abstract: Techniques are disclosed for automatically de-identifying spoken audio signals. In particular, techniques are disclosed for automatically removing personally identifying information from spoken audio signals and replacing such information with non-personally identifying information. De-identification of a spoken audio signal may be performed by automatically generating a report based on the spoken audio signal. The report may include concept content (e.g., text) corresponding to one or more concepts represented by the spoken audio signal. The report may also include timestamps indicating temporal positions of speech in the spoken audio signal that corresponds to the concept content. Concept content that represents personally identifying information is identified. Audio corresponding to the personally identifying concept content is removed from the spoken audio signal. The removed audio may be replaced with non-personally identifying audio.
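
A minimal Python sketch of the removal and replacement step follows. It assumes the report has already been generated and is available as a list of dictionaries with hypothetical keys `text`, `start`, `end` (seconds) and `is_pii`; how the report is produced and how identifying concepts are detected is outside the sketch. Silence is used as the non-identifying replacement audio.

```python
def deidentify(samples, sample_rate, concepts, replacement=0.0):
    """Overwrite the audio spans of personally identifying concepts.

    samples      -- list of audio samples
    sample_rate  -- samples per second
    concepts     -- report entries with 'start'/'end' timestamps and 'is_pii'
    replacement  -- sample value written over removed audio (silence by default)
    """
    out = list(samples)
    for concept in concepts:
        if not concept["is_pii"]:
            continue
        # Convert the report's timestamps into sample indices, clamped to the signal.
        start = max(0, round(concept["start"] * sample_rate))
        end = min(len(out), round(concept["end"] * sample_rate))
        for i in range(start, end):
            out[i] = replacement
    return out
```

For example, with a report entry marking 0.2 s to 0.5 s as a patient name, only that span of the signal is silenced and the rest of the dictation is left untouched.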

Type: Application

Filed: January 26, 2009

Publication date: May 21, 2009

Inventors: Michael Finke, Detlef Koll
APPARATUS AND METHOD FOR INSERTING/EXTRACTING CAPTURING RESISTANT AUDIO WATERMARK BASED ON DISCRETE WAVELET TRANSFORM, AUDIO RIGHTS PROTECTION SYSTEM USING THE SAME

Publication number: 20090125310

Abstract: An apparatus and method for embedding and extracting a capturing-resistant audio watermark based on discrete wavelet transform, and a copyright management system using the same are provided. The apparatus for embedding a wavelet based audio watermark includes: a framing unit for dividing an input audio signal into small signals with a regular length; a discrete wavelet transform unit for calculating a mean value of wavelet coefficients by transforming the small signals based on a discrete wavelet transform; and an embedding unit for changing the calculated mean value according to a watermark where a synchronization signal is inserted and inserting the watermark into the audio signal.

Type: Application

Filed: June 11, 2007

Publication date: May 14, 2009

Inventors: Seungjae Lee, Sang Kwang Lee, Jin Soo Seo, Young Ho Suh, Yong Seok Seo, Seon Hwa Lee, Won Gyum Kim, Wonyoung Yoo, Sung Hwan Lee, Hye Won Jung, Young Suk Yoon
System and a method for providing each user at multiple devices with speaker-dependent speech recognition engines via networks

Publication number: 20090125307

Abstract: A system and a method for providing each user at multiple devices with speaker-dependent speech recognition engines via networks according to the pre-stored speech sounds and characteristics of devices, by which each user can use speaker-dependent speech recognition engines in different devices without the need of repeating the same procedure of recording speech to train speech recognition engines for newly utilized devices.

Type: Application

Filed: November 9, 2007

Publication date: May 14, 2009

Inventor: Jui-Chang Wang
Rear Seat Entertainment System

Publication number: 20090119720

Abstract: The present rear seat entertainment system provides a second display and interface in the front section of a motor vehicle for control of a media player with a rear mounted first display. The second display shows still video images (or screen shots) from the media player for real time updates on the status of the first display in the rear section of the vehicle according to adjustments made by the second user interface. The entertainment system includes a portable controller with the second display incorporated therein.

Type: Application

Filed: September 7, 2006

Publication date: May 7, 2009

Inventors: Eric S Deuel, Peter W. Mokris, Steve Schultz, Lance E. Tinder, Douglas W. Klamer, Loren D. Vredevoogd, David Straight
SIGNAL PROCESSING METHOD, PROCESSING APPARATUS AND VOICE DECODER

Publication number: 20090119098

Abstract: The present invention discloses a signal processing method adapted to process a synthesized signal in packet loss concealment. The method includes the following steps: receiving a good frame following a lost frame, obtaining the ratio of the energy of the signal of the good frame to the energy of the synthesized signal corresponding to the same time as the good frame, and adjusting the synthesized signal in accordance with the energy ratio. The present invention also discloses a signal processing apparatus and a voice decoder.
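
The energy ratio adjustment can be sketched as follows. The abstract only says the synthesized signal is adjusted "in accordance with the energy ratio"; applying a single uniform gain equal to the square root of that ratio is an assumption made here for illustration, and the function names are hypothetical.

```python
import math


def energy(samples):
    """Sum of squared sample values."""
    return sum(s * s for s in samples)


def adjust_synthesized(synth, good, eps=1e-12):
    """Scale the synthesized signal so its energy matches the good frame's.

    synth -- synthesized (concealment) samples covering the good frame's time span
    good  -- decoded samples of the good frame received after the loss
    """
    n = min(len(synth), len(good))
    e_synth = energy(synth[:n])
    if e_synth <= eps:
        # Nothing to scale; avoid dividing by (near) zero.
        return list(synth)
    ratio = energy(good[:n]) / e_synth
    gain = math.sqrt(ratio)  # an amplitude gain that realizes the energy ratio
    return [gain * s for s in synth]
```

A practical decoder would likely ramp the gain smoothly over the frame rather than apply it as a step, but that refinement is not described in the abstract.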

Type: Application

Filed: November 4, 2008

Publication date: May 7, 2009

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Wuzhou ZHAN, Dongqi WANG, Yongfeng TU, Jing WANG, Qing ZHANG, Lei MIAO, Jianfeng XU, Chen HU, Yi YANG, Zhengzhong DU, Fengyan QI
Method of Encoding and Decoding an Audio Signal

Publication number: 20090119110

Abstract: An apparatus for encoding and decoding an audio signal and method thereof are disclosed, by which compatibility with a player of a general mono or stereo audio signal can be provided in coding an audio signal and by which spatial information for a multi-channel audio signal can be stored or transmitted without an auxiliary data area being present. The present invention includes extracting side information embedded in a non-recognizable component of the audio signal and decoding the audio signal using the extracted side information.

Type: Application

Filed: May 26, 2006

Publication date: May 7, 2009

Applicant: LG ELECTRONICS

Inventors: Hyen-O Oh, Hee Suk Pang, Dong Soo Kim, Jae Hyun Lim, Yang-Won Jung
SPEECH ENHANCEMENT THROUGH PARTIAL SPEECH RECONSTRUCTION

Publication number: 20090112579

Abstract: A system improves speech intelligibility by reconstructing speech segments. The system includes a low-frequency reconstruction controller programmed to select a predetermined portion of a time domain signal. The low-frequency reconstruction controller substantially blocks signals above and below the selected predetermined portion. A harmonic generator generates low-frequency harmonics in the time domain that lie within a frequency range controlled by a background noise modeler. A gain controller adjusts the low-frequency harmonics to substantially match the signal strength to the time domain original input signal.

Type: Application

Filed: May 23, 2008

Publication date: April 30, 2009

Applicant: QNX SOFTWARE SYSTEMS (WAVEMAKERS), INC.

Inventors: Xueman Li, Rajeev Nongpiur, Frank Linseisen, Phillip A. Hetherington
FREE-SPEECH COMMAND CLASSIFICATION FOR CAR NAVIGATION SYSTEM

Publication number: 20090112605

Abstract: The present invention provides a system and method for associating freeform speech commands with one or more predefined commands from a set of predefined commands. The set of predefined commands is stored, and alternate forms associated with each predefined command are retrieved from an external data source. The external data source receives the alternate forms associated with each predefined command from multiple sources so the alternate forms represent paraphrases of the predefined command. A representation including words from the predefined command and the alternate forms of the predefined command, such as a vector representation, is generated for each predefined command. A similarity value between received speech data and each representation of a predefined command is computed and the speech data is classified as the predefined command whose representation has the highest similarity value to the speech data.
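
The classification step can be illustrated with a small Python sketch. It assumes the speech has already been transcribed to text, uses a bag-of-words count vector as the representation, and uses cosine similarity as the similarity value; the abstract names a vector representation only as one example, so these choices and the function names are assumptions.

```python
import math
from collections import Counter


def bag_of_words(texts):
    """Word-count vector built from one or more phrases."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(utterance, commands):
    """Return the predefined command most similar to the utterance.

    commands maps each predefined command to its list of alternate forms.
    """
    query = bag_of_words([utterance])
    # One representation per command: its own words plus those of its paraphrases.
    representations = {cmd: bag_of_words([cmd] + alternates)
                       for cmd, alternates in commands.items()}
    return max(representations, key=lambda cmd: cosine(query, representations[cmd]))
```

With commands such as `{"navigate home": ["take me home", "drive to my house"]}`, an utterance like "please take me to my house" matches through the paraphrases even though it shares no word with the predefined command itself.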

Type: Application

Filed: October 27, 2008

Publication date: April 30, 2009

Inventor: Rakesh Gupta
VOICE ACQUISITION SYSTEM FOR A VEHICLE

Publication number: 20090106029

Abstract: A voice acquisition system for a vehicle includes an interior rearview mirror assembly. The mirror assembly may include a microphone for receiving audio signals within a cabin of the vehicle and generating an output indicative of these audio signals. The microphone may provide sound capture for a hands free cell phone system, an audio recording system and/or an emergency communication system. The system may include a control that is responsive to the output from the microphone and that distinguishes vocal signals from non-vocal signals present in the output. The microphone may provide sound capture for at least one accessory of the equipped vehicle, and the accessory may be responsive to a vocal signal captured by the microphone. The interior rearview mirror assembly may include at least one accessory, such as an antenna, a video device, a security system status indicator, a tire pressure indicator display and/or a loudspeaker.

Type: Application

Filed: December 19, 2008

Publication date: April 23, 2009

Applicant: DONNELLY CORPORATION

Inventors: Jonathan E. DeLine, Niall R. Lynam, Ralph A. Spooner, Phillip A. March
SYSTEM AND METHOD OF HANDLING PROBLEMATIC INPUT DURING CONTEXT-SENSITIVE HELP FOR MULTI-MODAL DIALOG SYSTEMS

Publication number: 20090094036

Abstract: A method of presenting a multi-modal help dialog move to a user in a multi-modal dialog system is disclosed. The method comprises presenting an audio portion of the multi-modal help dialog move that explains available ways of user inquiry and presenting a corresponding graphical action performed on a user interface associated with the audio portion. The multi-modal help dialog move is context-sensitive and uses current display information and dialog contextual information to present a multi-modal help move that is currently related to the user. A user request or a problematic dialog detection module may trigger the multi-modal help move.

Type: Application

Filed: November 7, 2008

Publication date: April 9, 2009

Applicant: AT&T Corp

Inventors: Patrick Ehlen, Helen Hastie, Michael Johnston
VOICE CONVERSION METHOD AND SYSTEM

Publication number: 20090089063

Abstract: A method, system and computer program product for voice conversion. The method includes performing speech analysis on the speech of a source speaker to achieve speech information; performing spectral conversion based on said speech information, to at least achieve a first spectrum similar to the speech of a target speaker; performing unit selection on the speech of said target speaker at least using said first spectrum as a target; replacing at least part of said first spectrum with the spectrum of the selected target speaker's speech unit; and performing speech reconstruction at least based on the replaced spectrum.

Type: Application

Filed: September 29, 2008

Publication date: April 2, 2009

Inventors: Fan Ping Meng, Yong Qin, Qin Shi, Zhi Wei Shuang
METHOD AND DEVICE FOR PERFORMING FRAME ERASURE CONCEALMENT TO HIGHER-BAND SIGNAL

Publication number: 20090076805

Abstract: The present invention discloses a method for performing frame erasure concealment on a higher-band signal, including: calculating a periodic intensity of a higher-band signal with respect to a lower-band signal; judging whether the periodic intensity of the higher-band signal is higher than or equal to a preconfigured threshold; if the periodic intensity of the higher-band signal is higher than or equal to the preconfigured threshold, using a pitch period repetition method to perform the frame erasure concealment on the higher-band signal of a current lost frame; and if the periodic intensity of the higher-band signal is lower than the preconfigured threshold, using a previous frame data repetition method to perform the frame erasure concealment on the higher-band signal of the current lost frame. The present invention further discloses a device for performing frame erasure concealment on a higher-band signal and a speech decoder. Degradation of the quality of the voice signal is thereby avoided.
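
The threshold decision can be sketched in Python as below. The abstract does not define how the periodic intensity is calculated, so this sketch assumes it is the normalized autocorrelation of the higher-band history at a pitch lag taken from the lower band; the threshold value of 0.5 and all function names are likewise assumptions.

```python
import math


def periodic_intensity(history, pitch_lag):
    """Normalized autocorrelation of the history at pitch_lag (assumed measure)."""
    current = history[pitch_lag:]
    delayed = history[:-pitch_lag]
    num = sum(a * b for a, b in zip(current, delayed))
    den = math.sqrt(sum(a * a for a in current) * sum(b * b for b in delayed))
    return num / den if den > 0 else 0.0


def conceal_higher_band(history, frame_len, pitch_lag, threshold=0.5):
    """Synthesize a lost higher-band frame; returns (samples, method_used).

    history   -- previously decoded higher-band samples (at least frame_len long)
    pitch_lag -- pitch period in samples, assumed to come from the lower band
    """
    if periodic_intensity(history, pitch_lag) >= threshold:
        # Strongly periodic: repeat the last pitch cycle.
        cycle = history[-pitch_lag:]
        return [cycle[i % pitch_lag] for i in range(frame_len)], "pitch_repetition"
    # Weakly periodic: repeat the previous frame's data.
    return list(history[-frame_len:]), "frame_repetition"
```

The two branches correspond to the pitch period repetition method and the previous frame data repetition method named in the abstract.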

Type: Application

Filed: May 29, 2008

Publication date: March 19, 2009

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Jianfeng Xu, Lei Miao, Chen Hu, Qing Zhang, Lijing Xu, Wei Li, Zhengzhong Du, Yi Yang, Fengyan Qi, Wuzhou Zhan, Dongqi Wang
Blind Watermarking of Audio Signals by Using Phase Modifications

Publication number: 20090076826

Abstract: Watermarking of audio signals intends to manipulate the audio signal in a way that the changes in the audio content cannot be recognised by the human auditory system. In order to reduce the audibility of the watermark and to improve the robustness of the watermarking the invention uses phase modification of the audio signal. In the frequency domain, the phase of the audio signal is manipulated by the phase of a reference phase sequence, followed by transform into time domain. Because a change of the audio signal phase over the whole frequency range can be audible, the phase manipulation is carried out with a maximum amount only within one or more small frequency ranges which are located in the higher frequencies and/or in noisy audio signal sections, according to psycho-acoustic principles. Preferably, the allowable amplitude of the phase changes in the remaining frequency ranges is controlled according to psycho-acoustic principles.

Type: Application

Filed: September 4, 2006

Publication date: March 19, 2009

Inventors: Walter Voessing, Peter Georg Baum
SYSTEM AND METHOD FOR PROVIDING AMR-WB DTX SYNCHRONIZATION

Publication number: 20090063165

Abstract: A system and method for providing improved adaptive multi-rate wideband (AMR-WB) discontinuous transmission (DTX) synchronization. According to various embodiments, an indication on the start of the inactive speech period is signalled to the decoder via a voice activity detection (VAD) flag a predetermined number of frames before the DTX period will start, i.e., before the SID_FIRST frame is received. When the VAD flag indicates active speech, or when the VAD flag has been set to zero less than the predetermined number of frames ago, the received NO_DATA frame can be classified with a high degree of reliability as active speech, i.e., considered as transmitter, network or terminal-initiated signalling, and can be substituted by a SPEECH_LOST frame. When the VAD flag was set to zero eight frames ago or earlier, the NO_DATA frame is classified as DTX.
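
The classification rule for a received NO_DATA frame reduces to one comparison, sketched here in Python. The eight-frame figure comes from the abstract's example; the function name and the `frames_since_vad_zero` counter are hypothetical, and how the decoder maintains that counter is not shown.

```python
PREDETERMINED_FRAMES = 8  # value used in the abstract's example


def classify_no_data(vad_flag, frames_since_vad_zero, hangover=PREDETERMINED_FRAMES):
    """Classify a received NO_DATA frame.

    vad_flag              -- most recent VAD flag (1 = active speech, 0 = inactive)
    frames_since_vad_zero -- frames elapsed since the flag was set to zero
                             (ignored while the flag is 1)
    """
    if vad_flag == 1 or frames_since_vad_zero < hangover:
        # Still within active speech: treat the frame as lost, not as silence.
        return "SPEECH_LOST"
    return "DTX"
```

For instance, a NO_DATA frame arriving three frames after the flag dropped to zero is substituted by SPEECH_LOST, while one arriving eight or more frames after is treated as part of the DTX period.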

Type: Application

Filed: August 27, 2008

Publication date: March 5, 2009

Inventors: Pasi Ojala, Ari Lakaniemi
BUZZ REDUCTION FOR LOW-COMPLEXITY FRAME ERASURE CONCEALMENT

Publication number: 20090055171

Abstract: A system is described that performs periodic waveform extrapolation based frame erasure concealment (FEC) to generate frames of an output speech signal corresponding to erased frames of an encoded bit-stream in a manner that reduces buzzy and tonal artifacts in the output speech signal. An embodiment of the invention uses a multiple of a pitch period associated with previously-decoded speech to perform periodic waveform extrapolation for consecutively-erased frames in a frame erasure beyond the first erased frame. An embodiment of the invention also attenuates the extrapolated signal after a threshold number of erased frames so as to reduce the FEC output signal to zero, wherein the threshold number of erased frames is dependent at least in part on the pitch period associated with the previously-decoded speech.

Type: Application

Filed: July 24, 2008

Publication date: February 26, 2009

Applicant: BROADCOM CORPORATION

Inventor: Robert W. Zopf
SOUND-SOURCE SEPARATION SYSTEM

Publication number: 20090043588

Abstract: A system capable of reducing the influence of sound reverberation or reflection to improve sound-source separation accuracy. An original signal X(ω,f) is separated from an observed signal Y(ω,f) according to a first model and a second model to extract an unknown signal E(ω,f). According to the first model, the original signal X(ω,f) of the current frame f is represented as a combined signal of known signals S(ω,f-m+1) (m=1 to M) that span a certain number M of current and previous frames. This enables extraction of the unknown signal E(ω,f) without changing the window length while reducing the influence of reverberation or reflection of the known signal S(ω,f) on the observed signal Y(ω,f).

Type: Application

Filed: August 7, 2008

Publication date: February 12, 2009

Applicant: HONDA MOTOR CO., LTD.

Inventors: Ryu Takeda, Kazuhiro Nakadai, Hiroshi Tsujino, Hiroshi Okuno
Apparatus and method of encoding and decoding audio signal

Publication number: 20090037186

Abstract: In one embodiment, the method includes receiving the audio signal having configuration information and multi-channels, and reading a first indicator from the configuration information. The first indicator indicates whether or not channel mapping information is included in the configuration information. The channel mapping information is read from the configuration information if the first indicator indicates that the channel mapping information is included in the configuration information. The channel mapping information indicates to which speaker in a reproduction device to map each channel in the audio signal. A second indicator is also read from the configuration information. The second indicator indicates whether or not channel rearrangement information is included in the configuration information. The channel rearrangement information is read from the configuration information if the second indicator indicates that the channel rearrangement information is included in the configuration information.

Type: Application

Filed: September 24, 2008

Publication date: February 5, 2009

Inventor: Tilman Liebchen
SYSTEM AND METHOD FOR MAPPING INTERFACE FUNCTIONALITY TO CODEC FUNCTIONALITY IN A PORTABLE AUDIO DEVICE

Publication number: 20090027355

Abstract: A portable digital audio device is capable of playing a number of different data file types, such as music data files, speech data files, video data files, and the like. Different CODECs are generally used for different data types. The system determines the data file type and selects the appropriate CODEC based on the reported data file type. In addition, the reported data file type is used to select the appropriate media interface manager and appropriate user interface. The user interface, or “skin,” is selected for compatibility with the media interface manager and selected CODEC. The appropriate controls are enabled and displayed for user operation. As new CODECs are added to the system, appropriate media interface managers and skins are also added to provide the necessary user interface compatibility.
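
The mapping from reported data file type to CODEC, media interface manager and skin can be pictured as a registry lookup. The Python sketch below is only an illustration of that idea: the `PlaybackProfile` class, the registry layout and every component name in it are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaybackProfile:
    codec: str
    interface_manager: str
    skin: str


# Hypothetical registry keyed by the reported data file type.
REGISTRY = {
    "music": PlaybackProfile("music_codec", "music_manager", "music_skin"),
    "speech": PlaybackProfile("speech_codec", "speech_manager", "speech_skin"),
}


def register(file_type, codec, interface_manager, skin, registry=REGISTRY):
    """Add a new CODEC together with its matching manager and skin."""
    registry[file_type] = PlaybackProfile(codec, interface_manager, skin)


def select_profile(file_type, registry=REGISTRY):
    """Pick the CODEC, interface manager and skin for a reported file type."""
    try:
        return registry[file_type]
    except KeyError:
        raise ValueError(f"no CODEC registered for file type {file_type!r}") from None
```

Registering the three components together keeps the skin compatible with the manager and CODEC whenever a new file type is added.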

Type: Application

Filed: September 26, 2008

Publication date: January 29, 2009

Inventors: Edward C. Miller, Mark E. Phillips
ACOUSTIC SIGNAL ENCODING DEVICE, AND ACOUSTIC SIGNAL DECODING DEVICE

Publication number: 20090030704

Abstract: An acoustic signal encoding device for down-mixing at different ratios to encode a multichannel signal with a small number of channels, and an acoustic signal decoding device for decoding the signal encoded by the acoustic signal encoding device. In these devices, weighting means (103) in the acoustic signal encoding device (100) weights input signals of two channels individually according to a down-mixing coefficient thereby to calculate the level difference of the signals of two channels weighted by a level difference calculation unit (104). A separating unit (202) in the acoustic signal decoding device (200) separates the down-mixed signals into signals of two channels with the level difference information weighted.

Type: Application

Filed: October 13, 2005

Publication date: January 29, 2009

Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.

Inventors: Yoshiaki Takagi, Naoya Tanaka
Apparatus and method of encoding and decoding audio signal

Publication number: 20090030702

Abstract: In one embodiment, the method includes receiving an audio data frame having at least one channel. The channel is subdivided into a plurality of blocks, and at least two of the blocks are capable of having different lengths. The embodiment further includes obtaining indicator information indicating whether determining a prediction order for each block is allowed, and, if the indicator information indicates that this is allowed, determining the prediction order for each block from information in the audio signal that indicates it. The channel is decoded using the prediction order.

Type: Application

Filed: September 19, 2008

Publication date: January 29, 2009

Inventor: Tilman Liebchen
COMMUNICATION SYSTEM NOISE CANCELLATION POWER SIGNAL CALCULATION TECHNIQUES

Publication number: 20090024387

Abstract: In order to enhance the quality of a communication signal derived from speech and noise, a filter divides the communication signal into a plurality of frequency band signals. A calculator generates a plurality of power band signals each having a power band value and corresponding to one of the frequency band signals. The power band values are based on estimating, over a time period, the power of one of the frequency band signals. The time period is different for different ones of the frequency band signals. The power band values are used to calculate weighting factors which are used to alter the frequency band signals that are combined to generate an improved communication signal.
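
As a hedged illustration of the per-band power estimate, the sketch below averages the squared samples of each band over a band-specific window and turns the result into a weighting factor. The window lengths, the subtraction-style weighting rule, the `floor` value, and all function names are assumptions made for the example; the abstract does not specify them. Each band is assumed to be a non-empty list of samples and each window at least 1.

```python
def band_power(samples, window):
    """Mean power of the most recent `window` samples of one frequency band."""
    recent = samples[-window:]
    return sum(x * x for x in recent) / len(recent)


def weighting_factor(power, noise_power, floor=0.1):
    """Assumed rule: attenuate bands whose power is close to the noise power."""
    if power <= 0.0:
        return floor
    return max(floor, 1.0 - noise_power / power)


def enhance(bands, windows, noise_powers):
    """Weight each band by its factor and sum the bands into one output signal."""
    weights = [
        weighting_factor(band_power(band, window), noise)
        for band, window, noise in zip(bands, windows, noise_powers)
    ]
    length = len(bands[0])
    output = [
        sum(weight * band[i] for weight, band in zip(weights, bands))
        for i in range(length)
    ]
    return output, weights
```

Giving each band its own window is the part that follows the abstract; a shorter window tracks fast changes in that band, while a longer one gives a smoother estimate.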

Type: Application

Filed: August 7, 2008

Publication date: January 22, 2009

Applicant: Tellabs Operations, Inc.

Inventors: Ravi Chandran, Bruce E. Dunne, Daniel J. Marchok
MULTIPLEXED AUDIO DATA DECODING APPARATUS AND RECEIVER APPARATUS

Publication number: 20090024400

Abstract: A multiplexed audio data decoder apparatus is provided in which integration of an audio decoder is easy, and has a high flexibility when the number of the formats to be processed is increased or when the specification is changed. In an external ROM 60 there are accumulated a plurality of decoding program codes corresponding to respective plural methods for compressing and encoding. A controller means 50 transfers the decoding program code corresponding to the method for compressing and encoding after changing thereof, from the external ROM 60 to an internal RAM 25. A DSP 22 starts decoding processing by using the decoding program code which is transmitted into the internal RAM 25.

Type: Application

Filed: September 23, 2008

Publication date: January 22, 2009

Inventors: YUKIO FUJII, SHINICHI OBATA, HIROAKI SHIRANE, EIJI YAMAMOTO
Multi-mode speech encoding system

Publication number: 20090024386

Abstract: A method comprises analyzing each frame of a plurality of frames of a speech signal to determine one or more speech parameters for the speech signal; deciding, for each frame of the plurality of frames of the speech signal, based on the one or more speech parameters of the speech signal, to select one of a plurality of encoding modes including a first encoding mode and a second encoding mode for encoding each frame of the plurality of frames of the speech signal; and encoding each frame of the plurality of frames of the speech signal according to the one of the plurality of encoding modes selected for that frame in the deciding; wherein the first encoding mode supports a first encoding rate and the second encoding mode supports a second encoding rate, and the first encoding rate is the same encoding rate as the second encoding rate.

Type: Application

Filed: August 20, 2008

Publication date: January 22, 2009

Inventors: Huan-Yu Su, Yang Gao
SPEECH RECOGNITION SYSTEM AND METHOD

Publication number: 20090024391

Abstract: According to the present invention, a method for integrating processes with a multi-faceted human centered interface is provided. The interface is facilitated to implement a hands free, voice driven environment to control processes and applications. A natural language model is used to parse voice initiated commands and data, and to route those voice initiated inputs to the required applications or processes. The use of an intelligent context based parser allows the system to intelligently determine what processes are required to complete a task which is initiated using natural language. A single window environment provides an interface which is comfortable to the user by preventing distracting windows from appearing. The single window has a plurality of facets which allow distinct viewing areas. Each facet has an independent process routing its outputs thereto. As other processes are activated, each facet can reshape itself to bring a new process into one of the viewing areas.

Type: Application

Filed: September 29, 2008

Publication date: January 22, 2009

Applicant: EASTERN INVESTMENTS, LLC

Inventors: Richard Grant, Pedro E. McGregor
SYNCHRONIZATION OF AN INPUT TEXT OF A SPEECH WITH A RECORDING OF THE SPEECH

Publication number: 20090006087

Abstract: A method and system for synchronizing words in an input text of a speech with a continuous recording of the speech. A received input text includes previously recorded content of the speech to be reproduced. A synthetic speech corresponding to the received input text is generated. Ratio data including a ratio between the respective pronunciation times of words included in the received text in the generated synthetic speech is computed. The ratio data is used to determine an association between erroneously recognized words of the received text and a time to reproduce each erroneously recognized word. The association is outputted in a recording medium and/or displayed on a display device.

Type: Application

Filed: June 25, 2008

Publication date: January 1, 2009

Inventors: Noriko Imoto, Tetsuya Uda, Takatoshi Watanabe
Receiver Intelligibility Enhancement System

Publication number: 20080312916

Abstract: The intelligibility of speech signals is improved in the many situations where a voice signal is communicated or stored. Means and methods are disclosed for developing a scheme with high voice signal intelligibility without sacrifice of voice quality. The disclosed method comprises certain steps, including, but not limited to, learning the noise on the near-end side and enhancing the far-end voice as a function of the noise level on the near-end side. The disclosed method and apparatus are especially useful to increase the intelligibility of the cell phone's loudspeaker output. The invention includes the processing of an input speech signal to generate an enhanced intelligent signal. In the frequency domain, the FFT spectrum of the speech received from the far-end is modified in accordance with the LPC spectrum of the local background noise to generate an enhanced intelligent signal. In the time domain, the speech is modified in accordance with the LPC coefficients of the noise to generate an enhanced intelligent signal.

Type: Application

Filed: June 15, 2008

Publication date: December 18, 2008

Inventors: Alon Konchitsky, Alberto D. Berstein, Hariharan Ganapathy Kathirvelu, Sandeep Kulakcherla, William Martin Ribble
IMAGING APPARATUS, VOICE PROCESSING CIRCUIT, NOISE REDUCING CIRCUIT, NOISE REDUCING METHOD, AND PROGRAM

Publication number: 20080306733

Abstract: A noise reducing circuit includes a denoising unit configured to eliminate a noise band from an input voice signal; a noise recognizing unit configured to recognize noise included in the voice signal; a denoising period generating unit configured to generate a signal indicating a denoising period in accordance with an occurrence period of the recognized noise; and a selecting unit configured to select an output of the denoising unit when the denoising period is indicated and select the voice signal when the denoising period is not indicated.
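
The period generation and selection described here can be sketched per sample as follows. This is a minimal sketch under assumptions: the denoised signal and the per-sample noise flags are taken as given inputs, and the `hold` parameter (which extends the period a few samples past the last recognized noise) is a hypothetical detail not stated in the abstract.

```python
def denoising_period(noise_flags, hold):
    """Mark samples where noise was recognized, plus `hold` samples afterwards."""
    mask = [False] * len(noise_flags)
    remaining = 0
    for i, flag in enumerate(noise_flags):
        if flag:
            remaining = hold + 1  # cover this sample and the next `hold` samples
        if remaining > 0:
            mask[i] = True
            remaining -= 1
    return mask


def select_output(voice, denoised, mask):
    """Use the denoised sample inside the period and the original one outside."""
    return [d if m else v for v, d, m in zip(voice, denoised, mask)]
```

Passing the original signal through outside the period means the denoising filter can only colour the voice while noise is actually present.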

Type: Application

Filed: March 13, 2008

Publication date: December 11, 2008

Applicant: Sony Corporation

Inventor: Kazuhiko OZAWA
METHOD AND SYSTEM FOR CREATION AND USE OF A WIDEBAND VOCODER DATABASE FOR BANDWIDTH EXTENSION OF VOICE

Publication number: 20080300866

Abstract: The invention concerns a system (300) and method (400) for bandwidth extension of voice for improving the quality of voice in a communication system. The method and system include the steps of filtering (402) a wideband voice signal to produce a first filtered signal (301) and a second filtered signal (331), vocoding (404) the first filtered signal to produce a narrowband vocoded signal (130), compensating (406) the second filtered signal for time alignment with the narrowband vocoded signal, and adding (335) the narrowband vocoded signal with the second filtered signal to produce a wideband vocoded signal (250). One or more features from the wideband vocoded signal can be extracted to create a wideband feature vector (147) for storage in a wideband vocoded speech database (220).
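
The filter, vocode, compensate, and add steps can be illustrated with a toy pipeline. Everything concrete below is an assumption chosen for the example: the two-tap averaging filter and its complement stand in for the real band-splitting filters, and `placeholder_vocoder` is a hypothetical stand-in that only delays its input, so the sketch shows the time alignment rather than any real vocoding.

```python
def split_bands(samples):
    """Split a signal into a low band (two-tap average) and its high-band complement."""
    low = []
    previous = 0.0
    for s in samples:
        low.append(0.5 * (s + previous))
        previous = s
    high = [s - l for s, l in zip(samples, low)]
    return low, high


def delay(samples, n):
    """Delay a signal by n samples, padding with zeros and keeping the length."""
    return ([0.0] * n + list(samples))[:len(samples)]


def placeholder_vocoder(samples, latency):
    """Hypothetical stand-in for a narrowband vocoder: a pure delay of `latency`."""
    return delay(samples, latency)


def wideband_vocoded(samples, latency):
    """Vocode the low band, time-align the high band, and add the two."""
    low, high = split_bands(samples)
    narrowband = placeholder_vocoder(low, latency)
    aligned_high = delay(high, latency)  # compensate for the vocoder latency
    return [a + b for a, b in zip(narrowband, aligned_high)]
```

With this stand-in vocoder the output is simply the input delayed by `latency` samples, which is a convenient check that the two paths are aligned before they are added.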

Type: Application

Filed: May 31, 2006

Publication date: December 4, 2008

Applicant: MOTOROLA, INC.

Inventors: ADEEL MUKHTAR, DEEPAK P. AHYA
SYSTEMS AND METHODS OF A STRUCTURED GRAMMAR FOR A SPEECH RECOGNITION COMMAND SYSTEM

Publication number: 20080300886

Abstract: In embodiments of the present invention, a system and method for enabling a user to interact with a computer platform using a voice command may comprise the steps of defining a structured grammar for handling a global voice command, defining a global voice command of the structured grammar wherein the global voice command enables access to an object of the computer platform using a single command, and mapping at least one function of the object to the global voice command, wherein upon receiving voice input from the user of the computer platform the object recognizes the global voice command and controls the function.

Type: Application

Filed: May 19, 2008

Publication date: December 4, 2008

Inventor: Kimberly Patch
Noise reduction device, program and method

Publication number: 20080294430

Abstract: A noise reduction device is configured by use of: means for calculating a predetermined constant, and a predetermined reference signal Rω(T) in the frequency domain, respectively by use of adaptive coefficients Wω(m), and for thereby obtaining estimated values Nω and Qω(T) respectively of stationary noise components, and non-stationary noise components corresponding to the reference signal, which are included in a predetermined observed signal Xω(T) in the frequency domain; means for applying a noise reduction process to the observed signal on the basis of each of the estimated values, and for updating each of the adaptive coefficients on the basis of a result of the process; and an adaptive learning means for repeating the obtaining of the estimated values and the updating of the adaptive coefficients, and for thereby learning each of the adaptive coefficients.

Type: Application

Filed: August 5, 2008

Publication date: November 27, 2008

Inventor: Osamu Ichikawa
SPEECH RECOGNITION MACRO RUNTIME

Publication number: 20080288259

Abstract: The disclosed speech recognition system enables users to define personalized, context-aware voice commands without extensive software development. Command sets may be defined in a user-friendly language and stored in an eXtensible Markup Language (XML) file. Each command object within the command set may include one or more user configurable actions, one or more configurable rules, and one or more configurable conditions. The command sets may be managed by a command set loader that loads and processes each command set into computer executable code. The command set loader may enable and disable command sets. A macro processing component may provide a speech recognition grammar to an API of the speech recognition engine based on currently enabled commands. When the speech recognition engine recognizes user speech consistent with the grammar, the macro processing component may initiate the one or more computer executable actions.

Type: Application

Filed: March 18, 2008

Publication date: November 20, 2008

Applicant: Microsoft Corporation

Inventors: Robert L. Chambers, Brian King
SPEECH PROCESSING METHOD AND APPARATUS, STORAGE MEDIUM, AND SPEECH SYSTEM

Publication number: 20080281588

Abstract: A speech processing apparatus includes a spectrum envelope extracting unit which extracts the spectrum envelope of an input speech signal, a spectrum envelope deforming unit which applies deformation to the spectrum envelope to generate a deformed spectrum envelope, a spectrum fine structure extracting unit which extracts the spectrum fine structure of the input speech signal, a deformed spectrum generating unit which generates a deformed spectrum by combining the deformed spectrum envelope with the spectrum fine structure, and a speech generating unit which generates an output speech signal on the basis of the deformed spectrum. This apparatus emits a disrupting sound based on the output speech signal to prevent a third party from eavesdropping on a conversation.

Type: Application

Filed: August 31, 2007

Publication date: November 13, 2008

Applicants: JAPAN ADVANCED INSTITUTE OF SCIENCE AND TECHNOLOGY, GLORY LTD.

Inventors: Masato Akagi, Rieko Futonagane, Yoshihiro Irie, Hisakazu Yanagiuchi, Yoshitane Tanaka
Noise Suppression Device and Noise Suppression Method

Publication number: 20080281589

Abstract: There is disclosed a noise suppression device capable of improving the noise suppression accuracy while reducing the audio distortion. In this device, a suppression unit suppresses a noise component from the audio power spectrum by using the detection result of the audio-existing band and the noise band in the audio power spectrum including the noise component. A pitch harmonic structure extracting unit (105) extracts a pitch harmonic power spectrum from the audio power spectrum. An audio-existence judgment unit (106) judges whether the audio power spectrum has audio existence according to the extracted pitch harmonic power spectrum. A pitch harmonic structure repair unit (108) repairs the extracted pitch harmonic power spectrum.

Type: Application

Filed: May 30, 2005

Publication date: November 13, 2008

Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.

Inventors: Youhua Wang, Takuya Kawashima, Koji Yoshida
Scale Searching for Watermark Detection

Publication number: 20080275710

Abstract: The present invention relates to a method, device (12) and computer program product for enabling detection of additional data embedded in a media signal that may have been subjected to scaling. The invention also relates to an additional data detecting device (10) comprising such a device for enabling detection. An envelope discriminating unit (ED) provides a first extracted narrow band envelope signal sample (we[n]) from an input media signal sample (yb[n]), and a variable scale down sampling unit (VSDS) down samples the narrow band envelope signal sample using a down sampling rate that is dependent on a scaling factor variable value (?) for providing at least one sample of a first additional data estimate (wn[k]) in order to allow the detection of additional data in said signal sample.

Type: Application

Filed: June 29, 2005

Publication date: November 6, 2008

Applicant: KONINKLIJKE PHILIPS ELECTRONICS, N.V.

Inventors: Aweke Negash Lemma, Leon Maria Van De Kerkhof, Javier Francisco Aprea
Apparatus for Vocal-Cord Signal Recognition and Method Thereof

Publication number: 20080270126

Abstract: Provided are a vocal-cord signal recognition apparatus and a method thereof. The vocal-cord signal recognition apparatus includes a vocal-cord signal extracting unit for analyzing a feature of a vocal-cord signal inputted through a throat microphone, and extracting a vocal-cord feature vector from the vocal-cord signal using the analysis data; and a vocal-cord signal recognition unit for recognizing the vocal-cord signal by extracting the feature of the vocal-cord signal using the vocal-cord signal feature vector extracted at the vocal-cord signal extracting unit.

Type: Application

Filed: October 19, 2006

Publication date: October 30, 2008

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventors: Young-Giu Jung, Mun-Sung Han, Kwan-Hyun Cho, Jun-Seok Park
METHOD AND APPARATUS FOR ENCODING AND DECODING AUDIO/SPEECH SIGNAL

Publication number: 20080270124

Abstract: Provided is a method of encoding an audio/speech signal, the method including determining a variable length of a frame, that is, a processing unit of an input signal in accordance with a position of an attack in the input signal; transforming each frame of the input signal to a frequency domain and dividing the frame into a plurality of sub frequency bands; and, if a signal of a sub frequency band is determined to be encoded in the frequency domain, encoding the signal of the sub frequency band in the frequency domain, and if the signal of the sub frequency band is determined to be encoded in a time domain, inverse transforming the signal of the sub frequency band to the time domain and encoding the inverse transformed signal in the time domain. According to the present invention, the audio/speech signal may be efficiently encoded by controlling time resolution and frequency resolution.

Type: Application

Filed: October 15, 2007

Publication date: October 30, 2008

Applicant: Samsung Electronics Co., Ltd

Inventors: Chang-yong SON, Eun-mi Oh, Jung-hoe Kim, Ho-sang Sung, Kang-eun Lee, Ki-hyun Choo
Method for Encoding and Decoding Multi-Channel Audio Signal and Apparatus Thereof

Publication number: 20080262854

Abstract: Methods and apparatuses for encoding and decoding a multi-channel audio signal are provided. In the encoding method, spatial information that is calculated based on a multi-channel audio signal and a downmix signal is encoded, and additional configuration information is generated based on information that is selected from the encoded spatial information. The downmix signal is encoded, and then, a bitstream is generated by combining the encoded downmix signal with the encoded spatial information. Thereafter, the additional configuration information is inserted into the bitstream. Therefore, it is possible to configure an optimum bitstream according to the circumstances by retransmitting all or part of information included in a header.

Type: Application

Filed: October 20, 2006

Publication date: October 23, 2008

Applicant: LG ELECTRONICS, INC.

Inventors: Yang-Won Jung, Hee Suk Pang, Hyen-O Oh, Dong Soo Kim, Jae Hyun Lim
Audio Encoding and Decoding

Publication number: 20080255856

Abstract: An audio encoder (109) has a hierarchical encoding structure and generates a data stream comprising one or more audio channels as well as parametric audio encoding data. The encoder (109) comprises an encoding structure processor (305) which inserts decoder tree structure data into the data stream. The decoder tree structure data comprises at least one data value indicative of a channel split characteristic for an audio channel at a hierarchical layer of the hierarchical decoder structure and may specifically specify the decoder tree structures to be applied by a decoder. A decoder (115) comprises a receiver (401) which receives the data stream and a decoder structure processor (405) for generating the hierarchical decoder structure in response to the decoder tree structure data. A decode processor (403) then generates output audio channels from the data stream using the hierarchical decoder structure.

Type: Application

Filed: July 7, 2006

Publication date: October 16, 2008

Applicant: KONINKLIJKE PHILIPS ELECTRONICS N.V.

Inventors: Erik Gosuinus Petrus Schuijers, Gerard Herman Hotho, Heiko Purnhagen, Wolfgang Schildbach, Holger Horich, Hans Magnus Kristofer Kjorling, Karl Jonas Roden
