Speech Or Audio Signal Analysis-synthesis Techniques For Redundancy Reduction, E.g., In Vocoders, Etc.; Coding Or Decoding Of Speech Or Audio Signals; Compression Or Expansion Of Speech Or Audio Signals, E.g., Source-filter Models, Psychoacoustic Analysis, Etc. (epo) Patents (Class 704/E19.001)

  • Publication number: 20120109644
    Abstract: There is a need to enable decompression of a speech signal even if no network synchronizing signal is output from a baseband processing portion. For this purpose, an information processing device includes a first serial interface. The first serial interface includes a notification signal generation circuit that generates a notification signal each time compressed data incorporated from the baseband processing portion reaches a predetermined data quantity, and notifies a speech processing portion of this state using the notification signal. The speech processing portion includes a synchronizing signal generation circuit that generates a network synchronizing signal based on the notification signal. A clock signal for PCM communication is generated based on the network synchronizing signal. A speech signal can be decompressed even if no network synchronizing signal is output from the baseband processing portion.
    Type: Application
    Filed: October 31, 2011
    Publication date: May 3, 2012
    Inventors: Yutaka Uchimura, Takahiro Irita, Jiro Hara
  • Publication number: 20120109646
    Abstract: A speaker adaptation method and apparatus are provided including extracting adapted data from speech recognition data stored in a database, and modifying a sound model by using a speaker adaptation method selected based on a type of the extracted adapted data.
    Type: Application
    Filed: September 2, 2011
    Publication date: May 3, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Eun-Sang BAK
  • Publication number: 20120101814
    Abstract: Various techniques are disclosed for improving packet loss concealment to reduce artifacts by using audio character measures of the audio signal. These techniques include attenuation to a noise fill instead of attenuation to silence, varying how long to wait before attenuating the extrapolation, varying the rate of attenuation of the extrapolation, attenuating periodic extrapolation at a different rate than non-periodic extrapolation, performing periodic extrapolation on successively longer fill data based on the audio character measures, adjusting weighting between periodic and non-periodic extrapolation based on the audio character measures, and adjusting weighting between periodic extrapolation and non-periodic extrapolation non-linearly.
    Type: Application
    Filed: October 25, 2010
    Publication date: April 26, 2012
    Applicant: POLYCOM, INC.
    Inventor: Eric David Elias
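
The first technique named in the abstract above, attenuating toward a noise fill rather than toward silence, can be illustrated with a minimal sketch. The function name `conceal_frame`, the linear fade, the uniform noise source, and the parameters `hold` and `fade` are all assumptions made for illustration; the abstract does not specify them.

```python
import random


def conceal_frame(extrapolated, noise_level, hold, fade, seed=0):
    """Cross-fade an extrapolated signal toward low-level noise.

    extrapolated: samples produced by some extrapolation of past audio
    noise_level:  peak amplitude of the noise fill (0 gives plain silence)
    hold:         number of samples to wait before attenuating
    fade:         number of samples over which the gain falls from 1 to 0
    All names and the linear fade shape are hypothetical.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    out = []
    for n, x in enumerate(extrapolated):
        if n < hold:
            gain = 1.0  # keep the extrapolation untouched during the hold
        else:
            gain = max(0.0, 1.0 - (n - hold + 1) / fade)
        noise = rng.uniform(-noise_level, noise_level)
        # As the extrapolation fades out, the noise fill fades in.
        out.append(gain * x + (1.0 - gain) * noise)
    return out


frame = conceal_frame([1.0] * 10, noise_level=0.1, hold=3, fade=4)
```

Setting `noise_level` to zero reduces this sketch to conventional attenuation to silence, which makes the difference between the two behaviours easy to compare.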
  • Publication number: 20120101813
    Abstract: A mixed time-domain/frequency-domain coding device and method for coding an input sound signal, wherein a time-domain excitation contribution is calculated in response to the input sound signal. A cut-off frequency for the time-domain excitation contribution is also calculated in response to the input sound signal, and a frequency extent of the time-domain excitation contribution is adjusted in relation to this cut-off frequency. Following calculation of a frequency-domain excitation contribution in response to the input sound signal, the adjusted time-domain excitation contribution and the frequency-domain excitation contribution are added to form a mixed time-domain/frequency-domain excitation constituting a coded version of the input sound signal. In the calculation of the time-domain excitation contribution, the input sound signal may be processed in successive frames of the input sound signal and a number of sub-frames to be used in a current frame may be calculated.
    Type: Application
    Filed: October 25, 2011
    Publication date: April 26, 2012
    Applicant: VOICEAGE CORPORATION
    Inventors: Tommy Vaillancourt, Milan Jelinek
  • Publication number: 20120101827
    Abstract: Methods and apparatus to extract data encoded in media content are disclosed. An example method includes sampling a media content signal to generate digital samples, determining a frequency domain representation of the digital samples, determining a first rank of a first frequency in the frequency domain representation, determining a second rank of a second frequency in the frequency domain representation, combining the first rank and the second rank with a set of ranks to create a combined set of ranks, comparing the combined set of ranks to a set of reference sequences, determining a data represented by the combined set of ranks based on the comparison, and storing the data in a memory device.
    Type: Application
    Filed: December 30, 2011
    Publication date: April 26, 2012
    Inventors: Alexander Pavlovich Topchy, Venugopal Srinivasan
  • Publication number: 20120102538
    Abstract: An embodiment of the present invention discloses a system and method for decoding multiple independent encoded audio streams using a single decoder. The system includes one or more parsers, a preprocessor, an audio decoder, and a renderer. The parser extracts individual audio frames from each input audio stream. The preprocessor combines the outputs of all parsers into a single audio frame stream and enables sharing of the audio decoder among multiple independent encoded audio streams. The audio decoder decodes the single audio frame stream and provides a single decoded audio stream. And the renderer renders the individual reconstructed audio streams from the single decoded audio stream.
    Type: Application
    Filed: October 22, 2010
    Publication date: April 26, 2012
    Applicants: STMICROELECTRONICS (GRENOBLE) SAS, STMICROELECTRONICS PVT. LTD
    Inventors: Rahul Bansal, Philippe Monnier, Shiv Kumar Singh, Kausik Maiti, Nitin Jain
  • Publication number: 20120101824
    Abstract: Systems and methods for enhancing the quality of an audio signal produced by an audio codec are described herein. In accordance with the systems and methods, a pitch-based pre-filter adaptively filters an input audio signal to produce a filtered audio signal. An audio encoder encodes the filtered audio signal to generate a compressed audio bit stream. An audio decoder decodes the compressed audio bit stream to generate a decoded audio signal. A pitch-based post-filter adaptively filters the decoded audio signal to produce an output audio signal, wherein adaptively filtering the decoded audio signal comprises undoing at least part of a signal-shaping effect of the pitch-based pre-filter.
    Type: Application
    Filed: June 29, 2011
    Publication date: April 26, 2012
    Applicant: BROADCOM CORPORATION
    Inventor: Juin-Hwey Chen
  • Publication number: 20120101825
    Abstract: Method and apparatus for encoding/decoding audio data with scalability are provided. The method includes slicing audio data to correspond to a plurality of layers, obtaining scale band information and coding band information corresponding to each of the plurality of layers, coding additional information containing scale factor information and coding model information based on scale band information and coding band information corresponding to a first layer, obtaining quantized samples by quantizing audio data corresponding to the first layer with reference to the scale factor information, coding the obtained plurality of quantized samples in units of symbols in order from respective symbols formed with most significant bits (MSB) to least significant bits (LSB) based on the coding model information, and repeating, until coding for the plurality of layers is finished. Accordingly, fine grain scalability can be provided with lower complexity and better audio quality even in a lower layer.
    Type: Application
    Filed: October 24, 2011
    Publication date: April 26, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jung-hoe KIM, Sang-wook Kim, Eun-mi Oh
  • Publication number: 20120095769
    Abstract: Embodiments of the present invention disclose an audio decoding method, including: determining that bitstreams to be decoded are monophony coding layer and first stereo enhancement layer bitstreams; decoding the monophony coding layer to obtain a monophony decoded frequency-domain signal; reconstructing left and right channel frequency-domain signals in a first sub-band region by utilizing the monophony decoded frequency-domain signal after an energy adjustment; and reconstructing left and right channel frequency-domain signals in a second sub-band region by utilizing the monophony decoded frequency-domain signal without the energy adjustment.
    Type: Application
    Filed: November 14, 2011
    Publication date: April 19, 2012
    Applicant: Huawei Technologies Co., Ltd.
    Inventors: Qi Zhang, Libin Zhang
  • Publication number: 20120095760
    Abstract: Various embodiments of the invention provide a scalable and distributed input signal coding activity detection and coding thereof (e.g. VAD/DTX) processing framework. An apparatus comprising an encoder is shown. The apparatus can be a terminal, for example a mobile phone, computer or the like. The apparatus may act as transmitter etc. The apparatus is coupled with a network element (alternatively referred to as an intermediate element or the like). The network element is coupled with apparatuses. The apparatuses can also be terminal devices such as a mobile phone, computer or the like. The apparatuses may act as receivers etc. The apparatus comprises a detector configured to detect whether an input signal is an active input signal or a non-active input signal. There are various different detectors such as the VAD or SAD referred to above.
    Type: Application
    Filed: December 19, 2008
    Publication date: April 19, 2012
    Inventor: Pasi S. Ojala
  • Publication number: 20120095756
    Abstract: Proposed is a method and apparatus for determining a weighting function for quantizing a linear predictive coding (LPC) coefficient and having a low complexity. The weighting function determination apparatus may convert an LPC coefficient of a mid-subframe of an input signal to one of an immittance spectral frequency (ISF) coefficient and a line spectral frequency (LSF) coefficient, and may determine a weighting function associated with an importance of the ISF coefficient or the LSF coefficient based on the converted ISF coefficient or LSF coefficient.
    Type: Application
    Filed: May 26, 2011
    Publication date: April 19, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ho Sang Sung, Eun Mi Oh
  • Publication number: 20120095749
    Abstract: Audiovisual presentation methods, systems and apparatus for improving and enhancing the listening experience of attendees of audiovisual presentations. An exemplary audiovisual presentation system includes an audio processing and distribution unit (APDU) configured to generate and broadcast a wireless audio service containing audio of an audiovisual presentation (e.g., soundtrack and dialogue audio of a movie, in the case of a movie presentation) throughout an audiovisual presentation room or space (e.g., a movie theater, in the case of a movie presentation). The wireless audio service is received by mobile receiving devices (MRDs) having or comprising headsets, headphones or earbuds, through which MRD users listen to the audio of the audiovisual presentation provided by the wireless audio service while viewing images of the audiovisual presentation.
    Type: Application
    Filed: October 13, 2011
    Publication date: April 19, 2012
    Inventor: Antonio Capretta
  • Publication number: 20120087416
    Abstract: Presented herein are system(s), method(s), and apparatus for rapid switching between streams of data. In one embodiment, there is described a circuit for providing media. The circuit comprises a multiplexed stream processor, a queue, and a decoder. The multiplexed stream processor receives a multiplexed stream and filters at least one elementary stream. The queue queues the at least one elementary stream. The decoder decodes the at least one elementary stream. The multiplexed stream processor filters at least another elementary stream instead of the at least one elementary stream after issuance of a command to switch from the at least one elementary stream to the at least another elementary stream. The queue stores a portion of the at least one elementary stream after issuance of the command, said portion of the at least one elementary stream being written into the queue before issuance of the command.
    Type: Application
    Filed: April 5, 2011
    Publication date: April 12, 2012
    Inventor: Tim Ross
  • Publication number: 20120084089
    Abstract: The present disclosure includes processing a signal to generate a first sub-set of data, transmitting the first sub-set of data for generation of a reconstructed audio signal, the reconstructed audio signal having a fidelity relative to the signal, processing the signal to generate a second sub-set of data and a third sub-set of data, the second sub-set of data defining a second portion of the signal and comprising data that is different than data of the first sub-set of data, and the third sub-set of data defining a third portion of the signal and comprising data that is different than data of the first and second sub-sets of data, comparing a priority of the second sub-set of data to a priority of the third sub-set of data, and transmitting one of the second sub-set of data and the third sub-set of data over the network for improving the fidelity.
    Type: Application
    Filed: September 30, 2011
    Publication date: April 5, 2012
    Applicant: GOOGLE INC.
    Inventors: Matthew I. Lloyd, Martin Jansche
  • Publication number: 20120082319
    Abstract: A method and apparatus processes multi-channel audio by encoding, transmitting or recording “dry” audio tracks or “stems” in synchronous relationship with time-variable metadata controlled by a content producer and representing a desired degree and quality of diffusion. Audio tracks are compressed and transmitted in connection with synchronized metadata representing diffusion and preferably also mix and delay parameters. The separation of audio stems from diffusion metadata facilitates the customization of playback at the receiver, taking into account the characteristics of local playback environment.
    Type: Application
    Filed: September 8, 2011
    Publication date: April 5, 2012
    Inventors: Jean-Marc Jot, Stephen Roger Hastings, James D. Johnston
  • Publication number: 20120084078
    Abstract: A scalable voice signature authentication capability is provided herein. The scalable voice signature authentication capability enables authentication of varied services such as speaker identification (e.g. private banking and access to healthcare account records), voice signature as a password (e.g. secure access for remote services and document retrieval) and the Internet and its various services (e.g.
    Type: Application
    Filed: September 30, 2010
    Publication date: April 5, 2012
    Applicant: Alcatel-Lucent USA Inc.
    Inventors: Madhav Moganti, Anish Sankalia
  • Publication number: 20120078642
    Abstract: A method of encoding a multi-object audio signal and an encoding apparatus, a decoding method and a decoding apparatus, and a transcoding method and a transcoder are provided. A multi-object audio signal encoding apparatus may encode object signals obtained by excluding ForeGround Objects (FGOs) from a plurality of input object signals, and may encode the FGOs, thereby providing a listener with a satisfactory sound quality.
    Type: Application
    Filed: June 10, 2010
    Publication date: March 29, 2012
    Inventors: Jeong Il Seo, Kyeong Ok Kang
  • Publication number: 20120078640
    Abstract: An audio encoding device includes a time-frequency transformer that transforms signals of channels, a first spatial-information determiner that generates a frequency signal of a third channel, a second spatial-information determiner that generates a frequency signal of the third channel, a similarity calculator that calculates a similarity between the frequency signal of the at least one first channel and the frequency signal of the at least one second channel, a phase-difference calculator that calculates a phase difference between the frequency signal of the at least one first channel and the signal of the at least one second channel, a controller that controls determination of the first spatial information when the similarity and the phase difference satisfy a predetermined determination condition, a channel-signal encoder that encodes the frequency signal of the third channel, and a spatial-information encoder that encodes the first spatial information or the second spatial information.
    Type: Application
    Filed: July 6, 2011
    Publication date: March 29, 2012
    Applicant: Fujitsu Limited
    Inventors: Miyuki SHIRAKAWA, Yohei Kishi, Masanao Suzuki, Yoshiteru Tsuchinaga
  • Publication number: 20120072224
    Abstract: The present invention relates to a method of text-based speech synthesis, wherein at least one portion of a text is specified; the intonation of each portion is determined; target speech sounds are associated with each portion; physical parameters of the target speech sounds are determined; speech sounds most similar in terms of the physical parameters to the target speech sounds are found in a speech database; and speech is synthesized as a sequence of the found speech sounds. The physical parameters of said target speech sounds are determined in accordance with the determined intonation. The present method, when used in a speech synthesizer, allows improved quality of synthesized speech due to precise reproduction of intonation.
    Type: Application
    Filed: November 23, 2011
    Publication date: March 22, 2012
    Inventor: Mikhail Vasilievich KHITROV
  • Publication number: 20120069899
    Abstract: An encoder performs context-adaptive arithmetic encoding of transform coefficient data. For example, an encoder switches between coding of direct levels of quantized transform coefficient data and run-level coding of run lengths and levels of quantized transform coefficient data. The encoder can determine when to switch between coding modes based on a pre-determined switch point or by counting consecutive coefficients having a predominant value (e.g., zero). A decoder performs corresponding context-adaptive arithmetic decoding.
    Type: Application
    Filed: November 29, 2011
    Publication date: March 22, 2012
    Applicant: Microsoft Corporation
    Inventors: Sanjeev Mehrotra, Wei-Ge Chen
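
The mode switch described in the abstract above, from coding direct levels to run-level coding once enough consecutive zero coefficients have been counted, can be sketched as follows. This is an illustrative sketch only: it produces symbols and omits the context-adaptive arithmetic coder entirely, and the symbol tuples, the function names, and the `zero_run_threshold` parameter are assumptions rather than anything stated in the source.

```python
def encode_symbols(coeffs, zero_run_threshold=3):
    """Emit ("level", v) symbols, then ("run", zeros, v) symbols after the switch.

    The switch happens once `zero_run_threshold` consecutive zeros have been
    coded in direct mode (a hypothetical rule based on counting zeros).
    """
    symbols = []
    mode = "direct"
    zeros_seen = 0
    i = 0
    n = len(coeffs)
    while i < n:
        if mode == "direct":
            value = coeffs[i]
            symbols.append(("level", value))
            zeros_seen = zeros_seen + 1 if value == 0 else 0
            if zeros_seen >= zero_run_threshold:
                mode = "run"
            i += 1
        else:
            run = 0
            while i < n and coeffs[i] == 0:
                run += 1
                i += 1
            if i < n:
                symbols.append(("run", run, coeffs[i]))
                i += 1
            else:
                # Trailing zeros with no following level.
                symbols.append(("run", run, None))
    return symbols


def decode_symbols(symbols):
    """Invert encode_symbols; no mode tracking is needed for these symbols."""
    out = []
    for symbol in symbols:
        if symbol[0] == "level":
            out.append(symbol[1])
        else:
            out.extend([0] * symbol[1])
            if symbol[2] is not None:
                out.append(symbol[2])
    return out


coeffs = [5, 3, 0, 0, 0, 0, 0, 2, 0, 1]
symbols = encode_symbols(coeffs)
```

In a real codec the decoder would track the same zero count to know which mode each symbol belongs to; the tagged tuples here sidestep that so the sketch stays short.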
  • Publication number: 20120072207
    Abstract: Provided are a down-mixing method and an encoder, wherein a high quantization performance can be realized when a balance adjustment operation due to a balance weight coefficient and a removal operation of a main component are combined. In the encoder (100), a down-mixing unit (101) generates a mono signal by multiplying an L-signal and an R-signal by coefficients α and β, respectively, and summing the products. A first encoding target signal, corresponding to the L-signal, is generated by multiplying the mono signal by a balance weight coefficient wL and subtracting the same from the L-signal, using a multiplier (107) and an adder (109). A second encoding target signal, corresponding to the R-signal, is generated by multiplying the mono signal by a balance weight coefficient wR and subtracting the same from the R-signal, using a multiplier (108) and an adder (110).
    Type: Application
    Filed: June 1, 2010
    Publication date: March 22, 2012
    Applicant: PANASONIC CORPORATION
    Inventor: Toshiyuki Morii
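
The arithmetic in the abstract above can be written out as a short sketch: the mono signal is a weighted sum of the two channels, and each encoding target is the channel minus the balance-weighted mono signal. The function name and the default coefficient values are assumptions for illustration; the abstract does not say how the two down-mix coefficients or the balance weights wL and wR are chosen.

```python
def downmix_targets(left, right, alpha=0.5, beta=0.5, w_left=1.0, w_right=1.0):
    """Return (mono, target_left, target_right) for per-sample lists.

    mono[n]         = alpha * left[n] + beta * right[n]
    target_left[n]  = left[n]  - w_left  * mono[n]
    target_right[n] = right[n] - w_right * mono[n]
    Default values are placeholders, not values taken from the source.
    """
    mono = [alpha * l + beta * r for l, r in zip(left, right)]
    target_left = [l - w_left * m for l, m in zip(left, mono)]
    target_right = [r - w_right * m for r, m in zip(right, mono)]
    return mono, target_left, target_right


mono, target_left, target_right = downmix_targets([1.0, 2.0], [3.0, -2.0])
```

With equal down-mix coefficients and unit balance weights, the two targets are mirror images of each other, which is the familiar mid/side relationship; other weights shift how much of the main component is removed from each channel.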
  • Publication number: 20120065983
    Abstract: The present document relates to audio coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR), and to digital effect processors, e.g. so-called exciters, where generation of harmonic distortion adds brightness to the processed signal.
    Type: Application
    Filed: May 25, 2010
    Publication date: March 15, 2012
    Applicant: DOLBY INTERNATIONAL AB
    Inventors: Per Ekstrand, Lars Villemoes, Per Hedelin
  • Publication number: 20120065964
    Abstract: A technique for introducing information into a data stream first obtains the spectral values of the short-term spectrum of the audio signal. Separately, information to be introduced is combined with a spread sequence to obtain a spread information signal, whereupon a spectral representation of the spread information is generated, then weighted with an established psychoacoustic maskable noise energy to generate a weighted information signal, wherein energy of the introduced information is substantially equal to or below the psychoacoustic masking threshold. The weighted information signal and the spectral values of the short-term spectrum of the audio signal are then summed and afterwards processed again to obtain a processed data stream including audio information and information to be introduced.
    Type: Application
    Filed: November 21, 2011
    Publication date: March 15, 2012
    Inventors: Christian NEUBAUER, Juergen HERRE, Karlheinz BRANDENBURG, Eric ALLAMANCHE
  • Publication number: 20120057026
    Abstract: A monitoring system includes a monitoring apparatus, a television, and a number of cameras. The monitoring apparatus includes an image analog-to-digital (A/D) converting unit to receive images from the cameras and convert the received analog images to digital images. A microcontroller receives the digital images from the image A/D converting unit, and compresses the received digital images and stores the compressed digital images in the storage unit. An image digital-to-analog (D/A) converting unit receives the digital images from the microcontroller and converts the received digital images to analog images. When the microcontroller receives the control signal from the television, the microcontroller outputs the received digital images to the image D/A converting unit, to convert the received digital images to analog images and output the analog images to the television. The microcontroller compresses the received digital images and stores the compressed digital images in a storage unit.
    Type: Application
    Filed: September 24, 2010
    Publication date: March 8, 2012
    Applicant: HON HAI PRECISION INDUSTRY CO., LTD.
    Inventor: MING-YUAN HSU
  • Publication number: 20120059659
    Abstract: Provided are, among other things, systems, methods and techniques for merging entropy codebook application ranges within an audio signal. According to one embodiment, an audio signal is obtained, the audio signal including quantization indexes, identification of segments of said quantization indexes, and indexes of entropy codebooks that have been assigned to such segments, with a single entropy codebook index having been assigned to each said segment; potential merging operations in which specified segments potentially would be merged with each other are identified; bit penalties are estimated for the potential merging operations; then, the potential merging operation having the lowest estimated bit penalty is performed.
    Type: Application
    Filed: August 23, 2011
    Publication date: March 8, 2012
    Inventor: Yuli You
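
One step of the merging procedure in the abstract above can be sketched under a deliberately simple cost model. The assumptions here are not from the source: codebooks are ordered so that a higher index covers a larger value range, each codebook costs a fixed number of bits per quantization index, a merged segment takes the larger of the two codebook indexes, and only adjacent segments are candidates for merging.

```python
def merge_cheapest(segments, bits_per_index):
    """Merge the adjacent pair of segments with the lowest estimated bit penalty.

    segments:       list of (length, codebook_index) tuples, in order
    bits_per_index: bits_per_index[k] is the assumed cost of one quantization
                    index under codebook k (hypothetical flat cost model)
    Returns (new_segments, penalty); segments are returned unchanged with a
    penalty of 0 when fewer than two segments exist.
    """
    best = None
    for i in range(len(segments) - 1):
        (len_a, book_a), (len_b, book_b) = segments[i], segments[i + 1]
        merged_book = max(book_a, book_b)
        # Extra bits paid because one segment now uses a costlier codebook.
        penalty = (len_a * (bits_per_index[merged_book] - bits_per_index[book_a])
                   + len_b * (bits_per_index[merged_book] - bits_per_index[book_b]))
        if best is None or penalty < best[0]:
            best = (penalty, i, merged_book)
    if best is None:
        return list(segments), 0
    penalty, i, merged_book = best
    merged_length = segments[i][0] + segments[i + 1][0]
    merged = segments[:i] + [(merged_length, merged_book)] + segments[i + 2:]
    return merged, penalty


segments, penalty = merge_cheapest([(4, 0), (2, 1), (8, 3)], [1, 2, 3, 5])
```

Calling the function repeatedly until a target segment count is reached gives a greedy merge; whether the source intends that stopping rule is not stated in the abstract.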
  • Publication number: 20120057715
    Abstract: A method and apparatus processes multi-channel audio by encoding, transmitting or recording “dry” audio tracks or “stems” in synchronous relationship with time-variable metadata controlled by a content producer and representing a desired degree and quality of diffusion. Audio tracks are compressed and transmitted in connection with synchronized metadata representing diffusion and preferably also mix and delay parameters. The separation of audio stems from diffusion metadata facilitates the customization of playback at the receiver, taking into account the characteristics of local playback environment.
    Type: Application
    Filed: February 7, 2011
    Publication date: March 8, 2012
    Inventors: James D. Johnston, Stephen Roger Hastings, Jean-Marc Jot
  • Publication number: 20120053931
    Abstract: A non-acoustic sensor is used to measure a user's speech, and an obscuring acoustic signal is then broadcast, diminishing the user's vocal acoustic output intensity and/or distorting the voice sounds, making them unintelligible to persons nearby.
    Type: Application
    Filed: February 1, 2011
    Publication date: March 1, 2012
    Applicant: Lawrence Livermore National Security, LLC
    Inventor: John F. Holzrichter
  • Publication number: 20120053948
    Abstract: The invention relates to compressing sparse data sets containing sequences of data values and position information therefor. The position information may be in the form of position indices defining active positions of the data values in a sparse vector of length N. The position information is encoded into the data values by adjusting one or more of the data values within a pre-defined tolerance range, so that a pre-defined mapping function of the data values and their positions is close to a target value. In one embodiment, the mapping function is defined using a sub-set of N filler values whose elements are used to fill empty positions in the input sparse data vector. At the decoder, the correct data positions are identified by searching through possible sub-sets of filler values.
    Type: Application
    Filed: August 24, 2011
    Publication date: March 1, 2012
    Inventors: Frederic Mustiere, Hossein Najaf-Zadeh, Ramin Pishehvar, Hassan Lahdili, Louis Thibault, Martin Bouchard
  • Publication number: 20120053949
    Abstract: There is provided a coding technique capable of reducing the amount of computation in coding while maintaining the efficiency of the coding. The technique uses an input signal and one of a decoded signal decoded from a first code obtained by encoding the input signal and a decoded signal obtained during generation of the first code. A gain group set includes one or more gain groups including different numbers of values corresponding to gains. A gain group is allocated to each sample by using a predetermined method. The sample is multiplied by a gain identified by a value corresponding to each gain in the allocated gain group and a gain code indicating a gain that results in the smallest difference between the product and the input signal is output.
    Type: Application
    Filed: May 28, 2010
    Publication date: March 1, 2012
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORP.
    Inventors: Shigeaki Sasaki, Kimitaka Tsutsumi, Masahiro Fukui, Yusuke Hiwasaki
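
The gain search in the abstract above, multiplying a decoded sample by each gain in its allocated gain group and outputting the code of the gain that leaves the smallest difference from the input, can be sketched as below. The function name and the example gain values are assumptions, and the "predetermined method" that allocates a gain group to each sample is left outside the sketch because the abstract does not describe it.

```python
def best_gain_code(input_sample, decoded_sample, gain_group):
    """Return the index (gain code) of the gain minimizing |input - gain * decoded|.

    gain_group is the list of candidate gains already allocated to this sample.
    """
    return min(
        range(len(gain_group)),
        key=lambda code: abs(input_sample - gain_group[code] * decoded_sample),
    )


# Hypothetical three-entry gain group.
code = best_gain_code(0.9, 1.0, [0.5, 1.0, 1.5])
```

Because groups can hold different numbers of gains, a sample given a small group costs fewer comparisons and fewer code bits, which appears to be where the reduction in computation comes from.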
  • Publication number: 20120053950
    Abstract: Disclosed are an encoding device, a decoding device, and methods therein which eliminate at an early stage the loss of synchronization of the adaptive filters of a terminal at the encoding end and a terminal at the decoding end caused by transmission errors such as packet losses, and suppress deterioration of the sound quality when a multiple channel signal is encoded with high efficiency using an adaptive filter.
    Type: Application
    Filed: May 21, 2010
    Publication date: March 1, 2012
    Applicant: PANASONIC CORPORATION
    Inventor: Masahiro Oshikiri
  • Publication number: 20120053932
    Abstract: A method for automatic transmission of status information from a first communications terminal set up for speech communication to a second communications terminal set up for text communication is provided. The speech communication between communications terminals is processed over a speech communications server and the text communication between communications terminals over a text communications server. The speech communications server and the text communications server exchange messages over at least one converter device. The status information will be transmitted from the first communications terminal over the speech communications server, the converter device, and the text communications server to the second communications terminal.
    Type: Application
    Filed: August 8, 2011
    Publication date: March 1, 2012
    Inventor: Claus Rist
  • Publication number: 20120046956
    Abstract: Storing audio data encoded in any of a plurality of different audio encoding formats is enabled by parametrically defining the underlying format in which the audio data is encoded, in audio format and packet table chunks. A flag can be used to manage storage of the size of the audio data portion of the file, such that premature termination of an audio recording session does not result in an unreadable corrupted file. This capability can be enabled by initially setting the flag to a value that does not correspond to a valid audio data size and that indicates that the last chunk in the file contains the audio data. State information for the audio data, to effectively denote a version of the file, and a dependency indicator for dependent metadata, may be maintained, where the dependency indicator indicates the state of the audio data on which the metadata is dependent.
    Type: Application
    Filed: October 31, 2011
    Publication date: February 23, 2012
    Applicant: APPLE INC.
    Inventors: William G. Stewart, James E. McCartney, Douglas S. Wyatt
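
The size-flag idea in the abstract above can be sketched with an in-memory file: the data chunk is first written with a size value that cannot be a valid size, and is patched with the real size only when recording finishes cleanly. The chunk layout used here (a 4-byte tag followed by a signed 64-bit big-endian size), the tag `data`, the sentinel `-1`, and the function names are assumptions for illustration rather than details confirmed by the source.

```python
import io
import struct

HEADER = struct.Struct(">4sq")  # hypothetical layout: tag + signed 64-bit size
UNKNOWN_SIZE = -1               # not a valid size; means "data runs to end of file"


def begin_data_chunk(f):
    """Write a data chunk header with the placeholder size; return its offset."""
    position = f.tell()
    f.write(HEADER.pack(b"data", UNKNOWN_SIZE))
    return position


def finish_data_chunk(f, header_position):
    """Patch the real size into the header once recording ends normally."""
    end = f.seek(0, io.SEEK_END)
    size = end - header_position - HEADER.size
    f.seek(header_position)
    f.write(HEADER.pack(b"data", size))
    f.seek(end)


def read_data_chunk(f, header_position):
    """Read the audio bytes, tolerating a header that was never patched."""
    f.seek(header_position)
    _tag, size = HEADER.unpack(f.read(HEADER.size))
    if size == UNKNOWN_SIZE:
        return f.read()  # recording ended prematurely: take everything left
    return f.read(size)


f = io.BytesIO()
pos = begin_data_chunk(f)
f.write(b"abc")                       # audio written so far
partial = read_data_chunk(f, pos)     # still readable before the size is patched
f.seek(0, io.SEEK_END)
finish_data_chunk(f, pos)
```

The sketch relies on the data chunk being the last chunk in the file, which is the condition the abstract attaches to the placeholder value.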
  • Publication number: 20120046955
    Abstract: A scheme for injecting noise at uncoded elements of a spectrum is controlled according to a measure of a distribution of energy of the original spectrum among the locations of the uncoded elements.
    Type: Application
    Filed: August 16, 2011
    Publication date: February 23, 2012
    Applicant: QUALCOMM Incorporated
    Inventors: Vivek Rajendran, Ethan Robert Duni, Venkatesh Krishnan
  • Publication number: 20120046942
    Abstract: A terminal and method to determine surrounding circumstances using received sound signals and to automatically control various user interfaces according to the surrounding circumstances. The terminal divides the received sound signals into voice and non-voice signals, analyzes the divided sound signals based on frequencies and determines the circumstances based on the analyzed sound signals. The terminal may further control a user interface based on the determined surrounding circumstances.
    Type: Application
    Filed: August 2, 2011
    Publication date: February 23, 2012
    Applicant: PANTECH CO., LTD.
    Inventors: Moonsup LEE, Sungjin KIM, Seokgi HONG, Taehun KIM, Yunseop GEUM, Pilwoo LEE, Dusin JANG
  • Publication number: 20120041760
    Abstract: In voice recording equipment and a voice recording method, voice data from a speaker is received using a microphone. Threshold values T1 and T2 for the surrounding environment of the voice recording equipment are determined. If an intensity of the voice data is less than or equal to the threshold value T2, the voice recording is stopped and the speaker is informed that the voice data is not useful. If the intensity of the voice data is greater than the threshold values, the voice data is stored into a storage unit.
    Type: Application
    Filed: October 28, 2010
    Publication date: February 16, 2012
    Applicants: HON HAI PRECISION INDUSTRY CO., LTD., AMBIT MICROSYSTEMS (SHANGHAI) LTD.
    Inventors: HONG KANG, GUO-ZHI DING, CHI-MING LU
  • Publication number: 20120041761
    Abstract: Disclosed is a voice decoding apparatus wherein the processor may be continuously employed for other applications for a prescribed time but, in response to an urgent interrupt, the processor can generate synthesised sound even when being used for other applications, without interruption. In this apparatus, a packet receiving section (101) receives packets of the layers of a plurality of frames and extracts code from the received packets. A state/code storage section (103) stores the code and decoding state of the code. A layer selection section (104) selects a layer number and a frame number corresponding to the code to be initially decoded, based on the decoding state. A decoding section (105) decodes the code of the selected frame number and layer number.
    Type: Application
    Filed: March 12, 2010
    Publication date: February 16, 2012
    Applicant: PANASONIC CORPORATION
    Inventors: Toshiyuki Morii, Hiroyuki Ehara
  • Publication number: 20120039397
    Abstract: A digital signal reproduction device includes an audio decoder configured to decode an audio bit stream to output a resulting audio signal, an audio bit stream analyzer configured to analyze whether or not the audio bit stream contains human voice, a playback speed determiner configured to determine a playback speed based on a result of the analysis by the audio bit stream analyzer, and a variable speed reproducer configured to receive the audio signal and reproduce an audio signal corresponding to the playback speed determined by the playback speed determiner.
    Type: Application
    Filed: October 25, 2011
    Publication date: February 16, 2012
    Applicant: PANASONIC CORPORATION
    Inventors: Hiroshi IKEDA, Shuji Miyasaka
  • Publication number: 20120041759
    Abstract: A mobile replacement-dialogue recording system enables the creation of replacement-dialogue items by mobile users not at a media recording studio. Studio-users prepare guide media video, audio and text data which are made available to mobile users through a media server. A mobile user's mobile replacement-dialogue recording device obtains guide media and allows the user to view the guide media in rehearsal mode. The mobile replacement-dialogue recording device then records the mobile user's dialogue performance while presenting the mobile user with synchronized guide media. The mobile user can review, delete, and rerecord the resulting potential replacement dialogue, as well as create feedback media characterizing the replacement dialogue. Selected replacement dialogue items can be transmitted to the media server. A studio-module can then obtain the selected replacement dialogue items and feedback media from the media server so that they may be used in media-replacement.
    Type: Application
    Filed: September 3, 2010
    Publication date: February 16, 2012
    Applicant: BOARDWALK TECHNOLOGY GROUP, LLC
    Inventors: SEAN C. BARKER, GARY A. RANDALL, TIMOTHY SCOTT BOGART
  • Publication number: 20120035918
    Abstract: In a method of providing a backward and forward compatible speech codec payload format, the following steps are included: providing (S10) an RTP package; including (S20) payload according to a first codec into the provided RTP package, and appending (S50) payload according to a second codec into the provided RTP package. In addition, at least one unused bit is located (S30) in the included first codec payload, and the located at least one unused bit is designated (S40) as a codec compatibility bit. Finally, the designated at least one codec compatibility bit is utilized (S60) to provide an indication of the presence of the appended second codec payload.
    Type: Application
    Filed: April 7, 2010
    Publication date: February 9, 2012
    Applicant: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Tomas Frankkila, Stefan Bruhn, Daniel Enstrom
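A minimal sketch of the compatibility-bit idea in the abstract above, under stated assumptions. The fixed first-codec payload length, the position of the unused bit, and the names `pack` and `unpack` are hypothetical; the abstract does not specify any of them, and real RTP header handling is omitted.

```python
FIRST_LEN = 4        # hypothetical fixed size of the first-codec payload, in bytes
COMPAT_BYTE = 3      # hypothetical byte holding the unused bit
COMPAT_MASK = 0x01   # hypothetical mask selecting the unused bit


def pack(first: bytes, second: bytes = b"") -> bytes:
    """Build a payload: first-codec data, then optional second-codec data.

    The designated compatibility bit is set only when second-codec data is appended.
    """
    if len(first) != FIRST_LEN:
        raise ValueError("first-codec payload has unexpected length")
    buf = bytearray(first)
    if second:
        buf[COMPAT_BYTE] |= COMPAT_MASK
    else:
        buf[COMPAT_BYTE] &= ~COMPAT_MASK & 0xFF
    return bytes(buf) + second


def unpack(payload: bytes):
    """Split a payload into (first-codec data, second-codec data)."""
    if len(payload) < FIRST_LEN:
        raise ValueError("payload too short")
    first = payload[:FIRST_LEN]
    has_second = bool(first[COMPAT_BYTE] & COMPAT_MASK)
    second = payload[FIRST_LEN:] if has_second else b""
    return first, second
```

In this sketch a decoder that only knows the first codec can read the first `FIRST_LEN` bytes and ignore the rest, while a newer decoder checks the bit to learn whether appended data is present.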
  • Publication number: 20120035938
    Abstract: An audio reproducing method for quickly and correctly extracting extra data, including: receiving a data stream including the extra data including an end marker disposed immediately before main data and data length information, which is length information of the extra data, disposed immediately before the end marker; checking the presence/absence of the end marker; and if the end marker exists, extracting the extra data by using the data length information.
    Type: Application
    Filed: August 5, 2011
    Publication date: February 9, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jong-hoon JEONG, Chul-woo LEE, Nam-suk LEE, Sang-hoon LEE
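The backward lookup described in the abstract above (check for an end marker immediately before the main data, then read the length field immediately before the marker) might be sketched like this. The 2-byte marker value, the 16-bit big-endian length field, and the choice that the length counts only the extra-data body are assumptions made for illustration; the abstract fixes none of them.

```python
import struct

END_MARKER = b"\xAA\x55"              # hypothetical 2-byte end marker
LEN_FMT = ">H"                        # hypothetical 16-bit big-endian length field
LEN_SIZE = struct.calcsize(LEN_FMT)


def extract_extra_data(stream: bytes, main_start: int):
    """Return the extra-data body that precedes the main data, or None.

    Assumed layout: [extra body][length][END_MARKER][main data],
    where `main_start` is the offset of the main data in `stream`.
    """
    marker_start = main_start - len(END_MARKER)
    if marker_start < 0 or stream[marker_start:main_start] != END_MARKER:
        return None                   # no end marker, so no extra data to extract
    len_start = marker_start - LEN_SIZE
    if len_start < 0:
        return None
    (length,) = struct.unpack(LEN_FMT, stream[len_start:marker_start])
    body_start = len_start - length
    if body_start < 0:
        return None                   # length field points before the stream start
    return stream[body_start:len_start]
```

Because the length sits at a fixed distance from the main data, the reader can jump straight to the start of the extra data without scanning the stream for it.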
  • Publication number: 20120035937
    Abstract: A method and apparatus for generating synthesis audio signals are provided. The method includes decoding a bitstream; splitting the decoded bitstream into n sub-band signals; generating n transformed sub-band signals by transforming the n sub-band signals in a frequency domain; and generating synthesis audio signals by respectively multiplying the n transformed sub-band signals by values corresponding to synthesis filter bank coefficients.
    Type: Application
    Filed: August 5, 2011
    Publication date: February 9, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Hyun-wook KIM, Han-gil MOON, Sang-hoon LEE
  • Publication number: 20120035940
    Abstract: An audio signal processing method includes: receiving an audio signal comprising consecutive frames; generating a first encoding parameter corresponding to a first frame among the consecutive frames and a second encoding parameter corresponding to a second frame adjacent to the first frame; and generating at least one interpolated parameter based on the first encoding parameter and the second encoding parameter.
    Type: Application
    Filed: August 5, 2011
    Publication date: February 9, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jong-hoon JEONG, Nam-suk LEE, Han-gil MOON
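As an illustration of generating interpolated parameters from two adjacent frames, the sketch below uses plain linear interpolation. The abstract does not state which interpolation rule is used, so linear interpolation, the scalar parameter type, and the function name are assumptions.

```python
def interpolate_parameters(first, second, steps):
    """Return `steps` values evenly spaced strictly between `first` and `second`.

    `first` and `second` stand for the encoding parameters of two adjacent
    frames; linear interpolation is an assumed choice.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [first + (second - first) * k / (steps + 1) for k in range(1, steps + 1)]
```

For example, `interpolate_parameters(0.0, 4.0, 3)` yields three intermediate values between the parameters of the two frames.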
  • Publication number: 20120035917
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for detecting and correcting abnormal stress patterns in unit-selection speech synthesis. A system practicing the method detects incorrect stress patterns in selected acoustic units representing speech to be synthesized, and corrects the incorrect stress patterns in the selected acoustic units to yield corrected stress patterns. The system can further synthesize speech based on the corrected stress patterns. In one aspect, the system also classifies the incorrect stress patterns using a machine learning algorithm such as a classification and regression tree, adaptive boosting, a support vector machine, or maximum entropy. In this way a text-to-speech unit selection speech synthesizer can produce more natural sounding speech with suitable stress patterns regardless of the stress of units in a unit selection database.
    Type: Application
    Filed: August 6, 2010
    Publication date: February 9, 2012
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Yeon-Jun KIM, Mark Charles BEUTNAGEL, Alistair D. CONKIE, Ann K. SYRDAL
  • Publication number: 20120035941
    Abstract: An audio encoder and decoder use architectures and techniques that improve the efficiency of quantization (e.g., weighting) and inverse quantization (e.g., inverse weighting) in audio coding and decoding. The described strategies include various techniques and tools, which can be used in combination or independently. For example, an audio encoder quantizes audio data in multiple channels, applying multiple channel-specific quantizer step modifiers, which give the encoder more control over balancing reconstruction quality between channels. The encoder also applies multiple quantization matrices and varies the resolution of the quantization matrices, which allows the encoder to use more resolution if overall quality is good and use less resolution if overall quality is poor. Finally, the encoder compresses one or more quantization matrices using temporal prediction to reduce the bitrate associated with the quantization matrices. An audio decoder performs corresponding inverse processing and decoding.
    Type: Application
    Filed: October 18, 2011
    Publication date: February 9, 2012
    Applicant: Microsoft Corporation
    Inventors: Naveen Thumpudi, Wei-Ge Chen
  • Publication number: 20120027186
    Abstract: In one embodiment, a method, system and apparatus for recording audio is provided so that the recording can be authenticated. The system may be implemented as a central server that is accessed via one or more lines for audio communication, or as a stand-alone unit. The system operates by encrypting communicated data (e.g., audio signals), storing the encrypted information, and providing at least one user with a key that can be used to decrypt the stored information.
    Type: Application
    Filed: October 11, 2011
    Publication date: February 2, 2012
    Applicant: WALKER DIGITAL, LLC
    Inventors: Jay S. Walker, Thomas M. Sparico, James A. Jorasch
  • Publication number: 20120029913
    Abstract: According to one embodiment, there is provided a sound quality control apparatus, including: a characteristic parameter extractor; a speech score calculator; a music score calculator; a power value acquisition module; a first storage configured to store speech scores and music scores; a second storage configured to store power values; a power-based score corrector configured to correct a current music score or a current speech score based on a first comparison result between a current power value and past power values, a second comparison result between the current music score and past music scores and a third comparison result between the current speech score and past speech scores; and a sound quality controller configured to perform a sound quality control by using at least one of the speech score and the music score corrected by the power-based score corrector.
    Type: Application
    Filed: April 28, 2011
    Publication date: February 2, 2012
    Inventors: Hirokazu TAKEUCHI, Hiroshi YONEKUBO
  • Publication number: 20120029926
    Abstract: A scheme for coding a set of transform coefficients that represent an audio-frequency range of a signal uses information from a reference frame that describes a previous frame of the signal to determine frequency-domain locations of regions of significant energy in a target frame of the signal.
    Type: Application
    Filed: July 28, 2011
    Publication date: February 2, 2012
    Applicant: QUALCOMM Incorporated
    Inventors: Venkatesh Krishnan, Vivek Rajendran, Ethan Robert Duni
  • Publication number: 20120029924
    Abstract: A multistage shape vector quantizer architecture uses information from a selected first-stage codebook vector to generate a rotation matrix. The rotation matrix is used to rotate the direction of the input vector to support shape quantization of the first-stage quantization error.
    Type: Application
    Filed: July 28, 2011
    Publication date: February 2, 2012
    Applicant: QUALCOMM Incorporated
    Inventors: Ethan Robert Duni, Venkatesh Krishnan, Vivek Rajendran
  • Publication number: 20120029925
    Abstract: A dynamic bit allocation operation determines a bit allocation for each of a plurality of vectors, based on a corresponding plurality of gain factors, and compares each allocation to a threshold value that is based on a dimensionality of the vector.
    Type: Application
    Filed: July 28, 2011
    Publication date: February 2, 2012
    Applicant: QUALCOMM Incorporated
    Inventors: Ethan Robert Duni, Venkatesh Krishnan, Vivek Rajendran
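The two steps named in the abstract above (allocate bits per vector from gain factors, then compare each allocation with a dimension-dependent threshold) could look roughly like the sketch below. The proportional-to-gain split, the `min_bits_per_dim` parameter, and the function name are assumptions; the abstract does not give the allocation formula or say what follows the comparison, so the sketch only reports the comparison result.

```python
def allocate_bits(gains, dims, total_bits, min_bits_per_dim=0.5):
    """Split `total_bits` across vectors in proportion to their gain factors.

    Returns (allocations, thresholds, meets), where thresholds[i] scales with
    the dimensionality dims[i] and meets[i] says whether allocations[i]
    reaches its threshold.
    """
    if len(gains) != len(dims):
        raise ValueError("gains and dims must have the same length")
    total_gain = sum(gains)
    if total_gain <= 0:
        raise ValueError("gain factors must sum to a positive value")
    allocations = [total_bits * g / total_gain for g in gains]
    thresholds = [min_bits_per_dim * d for d in dims]
    meets = [a >= t for a, t in zip(allocations, thresholds)]
    return allocations, thresholds, meets
```

A caller might, for instance, reassign the bits of vectors that fall below their threshold, but that policy is outside what the abstract states.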
  • Publication number: 20120028642
    Abstract: Speech signals to be sent between a first node and a second node via a wireless communication system are Adaptive Multi-Rate (AMR) encoded. A need to change the first node's first data transmission rate over a radio interface to a second different data transmission rate is determined. A new AMR source bit rate is then determined for both nodes. Information is sent to the second node, in advance of changing the data transmission rate over the radio interface, requesting the second node to change towards the new AMR source bit rate. After a predetermined time period sufficient for the second node to change from the current AMR source bit rate to the new AMR source bit rate expires or after the second node indicates a change to the new AMR source bit rate, the first node starts transmitting at the second data transmission rate over the radio interface.
    Type: Application
    Filed: October 28, 2009
    Publication date: February 2, 2012
    Applicant: Telefonaktiebolaget LM
    Inventor: Paul SCHLIWA-BERTLING