For Storage Or Transmission Patents (Class 704/201)
  • Patent number: 8374856
    Abstract: A method and apparatus for concealing frame loss and an apparatus for transmitting and receiving a speech signal that are capable of reducing speech quality degradation caused by packet loss are provided. In the method, when loss of a current received frame occurs, a random excitation signal having the highest correlation with a periodic excitation signal (i.e., a pitch excitation signal) decoded from a previous frame received without loss is used as a noise excitation signal to recover an excitation signal of a current lost frame. Furthermore, a third, new attenuation constant (AS) is obtained by summing a first attenuation constant (NS) obtained based on the number of continuously lost frames and a second attenuation constant (PS) predicted in consideration of change in amplitude of previously received frames to adjust the amplitude of the recovered excitation signal for the current lost frame.
    Type: Grant
    Filed: January 9, 2009
    Date of Patent: February 12, 2013
    Assignee: Intellectual Discovery Co., Ltd.
    Inventors: Hong Kook Kim, Choong Sang Cho
  • Patent number: 8374853
    Abstract: A system for coding a hierarchical audio signal, comprising, at least, a core layer using parametric coding by analysis by synthesis in a first frequency band, a band extension layer for widening said first frequency band into a second frequency band, or wideband. The system also comprises a wideband audio coding quality enhancement layer based on transform coding using a spectral parameter obtained from said band extension layer. Application to transmitting speech and/or audio signals over packet networks.
    Type: Grant
    Filed: July 7, 2006
    Date of Patent: February 12, 2013
    Assignee: France Telecom
    Inventors: Stéphane Ragot, David Virette
  • Patent number: 8370132
    Abstract: Apparatus and methods are provided for measuring perceptual quality of a signal transmitted over a communication network, such as a circuit-switching network, packet-switching network, or a combination thereof. In accordance with one embodiment, a distributed apparatus is provided for measuring perceptual quality of a signal transmitted over a communication network. The distributed apparatus includes communication ports located at various locations in the network. The distributed apparatus may also include a signal processor including a processor for providing non-intrusive measurement of the perceptual quality of the signal. The distributed apparatus may further include recorders operatively connected to the communication ports and to the signal processor, wherein at least one of the recorders processes the signal at one of the communication ports and the recorder sends the signal to the signal processor to measure the perceptual quality of the signal.
    Type: Grant
    Filed: November 21, 2005
    Date of Patent: February 5, 2013
    Assignee: Verizon Services Corp.
    Inventor: Adrian E. Conway
  • Patent number: 8370133
    Abstract: A method for perceptual spectral decoding comprises decoding of spectral coefficients recovered from a binary flux into decoded spectral coefficients of an initial set of spectral coefficients. The initial set of spectral coefficients are spectrum filled. The spectrum filling comprises noise filling of spectral holes by setting spectral coefficients in the initial set of spectral coefficients not being decoded from the binary flux equal to elements derived from the decoded spectral coefficients. The set of reconstructed spectral coefficients of a frequency domain formed by the spectrum filling is converted into an audio signal of a time domain. A perceptual spectral decoder comprises a noise filler, operating according to the method for perceptual spectral decoding.
    Type: Grant
    Filed: August 26, 2008
    Date of Patent: February 5, 2013
    Assignee: Telefonaktiebolaget L M Ericsson (publ)
    Inventors: Anisse Taleb, Manuel Briand, Gustaf Ullberg
  • Patent number: 8364477
    Abstract: A method (400, 500) and apparatus (220) seek to improve the intelligibility of speech emitted into a noisy environment. Formants are identified (426) and a perceptual frequency scale band is selected (502) that includes at least one of the identified formants. The SNR in each band is compared (504) to a threshold and, if the SNR for that band is less than the threshold, the method increases a formant enhancement gain for that band. A set of high pass filter gains (338) is combined (516) with the formant enhancement gains yielding combined gains that are then clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532), and used to reconstruct (532, 534) an audio signal.
    Type: Grant
    Filed: August 30, 2012
    Date of Patent: January 29, 2013
    Assignee: Motorola Mobility LLC
    Inventors: Jianming J Song, John C Johnson
  • Patent number: 8364476
    Abstract: The invention pertains to a method and apparatus for efficient encoding and decoding of vector quantized data. The method and system explore and implement sub-division of a quantization vector space comprising class-leader vectors and representation of the class-leader vectors by a set of class-leader root-vectors facilitating faster encoding and decoding, and reduced storage requirements.
    Type: Grant
    Filed: October 15, 2010
    Date of Patent: January 29, 2013
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Peter Vary, Hauke Kruger, Bernd Geiser
  • Patent number: 8364480
    Abstract: A method and corresponding apparatus for coded-domain acoustic echo control is presented. An echo control problem is considered as that of perceptually matching an echo signal to a reference signal. A perceptual similarity function that is based on the coded spectral parameters produced by the speech codec is defined. Since codecs introduce a significant degree of non-linearity into the echo signal, the similarity function is designed to be robust against such effects. The similarity function is incorporated into a coded-domain echo control system that also includes spectrally-matched noise injection for replacing echo frames with comfort noise. Using actual echoes recorded over a commercial mobile network, it is shown herein that the similarity function is robust against both codec non-linearities and additive noise. Experimental results further show that the echo-control is effective at suppressing echoes compared to a Normalized Least Mean Squared (NLMS)-based echo cancellation system.
    Type: Grant
    Filed: September 9, 2011
    Date of Patent: January 29, 2013
    Assignee: Tellabs Operations, Inc.
    Inventor: Rafid A. Sukkar
  • Patent number: 8364495
    Abstract: An encoding device capable of realizing a scalable CODEC of a high performance. In this encoding device, an LPC analyzing unit (551) analyzes an input voice (301) efficiently with a synthesized LPC parameter obtained from a core decoder (305), to acquire an encoded LPC coefficient. An adaptive code note (552) is stored with its sound source codes, as acquired from the core decoder (305). The adaptive code note (552) and a stochastic code note (553) send sound source samples to a gain adjusting unit (554). This gain adjusting unit (554) multiplies the individual sound source samples by an amplification based on the gain parameters acquired from the core decoder (305), and then adds the products to acquire sound source vectors. These vectors are sent to an LPC synthesizing unit (555). This LPC synthesizing unit (555) filters the sound source vectors acquired at the gain adjusting unit (554), with the LPC parameter, to acquire a synthetic signal.
    Type: Grant
    Filed: September 1, 2005
    Date of Patent: January 29, 2013
    Assignee: Panasonic Corporation
    Inventor: Toshiyuki Morii
  • Patent number: 8363809
    Abstract: A teleconference terminal apparatus including: an input unit which receives a speech signal; an analyzing unit which calculates a target size on a predetermined segment basis of a speech signal; a coding unit which codes the speech signal to generate a data stream, so that the coded data size on a predetermined segment basis becomes the target size corresponding to each predetermined segment; a stream transmitting unit which transmits to a network the data stream; a receiving unit which receives the data stream transmitted from another terminal apparatus; a filtering unit which determines whether segment data is to be decoded based on data size for each predetermined segment in the received data stream, the segment data being included in the data stream; a decoding unit which decodes segment data determined to be decoded to generate a speech signal; and an output unit which outputs the generated speech signal.
    Type: Grant
    Filed: October 24, 2008
    Date of Patent: January 29, 2013
    Assignee: Panasonic Corporation
    Inventor: Kojiro Ono
  • Publication number: 20130024192
    Abstract: Disclosed is an information display system provided with: a signal analyzing unit which analyzes the audio signals obtained from a predetermined location and which generates ambient sound information regarding the sound generated at the predetermined location; and an ambient expression selection unit which selects an ambient expression which expresses the content of what a person is feeling from the sound generated at the predetermined location on the basis of the ambient sound information.
    Type: Application
    Filed: March 28, 2011
    Publication date: January 24, 2013
    Applicant: NEC CORPORATION
    Inventors: Toshiyuki Nomura, Yuzo Senda, Kyota Higa, Takayuki Arakawa, Yasuyuki Mitsui
  • Publication number: 20130024188
    Abstract: A system for encoding an audio signal includes an audio console configured to receive a voice audio signal contained within a first audio spectrum, encode the voice audio signal with a background audio signal contained within a second audio spectrum wider than the first audio spectrum, encode the voice audio signal with a monitoring code and output a combined signal including the voice audio signal encoded with the background audio signal and the monitoring code. The combined signal is contained within an audio spectrum including the first audio spectrum and the second audio spectrum.
    Type: Application
    Filed: July 21, 2011
    Publication date: January 24, 2013
    Inventor: Lee S. Weinblatt
  • Publication number: 20130024189
    Abstract: A mobile terminal and a control method thereof are provided. The mobile terminal includes: an audio output module; a memory storing text; and a controller configured to convert at least a portion of the text into a speech and output the speech through the audio output module, wherein the controller stores at least a portion of speech data obtained by converting the at least a portion of the text into the speech in the memory, and outputs the speech based on the stored speech data to the audio output module when a speech output signal with respect to the at least a portion of the text is obtained. When a speech output signal with respect to a portion which has been output by speech is obtained, speech is output based on the stored speech data, thereby shortening time required for outputting the speech.
    Type: Application
    Filed: September 22, 2011
    Publication date: January 24, 2013
    Applicant: LG ELECTRONICS INC.
    Inventors: Jaemin KIM, Seungho HAN, Yongchul PARK
  • Publication number: 20130024187
    Abstract: A system that incorporates teachings of the present disclosure may include, for example, transmitting a request to initiate a communication session with a member device of a social network, activating a speech capture element, maintaining activation of the speech capture element in accordance with a pattern of prior speech messages, detecting a speech message at the activated speech capture element, and transmitting the detected speech message, or a derivative thereof, to the member device of the social network. Other embodiments are disclosed.
    Type: Application
    Filed: July 18, 2011
    Publication date: January 24, 2013
    Applicant: AT&T Intellectual Property I, LP
    Inventors: HISAO CHANG, David Mornhineway
  • Publication number: 20130018654
    Abstract: In one embodiment, a method includes monitoring activity in an environment, and storing a snippet of the monitored activity. Monitoring the activity in the environment includes operating a device arranged to capture the activity between approximately a first time and approximately a second time. The snippet has a particular duration that is arranged to end at approximately the second time. The method also includes storing the snippet in a storage module and determining when a request to provide the snippet is obtained from a party. If it is determined that the request to play the snippet is obtained, the method includes accessing the storage module to obtain the snippet and providing the snippet to the party if it is determined that the request to provide the snippet is obtained.
    Type: Application
    Filed: July 12, 2011
    Publication date: January 17, 2013
    Applicant: CISCO TECHNOLOGY, INC.
    Inventor: John A. Toebes
  • Patent number: 8355911
    Abstract: A device for lost frame concealment comprises: a lost frame detector for detecting whether a voice frame is lost, a decoding module for decoding the current voice frame, a low band delay module for delaying the low band signal, a low band signal recovering module for recovering the lost low band signal, a high band lost frame concealment module for processing the lost frame concealment for the high band signal, and a QMF synthesis filter for synthetically filtering the low band signal and the high band signal. The invention makes full use of the delay of the coding/decoding device itself, enhances the effect of lost frame concealment for the low band signal and the high band signal, and introduces no nearby delay during the process of lost frame concealment.
    Type: Grant
    Filed: December 14, 2009
    Date of Patent: January 15, 2013
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Wuzhou Zhan, Dongqi Wang
  • Patent number: 8355909
    Abstract: A system for controlling dynamic range of an audio signal comprises an automatic gain control element that receives an input signal having a varying level and outputs a control signal that varies based on the varying level of the input signal and a modified input signal having a dynamic range different than a dynamic range of the input signal. The system also comprises an inverter that inverts the control signal or a block-based control signal corresponding to the control signal in block format. The system also comprises a variable gain element that receives the modified input signal and at least some of the inverted control signal or block-based control signal. The variable gain element also outputs a remainder signal corresponding to the modified input signal as unmodified based on the at least some of the inverted control signal or block-based control signal.
    Type: Grant
    Filed: June 12, 2012
    Date of Patent: January 15, 2013
    Assignee: Audyne, Inc.
    Inventors: Timothy J Carroll, Leif Claesson
  • Patent number: 8355906
    Abstract: A bandwidth extension module, and an associated method and computer-readable medium, suitable for use in artificially extending the bandwidth of a lowband speech signal. The bandwidth extension module comprises a band-pass filter configured to produce a band-pass signal from the lowband speech signal; at least one carrier frequency modulator, each carrier frequency modulator configured to pitch-synchronously modulate the band-pass signal about a respective carrier frequency, the at least one carrier frequency modulator collectively producing a highband speech signal component; a synthesis filter configured to determine a highband speech signal based on the highband speech signal component; and a summation module configured to combine the lowband speech signal with the highband speech signal to obtain a bandwidth-extended speech signal.
    Type: Grant
    Filed: May 21, 2010
    Date of Patent: January 15, 2013
    Assignee: Apple Inc.
    Inventors: Peter Kabal, Rafi Rabipour, Yasheng Qian
  • Publication number: 20130013297
    Abstract: A message service method using speech recognition includes a message server recognizing a speech transmitted from a transmission terminal, generating and transmitting a recognition result of the speech and N-best results based on a confusion network to the transmission terminal; if a message is selected through the recognition result and the N-best results and an evaluation result according to accuracy of the message is decided, the transmission terminal transmitting the message and the evaluation result to a reception terminal; and the reception terminal displaying the message and the evaluation result.
    Type: Application
    Filed: July 5, 2012
    Publication date: January 10, 2013
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Hwa Jeon SONG, YunKeun Lee, Jeon Gue Park, Jong Jin Kim, Ki-Young Park, Hoon Chung, Hyung-Bae Jeon, Ho Young Jung, Euisok Chung, Jeom Ja Kang, Byung Ok Kang, Sang Kyu Park, Sung Joo Lee, Yoo Rhee Oh
  • Publication number: 20130013299
    Abstract: A system for developing, deploying and maintaining a voice application over a communications network to one or more recipients has a voice application server connected to a data network for storing and serving voice applications, a network communications server connected to the data network and to the communications network for routing the voice applications to their intended recipients, a computer station connected to the data network having control access to at least the voice application server, and a software application running on the computer station for creating applications and managing their states. The system is characterized in that a developer operating the software application from the computer station creates voice applications through object modeling and linking, stores them for deployment in the application server, and manages deployment and state of deployed applications including scheduled deployment and repeat deployments in terms of intended recipients.
    Type: Application
    Filed: September 14, 2012
    Publication date: January 10, 2013
    Applicant: Apptera, Inc.
    Inventors: Michael S. Yuen, Leo Chiu
  • Publication number: 20130013298
    Abstract: Techniques for generating, distributing, and using speech recognition models are described. A shared speech processing facility is used to support speech recognition for a wide variety of devices with limited capabilities including business computer systems, personal data assistants, etc., which are coupled to the speech processing facility via a communications channel, e.g., the Internet. Devices with audio capture capability record and transmit to the speech processing facility, via the Internet, digitized speech and receive speech processing services, e.g., speech recognition model generation and/or speech recognition services, in response. The Internet is used to return speech recognition models and/or information identifying recognized words or phrases. Thus, the speech processing facility can be used to provide speech recognition capabilities to devices without such capabilities and/or to augment a device's speech processing capability.
    Type: Application
    Filed: September 13, 2012
    Publication date: January 10, 2013
    Applicant: GOOGLE INC.
    Inventors: Craig Reding, Suzi Levas
  • Patent number: 8352280
    Abstract: An audio encoder adapted to encode a multi-channel audio signal. The encoder comprises an encoder combination module (ECM) for generating a dominant signal part and a residual signal part being a combined representation of first and second audio signals, the dominant and residual signal parts being obtained by applying a mathematical procedure to the first and second audio signals. The mathematical procedure involves a spatial parameter comprising a description of spatial properties of the first and second audio signals. Embodiments include a plurality of interconnected encoder combination modules, so that e.g. six independent 5.1 format audio signals can be encoded to a single or two dominant signal parts and a number of parameter sets and residual signal parts.
    Type: Grant
    Filed: September 7, 2011
    Date of Patent: January 8, 2013
    Inventors: Francois Philippus Myburg, Erik Gosuinus Petrus Schuijers
  • Publication number: 20130006617
    Abstract: Techniques for implementing vocoders in parallel digital signal processors are described. A preferred approach is implemented in conjunction with the BOPS® Manifold Array (ManArray™) processing architecture so that in an array of N parallel processing elements, N channels of voice communication are processed in parallel. Techniques for forcing vocoder processing of one data-frame to take the same number of cycles are described. Improved throughput and lower clock rates can be achieved.
    Type: Application
    Filed: September 13, 2012
    Publication date: January 3, 2013
    Applicant: ALTERA CORPORATION
    Inventors: Ali Soheil Sadri, Navin Jaffer, Anissim A. Silivra, Bin Huang, Matthew Plonski
  • Publication number: 20130007043
    Abstract: Methods and systems for time-synchronous voice annotation of video and audio media enable effective searching of time-based media content. A user records one or more types of voice annotation onto corresponding named voice annotation tracks, which are stored within a media object comprising the time-based media and the annotations. The one or more annotation tracks can then be selectively searched for content using speech or text search terms. Various workflows enable voice annotation to be performed using media editing systems, or one or more standalone voice annotation systems that permit multiple annotators to operate in parallel, generating different kinds of annotations, and returning their annotation tracks to a central location for consolidation.
    Type: Application
    Filed: June 30, 2011
    Publication date: January 3, 2013
    Inventors: Michael E. Phillips, Paul J. Gray
  • Patent number: 8346547
    Abstract: An advanced audio coding (AAC) encoder quantization architecture is described. The architecture includes an efficient, low computation complexity approach for estimating scalefactors in which a base scalefactor estimate is adjusted by a delta scalefactor estimate that is based, in part, on global scalefactor adjustments applied to the previously quantized/encoded frame. Using such feedback, the AAC encoder quantization architecture is able to produce scalefactor estimates that are very close to the actual scalefactor applied by the subsequent quantization and encoding process. The architecture further includes a frequency hole avoidance approach that reduces a magnitude of an estimated scalefactor to avoid generating frequency holes in quantized SFBs.
    Type: Grant
    Filed: May 14, 2010
    Date of Patent: January 1, 2013
    Assignee: Marvell International Ltd.
    Inventor: Lijie Tang
  • Patent number: 8346239
    Abstract: Methods, systems, and computer program products for silence insertion descriptor (SID) conversion are disclosed. According to one aspect, the subject matter described herein includes a method for silence insertion descriptor (SID) conversion. The method includes receiving a wireless frame, the frame identifying a first node as a frame source and a second node as a frame destination; determining whether tandem-free operation (TFO) is applicable; responsive to a determination that TFO is applicable, determining whether the frame is a SID frame; responsive to a determination that the frame is a SID frame, determining whether the SID format used by the first node is incompatible with the SID format used by the second node; and responsive to a determination that the SID format used by the first node is incompatible with the SID format used by the second node, converting the SID frame from the SID format used by the first node to the SID format used by the second node.
    Type: Grant
    Filed: December 28, 2007
    Date of Patent: January 1, 2013
    Assignee: Genband US LLC
    Inventors: Yanhua Wang, Philip Abraham
  • Patent number: 8345884
    Abstract: A first matrix (W(k)) indicating frequency characteristics of a separation filter is calculated from input signals of a plurality of channels. A second matrix (Ws(k)) is calculated by using the restriction coefficients (Ci(k)) for restricting the separation filter and the first matrix, and separation filter coefficients (wsij(s)) are calculated by using the second matrix. With use of the separation filter coefficients, separation signals (ysi(t)) are then calculated from the input signals. A third matrix (Ws⁻¹(k)) is then calculated by transforming the second matrix into an inverse matrix at each frequency, and reproduction filter coefficients (a?I1(s), a?I2(s)) are calculated by using the third matrix. With use of the reproduction filter coefficients, the synthesized signal of each channel is calculated by using the separation signals.
    Type: Grant
    Filed: December 7, 2007
    Date of Patent: January 1, 2013
    Assignee: NEC Corporation
    Inventor: Toshiyuki Nomura
  • Publication number: 20120330651
    Abstract: A voice data transferring device intermediates between an in-vehicle terminal and a voice recognition server. In order to check a change in voice recognition performance of the voice recognition server, the voice data transferring device performs a noise suppression processing on a voice data for evaluation in a noise suppression module; transmits the voice data for evaluation to the voice recognition server; and receives a recognition result thereof. The voice data transferring device sets a value of a noise suppression parameter used for a noise suppression processing or a value of a result integration parameter used for a processing of integrating a plurality of recognition results acquired from the voice recognition server, at an optimum value, based on the recognition result of the voice recognition server. This makes it possible to set a suitable parameter even if the voice recognition performance of the voice recognition server changes.
    Type: Application
    Filed: June 22, 2012
    Publication date: December 27, 2012
    Inventors: Yasunari Obuchi, Takeshi Homma
  • Patent number: 8340959
    Abstract: A method and an apparatus for transmitting a speech signal are provided. A speech signal transmitter includes a quadrature mirror filter, a base sub-band encoder, an enhancement sub-band encoder, and a network connector. The quadrature mirror filter receives a speech signal, divides the speech signal into an enhancement band speech signal and a base band speech signal, and outputs the enhancement band speech signal and the base band speech signal. The base sub-band encoder receives and encodes the base band speech signal. The enhancement sub-band encoder receives and encodes the enhancement band speech signal. The network connector multiplexes the encoded enhancement band speech signal and the encoded base band speech signal based on the kinds of networks over which speech signals are transmitted, and transmits the multiplexed signals to the networks. A speech signal is multiplexed and transmitted by various methods based on the kinds of networks. Thus, the speech signal can be efficiently transmitted.
    Type: Grant
    Filed: August 1, 2011
    Date of Patent: December 25, 2012
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Ho-sang Sung, Dae-hwan Hwang
  • Patent number: 8340964
    Abstract: The present invention relates to means and methods of classifying speech and music signals in voice communication systems, devices, telephones, and methods, and more specifically, to systems, devices, and methods that automate control when either speech or music is detected over communication links. The present invention provides a novel system and method for monitoring the audio signal, analyzing selected audio signal components, comparing the results of analysis with a pre-determined threshold value, and classifying the audio signal either as speech or music.
    Type: Grant
    Filed: June 10, 2010
    Date of Patent: December 25, 2012
    Inventors: Alon Konchitsky, Alberto D Berstein, Sandeep Kulakcherla, William Martin Ribble, Kevin Fitzgerald, Don Seferovich
  • Patent number: 8340960
    Abstract: Techniques for implementing vocoders in parallel digital signal processors are described. A preferred approach is implemented in conjunction with the BOPS® Manifold Array (ManArray™) processing architecture so that in an array of N parallel processing elements, N channels of voice communication are processed in parallel. Techniques for forcing vocoder processing of one data-frame to take the same number of cycles are described. Improved throughput and lower clock rates can be achieved.
    Type: Grant
    Filed: June 16, 2009
    Date of Patent: December 25, 2012
    Assignee: Altera Corporation
    Inventors: Ali Soheil Sadri, Navin Jaffer, Anissim A. Silivra, Bin Huang, Matthew Plonski
  • Publication number: 20120323567
    Abstract: A speech coding method of significantly reducing error propagation due to voice packet loss, while still greatly profiting from a pitch prediction or Long-Term Prediction (LTP), is achieved by limiting or reducing a pitch gain only for the first subframe or the first two subframes within a speech frame. The method is used for a voiced speech class; a pitch cycle length is compared to a subframe size to decide to reduce the pitch gain for the first subframe or the first two subframes within the frame. Speech coding quality loss due to the pitch gain reduction is compensated by increasing a bit rate of a second excitation component or adding one more stage of excitation component only for the first subframe or the first two subframes within the speech frame.
    Type: Application
    Filed: July 31, 2011
    Publication date: December 20, 2012
    Inventor: Yang Gao
  • Publication number: 20120323568
    Abstract: Method and arrangement in a network node for adapting a property of source coding to the quality of a communication link in packet switched conversational services in a communication system. The method comprises obtaining (404) information related to the quality of a communication link. The method further comprises selecting (406) a source coding mode with an associated source coding delay, based on the obtained information and the associated source coding delay. The selected source coding mode is selected from a set of at least two source coding modes associated with different source coding delays, and is to be used when source coding voice data to be transmitted over the communication link.
    Type: Application
    Filed: March 2, 2010
    Publication date: December 20, 2012
    Applicant: Telefonaktiebolaget L M Ericsson (publ)
    Inventor: Stefan Bruhn
  • Patent number: 8335685
    Abstract: A speech enhancement system controls the gain of an excitation signal to prevent uncontrolled gain adjustments. The system includes a first device that converts sound waves into operational signals. An ambient noise estimator is linked to the first device and an echo canceller. The ambient noise estimator estimates how loud a background noise would be near the first device before or after an echo cancellation. The system then compares the ambient noise estimate to a current ambient noise estimate near the first device to control a gain of an excitation signal.
    Type: Grant
    Filed: May 22, 2009
    Date of Patent: December 18, 2012
    Assignee: QNX Software Systems Limited
    Inventor: Phillip A. Hetherington
  • Patent number: 8335299
    Abstract: The present invention relates to a system and method for capturing, sharing, annotating, archiving, and reviewing phone calls and related computer video output in a computer document format. The system creates a portable, transferable computer file recording of a phone call & computer video (“phone voice recording,” or “PVD”) that contains attached data to help identify, sort, and archive the file while maintaining the integrity of the file. Another aspect of the invention includes a method of using the system comprising initiating a phone call between two parties; beginning a recording of the call and initiating a PVD for the recording; and terminating the call and creating the PVD for the call. Another aspect of the invention includes a method of accessing a PVD by a user, reviewing and/or modifying the PVD, capturing the modified PVD, and sharing the PVD with another user.
    Type: Grant
    Filed: August 4, 2008
    Date of Patent: December 18, 2012
    Assignee: Computer Telephony Solutions, Inc.
    Inventors: Justin Crandall, Todd Lindberg, Skip Welch
  • Publication number: 20120316868
    Abstract: Methods and systems are described for changing a communication quality of a communication session based on a meaning of speech data. Speech data exchanged between clients participating in a communication session is parsed. A meaning of the parsed speech data is determined to determine a communication quality of the communication session. An action is performed to change the communication quality of the communication session based on the meaning of the parsed speech data.
    Type: Application
    Filed: August 27, 2012
    Publication date: December 13, 2012
    Inventor: Mona Singh
  • Patent number: 8332217
    Abstract: Methods of spectral partitioning which may be implemented in an encoder are described. The methods comprise determining an estimate of bit requirements for each of a plurality of spectral sub-bands. These estimates are then used to group the sub-bands into two or more regions by minimizing a cost function. This cost function is based on the estimates of bit requirements for each sub-band and the estimates may include estimates of code bit requirements and/or additional code bit requirements for each sub-band. These estimates may be determined in many different ways and a number of methods are described.
    Type: Grant
    Filed: September 9, 2008
    Date of Patent: December 11, 2012
    Assignee: Cambridge Silicon Radio Limited
    Inventors: David Hargreaves, Esfandiar Zavarehei
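    Grouping sub-bands into regions by minimizing a cost function can be sketched with dynamic programming. The abstract does not define the cost function. This sketch assumes regions are contiguous and that every sub-band in a region is coded with the largest per-band bit estimate in that region, plus an optional fixed overhead per region. The names partition_subbands and region_overhead are hypothetical.

```python
def partition_subbands(bits, num_regions, region_overhead=0):
    """Split sub-bands into contiguous regions with minimum total cost.

    bits: estimated bit requirement for each sub-band.
    Assumed region cost: (bands in region) * (max estimate in region)
    + region_overhead. Requires 1 <= num_regions <= len(bits).
    Returns (total_cost, [(start, end), ...]) with end exclusive.
    """
    n = len(bits)
    inf = float("inf")

    def region_cost(i, j):
        return (j - i) * max(bits[i:j]) + region_overhead

    # best[k][j]: minimum cost of covering the first j bands with k regions.
    best = [[inf] * (n + 1) for _ in range(num_regions + 1)]
    cut = [[0] * (n + 1) for _ in range(num_regions + 1)]
    best[0][0] = 0
    for k in range(1, num_regions + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                cost = best[k - 1][i] + region_cost(i, j)
                if cost < best[k][j]:
                    best[k][j] = cost
                    cut[k][j] = i

    # Walk the stored cut points backwards to recover region boundaries.
    bounds = []
    j = n
    for k in range(num_regions, 0, -1):
        i = cut[k][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return best[num_regions][n], bounds


total_cost, regions = partition_subbands([2, 2, 2, 7, 7, 7], num_regions=2)
```

    With these example estimates the cheapest split separates the three low-cost bands from the three high-cost bands.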
  • Publication number: 20120310634
    Abstract: A communication device includes memory, an input interface, a processing module, and a transmitter. The processing module receives a digital signal from the input interface, wherein the digital signal includes a desired digital signal component and an undesired digital signal component. The processing module identifies one of a plurality of codebooks based on the undesired digital signal component. The processing module then identifies a codebook entry from the one of the plurality of codebooks based on the desired digital signal component to produce a selected codebook entry. The processing module then generates a coded signal based on the selected codebook entry, wherein the coded signal includes a substantially unattenuated representation of the desired digital signal component and an attenuated representation of the undesired digital signal component. The transmitter converts the coded signal into an outbound signal in accordance with a signaling protocol and transmits it.
    Type: Application
    Filed: August 20, 2012
    Publication date: December 6, 2012
    Applicant: BROADCOM CORPORATION
    Inventor: Nambirajan Seshadri
  • Patent number: 8326607
    Abstract: The present invention relates to a method and arrangement for improving quality of a voice transmission by extracting filter coefficient parameters with respect to a voice signal in a first speech transmission rate, and using the extracted filter coefficient parameters in a second transmission rate that is equal to or lower than the first transmission rate.
    Type: Grant
    Filed: January 11, 2010
    Date of Patent: December 4, 2012
    Assignee: Sony Ericsson Mobile Communications AB
    Inventor: Martin Nyström
  • Patent number: 8326617
    Abstract: A speech enhancement system enhances transitions between speech and non-speech segments. The system includes a background noise estimator that approximates the magnitude of a background noise of an input signal that includes a speech and a non-speech segment. A slave processor is programmed to perform the specialized task of modifying a spectral tilt of the input signal to match a plurality of expected spectral shapes selected by a Codec.
    Type: Grant
    Filed: May 22, 2009
    Date of Patent: December 4, 2012
    Assignee: QNX Software Systems Limited
    Inventors: Phillip A. Hetherington, Shreyas Paranjpe, Xueman Li
  • Patent number: 8326619
    Abstract: Methods of encoding a signal using a perceptual model are described in which a signal to mask ratio parameter within the perceptual model is tuned. The signal to mask ratio parameter is tuned based on a function of the bitrate of the part of the signal which has already been encoded and the target bitrate for the encoding process. The tuned signal to mask ratio parameter is used to compute a masking threshold for the signal which is then used to quantise the signal.
    Type: Grant
    Filed: September 9, 2008
    Date of Patent: December 4, 2012
    Assignee: Cambridge Silicon Radio Limited
    Inventors: Esfandiar Zavarehei, David Hargreaves
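    The tuning loop in the abstract above can be sketched as a simple feedback rule. The abstract only says the parameter is a function of the bitrate achieved so far and the target bitrate. The linear rule, the sensitivity and clamp values, and the threshold formula (signal level minus signal to mask ratio, in dB) are assumptions, and the names tune_smr and masking_threshold_db are hypothetical.

```python
def tune_smr(base_smr_db, bits_so_far, seconds_so_far, target_bps,
             sensitivity_db=6.0, limits=(0.0, 30.0)):
    """Adjust the signal to mask ratio from the bitrate achieved so far.

    Assumed rule: if the encoder is running above the target bitrate,
    lower the ratio (raising the masking threshold, so fewer bits are
    spent); if it is running below target, raise it.
    """
    if seconds_so_far <= 0:
        return base_smr_db  # nothing encoded yet, keep the starting value
    achieved_bps = bits_so_far / seconds_so_far
    relative_error = (achieved_bps - target_bps) / target_bps
    smr_db = base_smr_db - sensitivity_db * relative_error
    return min(max(smr_db, limits[0]), limits[1])


def masking_threshold_db(signal_level_db, smr_db):
    """Assumed relation: threshold sits smr_db below the signal level."""
    return signal_level_db - smr_db


# One second encoded at 160 kbit/s against a 128 kbit/s target.
smr = tune_smr(12.0, bits_so_far=160000, seconds_so_far=1.0, target_bps=128000)
threshold = masking_threshold_db(60.0, smr)
```

    Because the example is over budget, the tuned ratio drops below its starting value and the masking threshold rises accordingly.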
  • Patent number: 8326608
    Abstract: A method, a device, and a system for transcoding between two embedded codecs are disclosed. The method includes: delaying a first encoded stream in input streams for integer number of frames, where the first encoded stream includes a stream of at least one extension layer in the input streams obtained after input signals are encoded by using a first codec; and using the first codec to decode other encoded streams in the input streams to obtain the first decoded signal; and performing delay aligning and adjusting to obtain an adjusted signal so as to reduce the transcoding complexity and enhance quality of the transcoded signals.
    Type: Grant
    Filed: January 26, 2012
    Date of Patent: December 4, 2012
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Chen Hu, Lei Miao, Zexin Liu, Longyin Chen, Herve Marcel Taddei, Qing Zhang
  • Publication number: 20120303361
    Abstract: Methods and systems for sculpting synthesized speech using a graphic user interface are disclosed. An operator enters a stream of text that is used to produce a stream of target phonetic-units. The stream of target phonetic-units is then submitted to a unit-selection process to produce a stream of selected phonetic-units, each selected phonetic-unit derived from a database of sample phonetic-units. After the stream of sample phonetic-units is selected, an operator can remove various selected phonetic-units from the stream of selected phonetic-units, prune the sample phonetic-database and edit various cost functions using the graphic user interface. The edited speech information can then be submitted to the unit-selection process to produce a second stream of selected phonetic-units.
    Type: Application
    Filed: June 29, 2012
    Publication date: November 29, 2012
    Applicant: RHETORICAL SYSTEMS LIMITED
    Inventors: Peter Rutten, Paul Alexander Taylor
  • Publication number: 20120303360
    Abstract: Techniques are disclosed for using the hardware and/or software of the mobile device to obscure speech in the audio data before a context determination is made by a context awareness application using the audio data. In particular, a subset of a continuous audio stream is captured such that speech (words, phrases and sentences) cannot be reliably reconstructed from the gathered audio. The subset is analyzed for audio characteristics, and a determination can be made regarding the ambient environment.
    Type: Application
    Filed: August 19, 2011
    Publication date: November 29, 2012
    Applicant: Qualcomm Incorporated
    Inventors: Leonard H. Grokop, Vidya Narayanan, James W. Dolter, Sanjiv Nanda
  • Patent number: 8321210
    Abstract: An apparatus for encoding includes a first domain converter, a switchable bypass, a second domain converter, a first processor and a second processor to obtain an encoded audio signal having different signal portions represented by coded data in different domains, which have been coded by different coding algorithms. Corresponding decoding stages in the decoder together with a bypass for bypassing a domain converter allow the generation of a decoded audio signal with high quality and low bit rate.
    Type: Grant
    Filed: January 14, 2011
    Date of Patent: November 27, 2012
    Assignees: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V., Voiceage Corporation
    Inventors: Bernhard Grill, Stefan Bayer, Guillaume Fuchs, Stefan Geyersberger, Ralf Geiger, Johannes Hilpert, Ulrich Kraemer, Jeremie Lecomte, Markus Multrus, Max Neuendorf, Harald Popp, Nikolaus Rettelbach, Roch Lefebvre, Bruno Bessette, Jimmy Lapierre, Philippe Gournay, Redwan Salami
  • Patent number: 8315853
    Abstract: A post-filtering apparatus and method for speech enhancement in a modified discrete cosine transform (MDCT) domain are disclosed. In the apparatus and method, previous and current MDCT coefficients are used for obtaining a speech spectrum coefficient similar to a real speech spectrum, and a convex function is used for transforming the speech spectrum coefficient and obtaining a post-filter coefficient so that difference can increase in the case where the speech spectrum coefficient is small but decrease in the case where the coefficient is large. Then, the post-filter coefficient is applied to the MDCT coefficient. With this configuration, both the current and previous MDCT values are used, so that it is possible to obtain a spectrum coefficient similar to the real speech spectrum and to obtain a more accurate filter coefficient. Further, the coefficient is adaptively transformed through the convex function, thereby enhancing speech quality.
    Type: Grant
    Filed: June 5, 2008
    Date of Patent: November 20, 2012
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Hyun-woo Kim, Jong-mo Sung, Mi-suk Lee, Do-young Kim, Byung-sun Lee
  • Patent number: 8311810
    Abstract: The delay in a multi-channel audio coding apparatus and a multi-channel audio decoding apparatus is reduced. The audio coding apparatus includes: a downmix signal generating unit that generates, in a time domain, a first downmix signal that is one of a 1-channel audio signal and a 2-channel audio signal from an input multi-channel audio signal; a downmix signal coding unit that codes the first downmix signal; a first t-f converting unit that converts the input multi-channel audio signal into a multi-channel audio signal in a frequency domain; and a spatial information calculating unit that generates spatial information for generating a multi-channel audio signal from a downmix signal.
    Type: Grant
    Filed: July 28, 2009
    Date of Patent: November 13, 2012
    Assignee: Panasonic Corporation
    Inventors: Tomokazu Ishikawa, Takeshi Norimatsu, Kok Seng Chong, Huan Zhou
  • Patent number: 8305913
    Abstract: An apparatus (1240), method, and computer program to assess VoIP speech quality (130) using access to degraded signals is provided. Different types of impairment (110) have different effects on speech quality. Preferred embodiments address up to four different types of impairment that affect VoIP signal quality: packet loss (230), speech clipping in time (850), noise (1400) and echo. An overall assessment algorithm factors in degradation due to various impairment factors to generate an overall speech quality assessment score or value.
    Type: Grant
    Filed: June 15, 2006
    Date of Patent: November 6, 2012
    Assignee: Nortel Networks Limited
    Inventors: Mohamed El-Hennawey, Rafik Goubran, Ayman M. Radwan, Lijing Ding
  • Publication number: 20120278066
    Abstract: A communication interface apparatus for a system and a plurality of users is provided. The communication interface apparatus for the system and the plurality of users includes a first process unit configured to receive voice information and face information from at least one user, and determine whether the received voice information is voice information of at least one registered user based on user models corresponding to the respective received voice information and face information; a second process unit configured to receive the face information, and determine whether the at least one user's attention is on the system based on the received face information; and a third process unit configured to receive the voice information, analyze the received voice information, and determine whether the received voice information is substantially meaningful to the system based on a dialog model that represents conversation flow on a situation basis.
    Type: Application
    Filed: November 9, 2010
    Publication date: November 1, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Nam-Hoon Kim, Chi-Youn Park, Jeong-Mi Cho, Jeong-su Kim
  • Patent number: 8301441
    Abstract: A method of encoding one or more parent blocks of values, the number of values being the length of each block, the method comprising for each parent block: (a) determining a first sum of values in the parent block; (b) splitting the parent block into smaller subblocks; (c) for at least one of the subblocks, determining a second sum of the values in the subblock, selecting a likelihood table from the plurality of likelihood tables based on said first sum of values in the parent block and encoding the second sum using the likelihood table; (d) designating each subblock a parent block; (e) carrying out steps (a), (b), (c) and (d) until at least one parent block reaches a predetermined condition.
    Type: Grant
    Filed: June 5, 2009
    Date of Patent: October 30, 2012
    Assignee: Skype
    Inventor: Koen Bernard Vos
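    Steps (a) to (e) of the abstract above can be sketched as a recursive split that emits, for each parent block, the pair (parent sum, first-subblock sum). In the described method the parent sum selects the likelihood table used to entropy-code the subblock sum; the entropy coder itself is omitted here. The sketch assumes non-negative integer values, a split into two halves, and a stopping condition of "length 1 or sum 0". The names split_symbols and rebuild are hypothetical.

```python
def split_symbols(block):
    """Return (parent_sum, left_sum) pairs in the order they would be coded."""
    symbols = []

    def visit(values):
        total = sum(values)                   # (a) sum of the parent block
        if len(values) == 1 or total == 0:    # assumed stopping condition
            return
        mid = len(values) // 2                # (b) split into two subblocks
        left, right = values[:mid], values[mid:]
        symbols.append((total, sum(left)))    # (c) table key and coded value
        visit(left)                           # (d), (e) recurse on subblocks
        visit(right)

    visit(list(block))
    return symbols


def rebuild(total, length, symbols):
    """Invert split_symbols given the top-level sum and block length."""
    it = iter(symbols)

    def visit(total, length):
        if length == 1:
            return [total]
        if total == 0:
            return [0] * length
        parent_sum, left_sum = next(it)
        assert parent_sum == total
        mid = length // 2
        # The right subblock sum is implied: total - left_sum.
        return visit(left_sum, mid) + visit(total - left_sum, length - mid)

    return visit(total, length)


block = [0, 2, 1, 0, 0, 0, 3, 1]
symbols = split_symbols(block)
restored = rebuild(sum(block), len(block), symbols)
```

    Only one subblock sum per split needs to be coded, because the other follows from the parent sum; all-zero subblocks cost no symbols at all.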
  • Publication number: 20120269332
    Abstract: A method is provided for encoding multiple microphone signals into a composite source-separable audio (SSA) signal, conducive for transmission over a voice network. The embodiments enable the processing of source separation of the target voice signal from its ambient sound to be performed at any point in the voice communication network, including the internet cloud. A multiplicity of processing is possible over the SSA signal, based on the intended voice application. The level of processing is adapted with the availability of the processing power at the chosen processing node in the network in one embodiment. An apparatus for separating out the target source voice from its ambient sound is also provided. The apparatus includes a directed source separation (DSS) unit, which processes the two virtual microphone signals in the SSA representation, to generate a new SSA signal including the enhanced target voice and the enhanced ambient noise.
    Type: Application
    Filed: April 20, 2012
    Publication date: October 25, 2012
    Inventor: Shridhar K. Mukund