For Storage Or Transmission Patents (Class 704/201)
  • Patent number: 8296138
    Abstract: A system and method of updating automatic speech recognition parameters on a mobile device are disclosed. The method comprises storing user account-specific adaptation data associated with ASR on a computing device associated with a wireless network, generating new ASR adaptation parameters based on transmitted information from the mobile device when a communication channel between the computing device and the mobile device becomes available, and transmitting the new ASR adaptation data to the mobile device when a communication channel between the computing device and the mobile device becomes available. With the new ASR adaptation data, the mobile device recognizes user utterances more accurately.
    Type: Grant
    Filed: November 22, 2011
    Date of Patent: October 23, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Sarangarajan Parthasarathy, Richard Cameron Rose
  • Patent number: 8296132
    Abstract: The disclosure provides a method for noise generation, including: determining an initial value of a reconstructed parameter; determining a random value range based on the initial value of the reconstructed parameter; taking a value in the random value range randomly as a reconstructed noise parameter; and generating noise by using the reconstructed noise parameter. The disclosure also provides an apparatus for noise generation.
    Type: Grant
    Filed: March 26, 2010
    Date of Patent: October 23, 2012
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Deming Zhang, Jinliang Dai
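The four steps in the abstract of 8296132 can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how the random value range is derived or what kind of noise is produced, so the symmetric `spread` rule, the Gaussian noise source, and all function names here are hypothetical.

```python
import random


def random_value_range(initial_value, spread=0.25):
    """Assumed rule: a symmetric range of +/- spread * |initial_value|."""
    delta = abs(initial_value) * spread
    return initial_value - delta, initial_value + delta


def reconstruct_noise_parameter(initial_value, rng, spread=0.25):
    """Draw the reconstructed noise parameter uniformly from the range."""
    low, high = random_value_range(initial_value, spread)
    return rng.uniform(low, high)


def generate_noise(initial_value, num_samples, rng=None, spread=0.25):
    """Generate noise scaled by the randomly chosen parameter.

    Treating the parameter as a gain on unit-variance Gaussian noise
    is an assumption made for this sketch.
    """
    if rng is None:
        rng = random.Random()
    gain = reconstruct_noise_parameter(initial_value, rng, spread)
    samples = [gain * rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    return gain, samples
```

Passing a seeded `random.Random` makes the draw repeatable, which is convenient when comparing frames during testing.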
  • Publication number: 20120265523
    Abstract: An audio coding terminal and method is provided. The terminal includes a coding mode setting unit to set an operation mode, from plural operation modes, for input audio coding by a codec, configured to code the input audio based on the set operation mode such that when the set operation mode is a high frame erasure rate (FER) mode the codec codes a current frame of the input audio according to a selected frame erasure concealment (FEC) mode of one or more FEC modes. Upon the setting of the operation mode to be the High FER mode, the one FEC mode is selected, from the one or more FEC modes predetermined for the High FER mode, to control the codec by incorporating redundancy within a coding of the input audio or as redundancy information separate from the coded input audio according to the selected one FEC mode.
    Type: Application
    Filed: April 10, 2012
    Publication date: October 18, 2012
    Applicant: Samsung Electronics Co., LTD.
    Inventors: Steven Craig GREER, Hosang Sung
  • Publication number: 20120265522
    Abstract: Methods and apparatus for coordinating audio data processing and network communication processing in a communication device by using time scaling for either inbound or outbound audio data processing, or both. In particular, time scaling of audio data is used to adapt timing for audio data processing to timing for modem processing, by dynamically adjusting a collection of audio samples to fit the container size required by the modem. Speech quality can be preserved while recovering and/or maintaining correct synchronization between audio processing and communication processing circuits. In an example method, it is determined that a completion time for processing a first audio data frame falls outside a pre-determined timing window. Responsive to this determination, a subsequent audio data frame is time-scaled to control the completion time for processing the subsequent audio data frame.
    Type: Application
    Filed: April 15, 2011
    Publication date: October 18, 2012
    Inventors: Jan Fex, Béla Rathonyi, Jonas Lundbäck
  • Publication number: 20120259623
    Abstract: A system and method of operating an automatic speech recognition application over an Internet Protocol network is disclosed. The ASR application communicates over a packet network such as an Internet Protocol network or a wireless network. A grammar for recognizing received speech from a user over the IP network is selected from a plurality of grammars according to a user-selected application. A server receives information representing speech over the IP network, performs speech recognition using the selected grammar, and returns information based upon the recognized speech. Sub-grammars may be included within the grammar to recognize speech from sub-portions of a dialog with the user.
    Type: Application
    Filed: June 19, 2012
    Publication date: October 11, 2012
    Applicant: AT&T Intellectual Properties II, L.P.
    Inventors: Pamela Leigh Dragosh, David Bjorn Roe, Robert Douglas Sharp
  • Publication number: 20120259622
    Abstract: Disclosed is an audio encoding device which removes unnecessary inter-channel parameters from the subject to be encoded, thereby improving the encoding efficiency. In this audio encoding device, a principal component analysis unit (301) converts an inputted left signal {Lsb(f)} and an inputted right signal {Rsb(f)} into a principal component signal {PCsb(f)} and an ambient signal {Asb(f)} and calculates, for each sub-band, a rotation angle which indicates the degree of conversion; a monophonic encoding unit (303) encodes the principal component signal {PCsb(f)}; a rotation angle encoding unit (302) encodes the angle of rotation {θb}; a local monophonic decoding unit (603) creates a decoded principal component signal; and a redundant parameter elimination unit (604) identifies the redundant parameters by analyzing the encoding quality of the decoded principal component signal and eliminates the redundant parameters from the signal to be encoded.
    Type: Application
    Filed: December 27, 2010
    Publication date: October 11, 2012
    Applicant: PANASONIC CORPORATION
    Inventors: Zongxian Liu, Kok Seng Chong
  • Patent number: 8285543
    Abstract: An audio signal is conveyed more efficiently by transmitting or recording a baseband of the signal with an estimated spectral envelope and a noise-blending parameter derived from a measure of the signal's noise-like quality. The signal is reconstructed by translating spectral components of the baseband signal to frequencies outside the baseband, adjusting phase of the regenerated components to maintain phase coherency, adjusting spectral shape according to the estimated spectral envelope, and adding noise according to the noise-blending parameter. Preferably, the transmitted or recorded signal also includes an estimated temporal envelope that is used to adjust the temporal shape of the reconstructed signal.
    Type: Grant
    Filed: January 24, 2012
    Date of Patent: October 9, 2012
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Michael Mead Truman, Mark Stuart Vinton
  • Patent number: 8285544
    Abstract: The invention relates to a method for generating a vector quantization dictionary for a signal, of the type comprising a statistical analysis of driving vectors representing the signal determining a finite set of code vectors which are proxies for the said driving vectors, wherein the method also comprises modifying the said finite set of code vectors to impose a minimum distance between the modified code vectors two by two, these modified code vectors forming the said dictionary.
    Type: Grant
    Filed: March 9, 2007
    Date of Patent: October 9, 2012
    Assignee: France Telecom
    Inventors: Stéphane Ragot, Cyril Guillaume
  • Publication number: 20120250913
    Abstract: Electronic devices and accessories are provided that may communicate over wired communications paths. The electronic devices may be portable electronic devices such as cellular telephones or media players and may have audio connectors such as 3.5 mm audio jacks. The accessories may be headsets or other equipment having mating 3.5 mm audio plugs and speakers for playing audio. Microphones may be included in an accessory to gather voice signals and noise cancellation signals. Analog-to-digital converter circuitry in the accessory may digitize the microphone signals. Digital voice signals and voice noise cancellation signals can be transmitted over the communications path and processed by audio digital signal processor circuitry in an electronic device. Digital-to-analog converter circuitry in the accessory may convert digital audio signals to analog speaker signals.
    Type: Application
    Filed: June 12, 2012
    Publication date: October 4, 2012
    Inventors: Wendell B. Sander, Jeffrey J. Terlizzi, Brian Sander, David Tupman, Barry Corlett
  • Publication number: 20120253794
    Abstract: A method of converting speech from the characteristics of a first voice to the characteristics of a second voice, the method comprising: receiving a speech input from a first voice; dividing said speech input into a plurality of frames; mapping the speech from the first voice to a second voice; and outputting the speech in the second voice, wherein mapping the speech from the first voice to the second voice comprises deriving kernels demonstrating the similarity between speech features derived from the frames of the speech input from the first voice and stored frames of training data for said first voice, the training data corresponding to different text to that of the speech input, and wherein the mapping step uses a plurality of kernels derived for each frame of input speech with a plurality of stored frames of training data of the first voice.
    Type: Application
    Filed: August 25, 2011
    Publication date: October 4, 2012
    Applicant: Kabushiki Kaisha Toshiba
    Inventors: Byung Ha CHUN, Mark John Francis GALES
  • Publication number: 20120253795
    Abstract: An audio commenting and publishing system including a storage database, media content and a computing device all coupled together via a network. The computing device comprises a processor and an application executed by the processor configured to input audio data that a user wishes to associate with the media content from an audio recording mechanism or a memory device. The application is then able to store the audio data on the storage database and use the network address of the audio data along with the network address of the media content to publish the audio data and the media content such that a viewer is able to hear and access them concurrently at a network-accessible location.
    Type: Application
    Filed: March 30, 2012
    Publication date: October 4, 2012
    Inventor: Christopher C. Andrews
  • Patent number: 8280730
    Abstract: A method (400, 600, 700) and apparatus (220) for enhancing the intelligibility of speech emitted into a noisy environment. After filtering (408) ambient noise with a filter (304) that simulates the physical blocking of noise by at least a part of a voice communication device (102), a frequency dependent SNR of received voice audio relative to ambient noise is computed (424) on a perceptual (e.g. Bark) frequency scale. Formants are identified (426, 600, 700) and the SNR in bands including certain formants is modified (508, 510) with formant enhancement gain factors in order to improve intelligibility. A set of high pass filter gains (338) is combined (516) with the formant enhancement gain factors, yielding combined gains which are clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532) and used to reconstruct (532, 534) an audio signal.
    Type: Grant
    Filed: May 25, 2005
    Date of Patent: October 2, 2012
    Assignee: Motorola Mobility LLC
    Inventors: Jianming J. Song, John C. Johnson
  • Patent number: 8280734
    Abstract: Generally, methods for titling segments of recorded audio data are disclosed herein. An input from a voice activation module, a push button input or another user interface can provide a stimulus for a system or device to record title information. The title information can be received as an utterance, converted to text, and linked to a segment or body of recorded audio. A speech to text converter can perform the conversion from audio to text and the text can be displayed to a user. Then, the system can request and accept a confirmation from the user that the title information reflects the user's desires. In a recording retrieval mode, the system can display a plurality of titles with textual characters that represent a lingual translation of the title to the user and prompt the user for a user selection of a title. After such a selection is made, the recorded audio can be retrieved from memory and played back to the user over speakers or headphones.
    Type: Grant
    Filed: August 16, 2006
    Date of Patent: October 2, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Stewart J. Hyman, Stephen J. Watt
  • Patent number: 8280727
    Abstract: A voice band expansion device includes a time-frequency converter that calculates a frequency spectrum of a voice signal having a first frequency band; a separator that extracts, from the frequency spectrum, an envelope amplitude spectrum, a periodic amplitude spectrum, and a random amplitude spectrum; an envelope amplitude spectrum band expander that expands a frequency band to a second frequency band that is different from the first frequency band; a periodic amplitude spectrum band expander that expands a frequency band to the second frequency band; a random amplitude spectrum band expander that expands a frequency band of the random amplitude spectrum to the second frequency band; a broadband spectrum calculator that calculates a broadband frequency spectrum having the first frequency band and the second frequency band; and a frequency-time converter that generates a voice signal having the first frequency band and the second frequency band.
    Type: Grant
    Filed: May 11, 2010
    Date of Patent: October 2, 2012
    Assignee: Fujitsu Limited
    Inventors: Kaori Endo, Takeshi Otani, Taro Togawa, Yasuji Ota
  • Patent number: 8280744
    Abstract: An audio decoder for decoding a multi-audio-object signal having an audio signal of a first type and an audio signal of a second type encoded therein is described, the multi-audio-object signal having a downmix signal and side information, the side information having level information of the audio signals of the first and second types in a first predetermined time/frequency resolution, and a residual signal specifying residual level values in a second predetermined time/frequency resolution, the audio decoder having a processor for computing prediction coefficients based on the level information; and an up-mixer for up-mixing the downmix signal based on the prediction coefficients and the residual signal to obtain a first up-mix audio signal approximating the audio signal of the first type and/or a second up-mix audio signal approximating the audio signal of the second type.
    Type: Grant
    Filed: October 17, 2008
    Date of Patent: October 2, 2012
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.
    Inventors: Oliver Hellmuth, Johannes Hilpert, Leonid Terentiev, Cornelia Falch, Andreas Hoelzer, Juergen Herre
  • Patent number: 8275611
    Abstract: An apparatus for adaptively suppressing noise in an input signal frequency spectrum derived from overlapping input frames is provided. The system includes a psychoacoustic power computation module configured to compute a noisy signal power in psychoacoustic bands, a voice activity scoring module configured to compute a probabilistic score for a presence of a speech, and a noise estimation module configured to estimate a noise power in the psychoacoustic bands based on information of past frames, the probabilistic score, and the computed noisy signal power. The system also includes a gain computation module configured to compute a gain for each frequency, based on a probabilistic heuristic, the probabilistic score and the information on the past frames, and a gain post-processing module configured to perform a gain time smoothing, a gain frequency smoothing, and a gain regulation for the computed gain.
    Type: Grant
    Filed: January 18, 2008
    Date of Patent: September 25, 2012
    Assignee: STMicroelectronics Asia Pacific Pte., Ltd.
    Inventors: Wenbo Zong, Yuan Wu, Sapna George
  • Publication number: 20120239386
    Abstract: In the field of communications, a method and a device for determining a decoding mode of in-band signaling are provided, which improve accuracy of in-band signaling decoding. The method includes: calculating a probability of each decoding mode of in-band signaling of a received signal at a predetermined moment by using a posterior probability algorithm; and from the calculated probabilities of the decoding modes, selecting a decoding mode having a maximum probability value as a decoding mode of the in-band signaling of the received signal at the predetermined moment. The method and the device are mainly used in a process for determining a decoding mode of in-band signaling in a speech frame transmission process.
    Type: Application
    Filed: June 1, 2012
    Publication date: September 20, 2012
    Applicant: Huawei Device Co., Ltd.
    Inventors: Nian Peng, Congli Mao, Zhiqun Chen, Nian Chen
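The selection rule in the abstract of 20120239386 (compute a probability for each decoding mode, then pick the maximum) can be illustrated with a short sketch. The abstract does not specify the posterior probability algorithm, so the plain Bayes update below, the dictionary-based interface, and the mode names in the usage note are assumptions made for illustration.

```python
def posterior_probabilities(priors, likelihoods):
    """Bayes rule: posterior is proportional to prior * likelihood.

    priors and likelihoods are dicts keyed by decoding mode. How the
    likelihoods are measured from the received signal is not covered
    by this sketch.
    """
    unnormalized = {mode: priors[mode] * likelihoods[mode] for mode in priors}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all decoding modes have zero probability")
    return {mode: value / total for mode, value in unnormalized.items()}


def select_decoding_mode(priors, likelihoods):
    """Return the decoding mode with the maximum posterior probability."""
    posteriors = posterior_probabilities(priors, likelihoods)
    best_mode = max(posteriors, key=posteriors.get)
    return best_mode, posteriors
```

For example, with equal priors for two hypothetical modes `"mode_a"` and `"mode_b"` and likelihoods of 0.2 and 0.6, the sketch selects `"mode_b"`.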
  • Publication number: 20120236914
    Abstract: Methods and systems for communicating data on a cellular telephone voice channel are disclosed. The method includes segmenting a data stream into one or more n-bit symbols; identifying a human vocal sound corresponding to each n-bit symbol according to a predetermined assignment of each n-bit symbol to a human vocal sound; and retrieving data representing the human vocal sound, wherein data representing the human vocal sound is configured to be passed through a vocoder.
    Type: Application
    Filed: August 26, 2009
    Publication date: September 20, 2012
    Inventor: Gerhard Wessels
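The segmentation and lookup steps described in the abstract of 20120236914 might look like the sketch below. The zero-padding of the final symbol, the 2-bit table, and the sound labels are hypothetical; the abstract only states that each n-bit symbol has a predetermined assignment to a human vocal sound.

```python
def segment_into_symbols(data, n):
    """Split a byte string into n-bit symbols, most significant bit first.

    Assumption: the final symbol is zero-padded when the bit count is
    not a multiple of n.
    """
    bits = "".join(format(byte, "08b") for byte in data)
    remainder = len(bits) % n
    if remainder:
        bits += "0" * (n - remainder)
    return [int(bits[i:i + n], 2) for i in range(0, len(bits), n)]


# Hypothetical predetermined assignment for n = 2.
SOUND_TABLE = {0: "ah", 1: "ee", 2: "oh", 3: "oo"}


def symbols_to_sounds(symbols, table):
    """Look up the vocal-sound label assigned to each symbol."""
    return [table[symbol] for symbol in symbols]
```

In a real system the table would hold audio data for each sound rather than a text label, so that the retrieved data can be passed through the vocoder.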
  • Patent number: 8271273
    Abstract: In order to achieve the best improvement of ITU G.711 related codec perceptual quality, perceptual weighting controlling parameter(s) should be at least adaptive to relative quantization error statistics or adaptive to signal level. When the relative quantization error statistics are larger or the signal level is lower, the perceptual weighting should be “stronger”, which means ? in (5) is smaller; when the relative quantization error statistics are smaller or the signal level is larger, the perceptual weighting should be “weaker”, which means ? in (5) is larger.
    Type: Grant
    Filed: September 2, 2008
    Date of Patent: September 18, 2012
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Yang Gao
  • Patent number: 8271268
    Abstract: A method, system and computer-readable medium for generating, caching and transmitting textual equivalents of information contained in an audio signal are presented. The method includes converting at least a portion of a speech-based audio signal in one device into a textual equivalent, storing a portion of the textual equivalent in the first device's memory and transmitting the stored textual equivalent to another device.
    Type: Grant
    Filed: April 18, 2007
    Date of Patent: September 18, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Susann M. Keohane, Gerald F. McBrearty, Shawn P. Mullen, Jessica C. Murillo, Johnny Meng-Han Shieh
  • Patent number: 8271267
    Abstract: Provided are a scalable wide-band speech coding/decoding apparatus, method, and medium. An input wide-band speech signal is first divided into a low-band signal and a high-band signal. The divided low-band signal is then coded using a code excited linear prediction (CELP) method. The divided high-band signal is coded using a harmonic method. A signal representing a difference between a synthetic signal obtained from the low-band and the high-band, and a signal input to the low-band and the high-band is then coded using a modified discrete cosine transform (MDCT) method. The coded signal is then multiplexed. The multiplexed signal is then output. Accordingly, high quality speech can be achieved for all layers.
    Type: Grant
    Filed: July 21, 2006
    Date of Patent: September 18, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hosang Sung, Sangwook Kim, Rakesh Taori, Kangeun Lee
  • Patent number: 8271293
    Abstract: Provided are, among other things, systems, methods and techniques for decoding an audio signal from a frame-based bit stream. Each frame includes processing information pertaining to the frame and entropy-encoded quantization indexes representing audio data within the frame. The processing information includes: (i) code book indexes, (ii) code book application information specifying ranges of entropy-encoded quantization indexes to which the code books are to be applied, and (iii) window information. The entropy-encoded quantization indexes are decoded by applying the identified code books to the corresponding ranges of entropy-encoded quantization indexes. Subband samples are then generated by dequantizing the decoded quantization indexes, and a sequence of different window functions that were applied within a single frame of the audio data is identified based on the window information.
    Type: Grant
    Filed: March 28, 2011
    Date of Patent: September 18, 2012
    Assignee: Digital Rise Technology Co., Ltd.
    Inventor: Yuli You
  • Patent number: 8271275
    Abstract: A scalable encoding device capable of reducing an encoding rate to reduce a circuit scale while preventing sound quality deterioration of a decoded signal. An extension layer is coarsely divided into a system for processing a first channel and a system for processing a second channel. A sound source predictor for processing the first channel predicts a drive sound source signal of the first channel from a drive sound source signal of a monaural signal, and outputs the predicted drive sound source signal through a multiplier to a first CELP encoder. A sound source predictor for processing the second channel predicts the drive sound source signal of the second channel from the drive sound source signal of the monaural signal and the output from the first CELP encoder, and outputs the predicted drive sound source signal through a multiplier to a second CELP encoder.
    Type: Grant
    Filed: May 29, 2006
    Date of Patent: September 18, 2012
    Assignee: Panasonic Corporation
    Inventors: Michiyo Goto, Koji Yoshida
  • Patent number: 8265941
    Abstract: A method for decoding an audio signal comprises receiving a combined downmix, a combined object information, and a mix information, the combined downmix being generated using at least two downmix signals, the combined object information being made by combination of at least two sets of object information, generating a downmix processing information using the combined object information and the mix information, and processing the combined downmix using the downmix processing information. The method and an apparatus for decoding an audio signal comprising the combined downmix and the combined object information can control object gain and output in a remote conference and so on. By using the combined object information, the method and the apparatus for decoding an audio signal that contains multi-object signals are fast and efficient, reducing processing time and computing resources and thereby relieving resource requirements such as wide bandwidth.
    Type: Grant
    Filed: December 6, 2007
    Date of Patent: September 11, 2012
    Assignee: LG Electronics Inc.
    Inventors: Hyen O Oh, Yang Won Jung
  • Patent number: 8265940
    Abstract: A method for the artificial extension of the bandwidth of speech signals involves: a) Provision of a wideband input speech signal (swbi(k)); b) Determination of the signal components (seb(k)) of the wideband input speech signal (swbi(k)) required for the bandwidth extension from an extension band from the wideband input speech signal (swbi(k)); c) Determination of the temporal envelopes of the signal components (seb(k)) determined for the bandwidth extension; d) Determination of the spectral envelopes of the signal components (seb(k)) determined for bandwidth extension; e) Encoding of the information for the temporal envelopes and the spectral envelopes, and provision of the encoded information by carrying out the extension of the bandwidth; f) Decoding of the encoded information and generation of the temporal envelopes and the spectral envelopes from the encoded information for the production of a bandwidth-extended output speech signal (swbo(k)).
    Type: Grant
    Filed: June 30, 2006
    Date of Patent: September 11, 2012
    Assignee: Siemens Aktiengesellschaft
    Inventors: Bernd Geiser, Peter Jax, Stefan Schandl, Herve Taddei, Aulis Telle, Peter Vary
  • Patent number: 8265930
    Abstract: The present invention relates to recording voice data using a voice communication device connected to a communication network and converting the voice data into a text file for delivery to a text communication device. In accordance with the present invention, the voice communication device may transfer the voice data in real-time or store the voice data on the device to be transmitted at a later time. Transcribing the voice data into a text file may be accomplished by automated computer software, either speaker-independent or speaker-dependent, or by a human who transcribes the voice data into a text file. After transcribing the voice data into a text file, the text file may be delivered to a text communication device in a number of ways, such as email, file transfer protocol (FTP), or hypertext transfer protocol (HTTP).
    Type: Grant
    Filed: April 13, 2005
    Date of Patent: September 11, 2012
    Assignee: Sprint Communications Company L.P.
    Inventors: Bryce A. Jones, Raymond Edward Dickensheets
  • Patent number: 8265935
    Abstract: Device independent Media Processing Extension (MPX) data, corresponding to media data, may be decoded by a media rendering device and may be utilized to determine and/or execute processing steps and/or processing parameters for processing the media data. During the processing and/or rendering, processing steps and/or parameters may be dynamically determined and/or adjusted. A user preference profile, media rendering device profile and/or media rendering environment profile may be utilized to generate, store and/or restore MPX data. Furthermore, MPX data that may be input by a user, manufacturer or a vendor may be stored in a plurality of ways, for example, within a media data file, an external file and/or within an MTP or PTP object property associated with the media data. The media data may comprise one or more of video data, still image data and audio data, for example.
    Type: Grant
    Filed: July 8, 2008
    Date of Patent: September 11, 2012
    Inventor: Scott Krig
  • Publication number: 20120226494
    Abstract: A digital broadcast transmitting device is described that includes a packet generation unit configured to generate packetized elementary stream (PES) data by converting an inputted voice signal into an encoded voice signal and generating a voice stream packet including the encoded voice signal; a descriptor updating unit configured to update a component descriptor to include a component type identification (ID) and a change reservation ID, the component type ID indicating that an encoding format of the encoded voice signal is MPEG surround format and the change reservation ID indicating a change of a format of the encoded voice signal to the MPEG surround format; a packetizing unit configured to generate section data by packetizing the component descriptor; a multiplexing unit configured to multiplex the PES data and the section data; and a modulation unit configured to modulate and transmit multiplexed data acquired from the multiplexing unit.
    Type: Application
    Filed: February 29, 2012
    Publication date: September 6, 2012
    Applicant: PANASONIC CORPORATION
    Inventor: Naoki EJIMA
  • Patent number: 8260607
    Abstract: Encoding an audio signal is provided wherein the audio signal includes a first audio channel and a second audio channel, the encoding comprising subband filtering each of the first audio channel and the second audio channel in a complex modulated filterbank to provide a first plurality of subband signals for the first audio channel and a second plurality of subband signals for the second audio channel, downsampling each of the subband signals to provide a first plurality of downsampled subband signals and a second plurality of downsampled subband signals, further subband filtering at least one of the downsampled subband signals in a further filterbank in order to provide a plurality of sub-subband signals, deriving spatial parameters from the sub-subband signals and from those downsampled subband signals that are not further subband filtered, and deriving a single channel audio signal comprising derived subband signals derived from the first plurality of downsampled subband signals and the second plurality of
    Type: Grant
    Filed: March 30, 2011
    Date of Patent: September 4, 2012
    Assignees: Koninklijke Philips Electronics, N.V., Dolby International AB
    Inventors: Lars Falck Villemoes, Per Ekstrand, Heiko Purnhagen, Erik Gosuinus Petrus Schuijers, Fransiscus Marinus Jozephus De Bont
  • Patent number: 8260613
    Abstract: A double talk detector for controlling the echo path estimation in a telecommunication system by indicating when a received coded speech signal is dominated by a non-echo signal; i.e., that so-called double talk exists. This is determined by extracting LSPs from a coded speech frame of the received coded speech signal when the signal power exceeds a first threshold value, converting each of said extracted LSPs into LSFs, and calculating the distance between each two adjacent LSFs. For each distance that is smaller than a second threshold, a spectral peak is located between the two LSFs, and it is determined whether said spectral peak is an echo or not. When a predetermined number of non-echo spectral peaks are located in the received speech signal, double talk will be indicated, and the echo path estimation may be disabled.
    Type: Grant
    Filed: February 21, 2007
    Date of Patent: September 4, 2012
    Assignee: Telefonaktiebolaget L M Ericsson (Publ)
    Inventor: Tonu Trump
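The decision flow in the abstract of 8260613 can be sketched as below. Several details are assumptions rather than anything the abstract states: that the LSPs are in the cosine domain (so an LSF is the arccosine of the LSP), that a peak is placed at the midpoint of two close LSFs, and that a peak counts as echo when it lies near a known far-end peak frequency. All names and thresholds are hypothetical.

```python
import math


def lsp_to_lsf(lsps, sample_rate=8000.0):
    """Convert LSPs to LSFs in Hz, assuming q = cos(omega) for each LSP."""
    return sorted(math.acos(q) * sample_rate / (2.0 * math.pi) for q in lsps)


def find_spectral_peaks(lsfs, max_gap_hz):
    """Place a peak midway between adjacent LSFs closer than max_gap_hz."""
    return [(a + b) / 2.0 for a, b in zip(lsfs, lsfs[1:]) if b - a < max_gap_hz]


def detect_double_talk(lsps, frame_power, far_end_peaks, power_threshold,
                       max_gap_hz, echo_tolerance_hz, min_non_echo_peaks):
    """Return True when enough non-echo spectral peaks are found."""
    # First threshold: only analyze frames with sufficient signal power.
    if frame_power <= power_threshold:
        return False
    peaks = find_spectral_peaks(lsp_to_lsf(lsps), max_gap_hz)
    # Assumed echo test: a peak near a far-end peak is treated as echo.
    non_echo = [
        peak for peak in peaks
        if all(abs(peak - ref) > echo_tolerance_hz for ref in far_end_peaks)
    ]
    return len(non_echo) >= min_non_echo_peaks
```

When the function returns True, a caller following the abstract would typically disable the echo path estimation for that frame.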
  • Patent number: 8259910
    Abstract: A transcribing method may include receiving an audio message from a customer via a telephone, determining whether one of the agent transcribers is available, storing the audio message when an agent transcriber is not available, continuing to determine whether a transcriber is available, streaming in real time a streamed portion of the audio message to a first available agent transcriber for facilitating the transcription of the streamed portion of the audio message into a first portion of a transcription text file, providing subsequently a pre-streamed recorded portion of the audio message to a subsequently available second agent transcriber for facilitating the transcription of the pre-streamed recorded portion of the audio message into a second portion of the transcription text file while the streaming in real time is continuing with the first agent transcriber, and combining the first and second portions of the transcription text file into a consolidated text file.
    Type: Grant
    Filed: March 14, 2008
    Date of Patent: September 4, 2012
    Assignee: VoiceCloud
    Inventors: Sammy S. Afifi, Gerald J. Marolda, III
  • Patent number: 8260606
    Abstract: A basic idea of the invention is to ascertain information on the course of the bit rate switching during an active speech phase. According to the invention, during the speech phase, information on the percentage proportion of broadband active speech frames in comparison to narrowband active speech frames is compiled on the part of the decoder. A high percentage proportion of broadband active speech frames indicates that a broadband use is preferred on the part of the codec and therefore a need exists for synthesizing noise information in broadband form during a DTX phase.
    Type: Grant
    Filed: February 2, 2009
    Date of Patent: September 4, 2012
    Assignee: Siemens Enterprise Communications GmbH & Co. KG
    Inventors: Panji Setiawan, Stefan Schandl, Herve Taddei
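    Illustration (not taken from the patent): the decision described in the abstract above amounts to tracking the share of broadband frames among the active speech frames. The labels "wb" and "nb", the function name, and the 50% default threshold are assumptions chosen for this sketch.

```python
def prefer_wideband_noise(frame_bandwidths, min_wideband_share=0.5):
    """Decide whether comfort noise should be synthesized in broadband form.

    frame_bandwidths: one label per active speech frame, "wb" (broadband)
    or "nb" (narrowband); any other label is ignored.
    min_wideband_share: hypothetical threshold on the broadband proportion.
    """
    active = [b for b in frame_bandwidths if b in ("wb", "nb")]
    if not active:
        # No active speech observed yet, so there is no evidence for broadband.
        return False
    share = sum(1 for b in active if b == "wb") / len(active)
    return share >= min_wideband_share
```

    For example, three broadband frames out of four active frames give a share of 0.75, which exceeds the assumed threshold.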
  • Patent number: 8255234
    Abstract: An audio encoder and decoder use architectures and techniques that improve the efficiency of quantization (e.g., weighting) and inverse quantization (e.g., inverse weighting) in audio coding and decoding. The described strategies include various techniques and tools, which can be used in combination or independently. For example, an audio encoder quantizes audio data in multiple channels, applying multiple channel-specific quantizer step modifiers, which give the encoder more control over balancing reconstruction quality between channels. The encoder also applies multiple quantization matrices and varies the resolution of the quantization matrices, which allows the encoder to use more resolution if overall quality is good and use less resolution if overall quality is poor. Finally, the encoder compresses one or more quantization matrices using temporal prediction to reduce the bitrate associated with the quantization matrices. An audio decoder performs corresponding inverse processing and decoding.
    Type: Grant
    Filed: October 18, 2011
    Date of Patent: August 28, 2012
    Assignee: Microsoft Corporation
    Inventors: Naveen Thumpudi, Wei-Ge Chen
  • Patent number: 8255228
    Abstract: An efficient encoded representation of a first and a second input audio signal can be derived using correlation information indicating a correlation between the first and the second input audio signals, when a signal characterization information, indicating at least a first or a second, different characteristic of the input audio signal is additionally considered. Phase information indicating a phase relation between the first and the second input audio signals is derived, when the input audio signals have the first characteristic. The phase information and a correlation measure are included into the encoded representation when the input audio signals have the first characteristic, and only the correlation information is included into the encoded representation when the input audio signals have the second characteristic.
    Type: Grant
    Filed: January 11, 2011
    Date of Patent: August 28, 2012
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventors: Johannes Hilpert, Bernhard Grill, Matthias Neusinger, Julien Robilliard, Maria Luis-Valero
  • Patent number: 8255232
    Abstract: An audio encoding method estimates better initial iteration values of global-gain and scalefactor in advance to avoid heavy calculation. The estimating process of the encoding method includes calculating the bit allocation of one frequency sample based on a sampling rate, a bit rate, and the number of audio channels according to an input frame, and the psychoacoustic model, searching one frequency sample having the greatest sample energy in each of a plurality of scalefactor bands, quantizing the frequency sample to comply with the bit allocation and to generate a corresponding scalefactor, searching a maximum scalefactor of all scalefactor bands corresponding to the input frame, and setting initial values of scalefactors and an initial value of global-gain for the quantization iterative loop process according to the corresponding scalefactor and the maximum scalefactor.
    Type: Grant
    Filed: July 30, 2008
    Date of Patent: August 28, 2012
    Assignee: RealTek Semiconductor Corp.
    Inventor: Wen-Haw Wang
  • Patent number: 8255206
    Abstract: The voice mixing method includes a first step for selecting voice information from a plurality of voice information, a second step for adding up all the selected voice information, a third step for obtaining a voice signal totaling the voice signals other than one voice signal, of the selected voice signals, a fourth step for encoding the voice information obtained in the second step, a fifth step for encoding the voice signal obtained in the third step, and a sixth step for copying the encoded information obtained in the fourth step into the encoded information in the fifth step.
    Type: Grant
    Filed: August 28, 2007
    Date of Patent: August 28, 2012
    Assignee: NEC Corporation
    Inventors: Hironori Ito, Kazunori Ozawa
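    Illustration (not taken from the patent): the second and third steps of the abstract above, summing all selected signals and then forming, for each speaker, the sum of everyone else, can be sketched as follows. The function and variable names are assumptions, and the encoding and copying steps are omitted.

```python
def mix_minus(selected):
    """Compute the full mix and one "everyone but me" mix per speaker.

    selected: dict mapping a speaker id to a list of samples; all lists
    are assumed to have the same length.
    """
    # Full mix: add up all selected signals sample by sample.
    total = [sum(samples) for samples in zip(*selected.values())]
    # Per-speaker mix: the full mix minus that speaker's own signal.
    per_speaker = {
        speaker: [t - s for t, s in zip(total, samples)]
        for speaker, samples in selected.items()
    }
    return total, per_speaker


# Made-up integer samples for three speakers.
total, per_speaker = mix_minus({"a": [1, 2], "b": [10, 20], "c": [100, 200]})
```

    Subtracting each speaker's signal from the full mix avoids recomputing a separate sum for every participant.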
  • Publication number: 20120215528
    Abstract: Provided is a speech recognition system, including: a first information processing device including a speech recognition processing unit for receiving data to be used for speech recognition transmitted via a network, carrying out speech recognition processing, and returning resultant data; and a second information processing device connected to the first information processing device via the network. The second information processing device performs conversion of the data into data having a format that disables a content thereof from being perceived and also enables the speech recognition processing unit to perform the speech recognition processing. Thereafter, the second information processing device transmits the data to be used for the speech recognition by the speech recognition processing unit and constructs resultant data returned from the first information processing device into a content of a valid and perceivable recognition result.
    Type: Application
    Filed: October 12, 2010
    Publication date: August 23, 2012
    Applicant: NEC CORPORATION
    Inventor: Kentaro Nagatomo
  • Publication number: 20120212337
    Abstract: An original text that is a representation of a narration of a patient encounter provided by a clinician may be received and re-formatted to produce a formatted text. One or more clinical facts may be extracted from the formatted text. A first fact of the clinical facts may be extracted from a first portion of the formatted text, and the first portion of the formatted text may be a formatted version of a first portion of the original text. A linkage may be maintained between the first fact and the first portion of the original text.
    Type: Application
    Filed: February 18, 2011
    Publication date: August 23, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: Frank Montyne, David Decraene, Joeri Van der Vloet, Johan Raedemaeker, Ignace Desimpel, Frederik Coppens, Tom Deray, James R. Flanagan, Mariana Casella dos Santos, Marnix Holvoet, Maria van Gurp, David Hellman, Girija Yegnanarayanan, Karen Anne Doyle
  • Patent number: 8249861
    Abstract: A speech enhancement system that improves the intelligibility and the perceived quality of processed speech includes a frequency transformer and a spectral compressor. The frequency transformer converts speech signals from the time domain to the frequency domain. The spectral compressor compresses a pre-selected portion of the high frequency band and maps the compressed high frequency band to a lower band limited frequency range. The speech enhancement system may be built into, may be a unitary part of, or may be configured to interface other systems that process audio or high frequency signals.
    Type: Grant
    Filed: December 22, 2006
    Date of Patent: August 21, 2012
    Assignee: QNX Software Systems Limited
    Inventors: Xueman Li, Phillip Hetherington, Alex Escott
  • Patent number: 8249860
    Abstract: Disclosed is an adaptive sound source vector quantization device capable of reducing deviation of the quantization accuracy of the adaptive sound source vector quantization of each sub-frame when performing an adaptive sound source vector quantization in a sub-frame unit by using a greater information amount in a first sub-frame than in a second sub-frame.
    Type: Grant
    Filed: December 14, 2007
    Date of Patent: August 21, 2012
    Assignee: Panasonic Corporation
    Inventors: Kaoru Sato, Toshiyuki Morii
  • Publication number: 20120209596
    Abstract: Disclosed are an encoding device and a decoding device which suppress the occurrence of pre-echo artifacts and post-echo artifacts caused by a high layer having a low temporal resolution, and which implement high subjective quality encoding and decoding. An encoding device (100) carries out scalable coding comprising a low layer, and a high layer having a lower temporal resolution than that of the low layer. A start point detection unit (or end point detection unit) (150) determines the start point (or end point) of sections of the decoded low layer signal which have audio, and when the start point (or end point) is determined, a second layer encoding unit (160) selects a bandwidth to be excluded from encoding on the basis of the spectral energy from the decoded first layer signal, excludes the selected bandwidth, and encodes an error signal.
    Type: Application
    Filed: October 19, 2010
    Publication date: August 16, 2012
    Applicant: PANASONIC CORPORATION
    Inventor: Masahiro Oshikiri
  • Patent number: 8244525
    Abstract: Embodiments of the invention provide a method and encoder for encoding a frame in a communication system. The method includes calculating a first set of parameters associated with the frame, wherein said first set of parameters comprises filter bank parameters. The method further includes selecting, in a first stage, one of a plurality of encoding methods based on the first set of parameters, calculating a second set of parameters associated with the frame, selecting, in a second stage, one of the plurality of encoding methods based on the result of the first stage selection and the second set of parameters, and encoding the frame using the encoding method selected in the second stage.
    Type: Grant
    Filed: November 22, 2004
    Date of Patent: August 14, 2012
    Assignee: Nokia Corporation
    Inventor: Jari M. Makinen
  • Patent number: 8244538
    Abstract: A system evaluates a hands free communication system. The system automatically selects a consonant-vowel-consonant (CVC), vowel-consonant-vowel (VCV), or other combination of sounds from an intelligent database. The selection is transmitted with another communication stream that temporally overlaps the selection. The quality of the communication system is evaluated through an automatic speech recognition engine. The evaluation occurs at a location remote from the transmitted selection.
    Type: Grant
    Filed: April 29, 2009
    Date of Patent: August 14, 2012
    Assignee: QNX Software Systems Limited
    Inventors: Shreyas Paranjpe, Mark Fallat
  • Patent number: 8239209
    Abstract: An apparatus for decoding a signal and method thereof are disclosed, by which the audio signal can be controlled in a manner of changing/giving spatial characteristics (e.g., listener's virtual position, virtual position of a specific source) of the audio signal. The present invention includes receiving an object parameter including level information corresponding to at least one object signal, converting the level information corresponding to the object signal to the level information corresponding to an output channel by applying a control parameter to the object parameter, and generating a rendering parameter including the level information corresponding to the output channel to control an object downmix signal resulting from downmixing the object signal.
    Type: Grant
    Filed: January 19, 2007
    Date of Patent: August 7, 2012
    Assignee: LG Electronics Inc.
    Inventors: Hyen-O Oh, Hee Suk Pang, Dong Soo Kim, Jae Hyun Lim, Yang-Won Jung
  • Publication number: 20120197633
    Abstract: A voice quality measurement device that measures voice quality of a decoded voice signal outputted from a voice decoder unit. The voice quality measurement device includes a packet buffer unit and a voice information monitoring unit. The packet buffer unit accumulates voice packets that arrive non-periodically as voice information, and outputs the voice information to the voice decoder unit periodically. The voice information monitoring unit monitors continuity of the voice information inputted to the voice decoder unit, and calculates an index of voice quality of the decoded voice signal that reflects acceptability of this continuity.
    Type: Application
    Filed: November 25, 2011
    Publication date: August 2, 2012
    Applicant: OKI ELECTRIC INDUSTRY CO., LTD.
    Inventor: Hiromi Aoyagi
  • Publication number: 20120197634
    Abstract: A voice correction device includes a detector that detects a response from a user, a calculator that calculates an acoustic characteristic amount of an input voice signal, an analyzer that outputs an acoustic characteristic amount of a predetermined amount when having acquired a response signal due to the response from the detector, a storage unit that stores the acoustic characteristic amount output by the analyzer, a controller that calculates a correction amount of the voice signal on the basis of a result of a comparison between the acoustic characteristic amount calculated by the calculator and the acoustic characteristic amount stored in the storage unit, and a correction unit that corrects the voice signal on the basis of the correction amount calculated by the controller.
    Type: Application
    Filed: December 20, 2011
    Publication date: August 2, 2012
    Applicant: FUJITSU LIMITED
    Inventors: Chisato Ishikawa, Takeshi Otani, Taro Togawa, Masanao Suzuki, Masakiyo Tanaka
  • Publication number: 20120197635
    Abstract: A method for generating an audio signal of a user is provided. According to the method, a first audio signal inside of an ear of the user and a second audio signal outside of the ear are detected. The first audio signal and the second audio signal comprise at least a voice signal component generated by the user. Depending on the first audio signal, the second audio signal is processed and output as the audio signal.
    Type: Application
    Filed: January 5, 2012
    Publication date: August 2, 2012
    Applicant: SONY ERICSSON MOBILE COMMUNICATIONS AB
    Inventor: Martin Nyström
  • Publication number: 20120185240
    Abstract: An embodiment provides a system and method for generating and sending a simplified message using speech recognition. The system provides speech recognition software that may be utilized for receiving audio, converting the audio to text, comparing the text derived from the audio to match fields to find matches, replacing matched text with the contents of the replacement fields associated with the match fields, generating an output message incorporating the replacement text into the text derived from the audio, transmitting the output message to a messaging system and redistributing the output message to recipients.
    Type: Application
    Filed: January 13, 2012
    Publication date: July 19, 2012
    Inventors: Michael D. Goller, Stuart E. Goller
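    Illustration (not taken from the patent): the match-field and replacement-field step in the abstract above can be sketched as a lookup over recognized text. The function name, the table contents, and the rules used here (case-insensitive, whole-word, longest match first) are assumptions made for this sketch.

```python
import re


def simplify_message(text, replacements):
    """Replace every match field found in text with its replacement field.

    text: text derived from audio by a speech recognizer (not shown here).
    replacements: dict mapping a match field to its replacement field.
    """
    # Try longer match fields first so that a short field does not
    # consume part of a longer phrase.
    for match_field in sorted(replacements, key=len, reverse=True):
        pattern = r"\b" + re.escape(match_field) + r"\b"
        replacement = replacements[match_field]
        # A function is used as the replacement so that backslashes in the
        # replacement field are inserted literally.
        text = re.sub(pattern, lambda m, r=replacement: r, text,
                      flags=re.IGNORECASE)
    return text


# Hypothetical replacement table and recognized text.
table = {"as soon as possible": "ASAP", "office": "HQ"}
message = simplify_message("Meet me as soon as possible at the office", table)
```

    Because the replacements are applied one after another, a replacement field that itself contains a later match field would be rewritten again; a real system would need to decide how to handle that case.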
  • Publication number: 20120185241
    Abstract: An audio decoding apparatus comprises: a plurality of decoding units; a band replicating unit which processes a decoded signal obtained when a corresponding decoding unit decodes a coded signal, according to a scheme specified by transmitted information; and an information transmitting unit which transmits, to a signal processing unit, information identifying the corresponding decoding unit from among the plurality of decoding units.
    Type: Application
    Filed: March 28, 2012
    Publication date: July 19, 2012
    Applicant: Panasonic Corporation
    Inventors: Shuji Miyasaka, Kosuke Nishio, Takeshi Norimatsu
  • Publication number: 20120185245
    Abstract: An apparatus is provided with a device storing machine readable code and a processor executing the machine readable code. The machine readable code includes sound setting code and audio processing code. The sound setting code detects use of a microphone and sets sound characteristics that are suitable for conversation in response to detecting the use of the microphone. The audio processing code processes sound on the basis of the sound characteristics set by the sound setting code.
    Type: Application
    Filed: January 10, 2012
    Publication date: July 19, 2012
    Applicant: LENOVO (SINGAPORE) PTE, LTD.
    Inventors: Shinichi Kikuchi, Hironari Nishino, Yasushi Tsukamoto