For Storage Or Transmission Patents (Class 704/201)
  • Publication number: 20120179457
    Abstract: Techniques for combining the results of multiple recognizers in a distributed speech recognition architecture. Speech data input to a client device is encoded and processed both locally and remotely by different recognizers configured to be proficient at different speech recognition tasks. The client/server architecture is configurable to enable network providers to specify a policy directed to a trade-off between reducing recognition latency perceived by a user and usage of network resources. The results of the local and remote speech recognition engines are combined based, at least in part, on logic stored by one or more components of the client/server architecture.
    Type: Application
    Filed: January 6, 2012
    Publication date: July 12, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: Michael Newman, Anthony Gillet, David Mark Krowitz, Michael D. Edgington
  • Publication number: 20120179460
    Abstract: A method for testing an automated interactive media system. The method can include establishing a communication session with the automated interactive media system. In response to receiving control and/or media information from the automated interactive media system, pre-recorded control and/or media information can be propagated to the automated interactive media system. The pre-recorded control and/or media information can be recorded in real time.
    Type: Application
    Filed: March 17, 2012
    Publication date: July 12, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: WILLIAM V. DA PALMA, BRIEN H. MUSCHETT
  • Patent number: 8219400
    Abstract: Stereo to mono voice conferencing conversion is performed during a voice conference. Conferencing equipment receives audio for right and left channels and filters each of the channels into a plurality of bands. For each band of each channel, the equipment determines an energy level and compares each energy level for each band of the right channel to each energy level for each corresponding band of the left channel. Based on the comparison, the equipment determines which channel has more audio resulting from speech. Based on the determination, the equipment adjusts delivery of the audio from the right and left channels to a mono channel for transmission to endpoints only capable of mono audio in the voice conference.
    Type: Grant
    Filed: November 21, 2008
    Date of Patent: July 10, 2012
    Assignee: Polycom, Inc.
    Inventor: Peter L. Chu
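The band-by-band comparison in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: it assumes the band filtering has already been done upstream, and the majority vote over bands, the function names, and the `favored_weight` value are all hypothetical choices made for the example.

```python
def band_energy(band):
    """Energy of one band-limited signal: the sum of squared samples."""
    return sum(s * s for s in band)


def speech_dominant_channel(left_bands, right_bands):
    """Compare corresponding bands and return 'left', 'right', or 'tie'.

    Assumption: the channel that wins more bands carries more speech.
    """
    left_wins = right_wins = 0
    for left_band, right_band in zip(left_bands, right_bands):
        e_left, e_right = band_energy(left_band), band_energy(right_band)
        if e_left > e_right:
            left_wins += 1
        elif e_right > e_left:
            right_wins += 1
    if left_wins > right_wins:
        return "left"
    if right_wins > left_wins:
        return "right"
    return "tie"


def mix_to_mono(left, right, left_bands, right_bands, favored_weight=0.8):
    """Mix two channels to mono, weighting the speech-dominant channel more."""
    dominant = speech_dominant_channel(left_bands, right_bands)
    if dominant == "left":
        w_left = favored_weight
    elif dominant == "right":
        w_left = 1.0 - favored_weight
    else:
        w_left = 0.5  # no clear winner: plain average
    w_right = 1.0 - w_left
    return [w_left * l + w_right * r for l, r in zip(left, right)]
```

A real system would presumably smooth the decision over time so that the mix does not jump between channels from one block to the next.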
  • Patent number: 8219388
    Abstract: A sensation of presence of voice chat in a virtual space is enhanced. A user speech synthesizer is used in a virtual space sharing system where information processing devices share the virtual space. The user speech synthesizer comprises a speech data acquiring section (60) for acquiring speech data representing a speech uttered by the user of one of the information processing devices, an environment sound storage section (66) for storing an environment sound associated with one or more regions defined in the virtual space, a region specifying section (64) for specifying a region corresponding to the user in the virtual space, and an environment sound synthesizing section (68) for acquiring the environment sound associated with the specified region from the environment sound storage section (66) and combining the acquired environment sound and the speech data to produce synthesized speech data.
    Type: Grant
    Filed: June 7, 2006
    Date of Patent: July 10, 2012
    Assignee: Konami Digital Entertainment Co., Ltd.
    Inventors: Hiromasa Kaneko, Masaki Takeuchi
  • Patent number: 8219389
    Abstract: A speech enhancement system that improves the intelligibility and the perceived quality of processed speech includes a frequency transformer and a spectral compressor. The frequency transformer converts speech signals from the time domain to the frequency domain. The spectral compressor compresses a pre-selected portion of the high frequency band and maps the compressed high frequency band to a lower band limited frequency range.
    Type: Grant
    Filed: December 23, 2011
    Date of Patent: July 10, 2012
    Assignee: QNX Software Systems Limited
    Inventors: Phillip A. Hetherington, Xueman Li
  • Patent number: 8214218
    Abstract: A method and an apparatus for switching speech or audio signals, wherein the method includes: when a speech or audio signal is switched, weighting a first high frequency band signal of a current frame of the speech or audio signal and a second high frequency band signal of the previous M frames of speech or audio signals to obtain a processed first high frequency band signal, where M is greater than or equal to 1; and synthesizing the processed first high frequency band signal and a first low frequency band signal of the current frame of the speech or audio signal into a wide frequency band signal. In this way, speech or audio signals with different bandwidths can be smoothly switched, thus improving the quality of audio signals received by a user.
    Type: Grant
    Filed: June 16, 2011
    Date of Patent: July 3, 2012
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Zexin Liu, Lei Miao, Chen Hu, Wenhai Wu, Yue Lang, Qing Zhang
  • Patent number: 8213624
    Abstract: The perceived loudness of an audio signal is measured by modifying a spectral representation of an audio signal as a function of a reference spectral shape so that the spectral representation of the audio signal conforms more closely to the reference spectral shape, and determining the perceived loudness of the modified spectral representation of the audio signal.
    Type: Grant
    Filed: June 18, 2008
    Date of Patent: July 3, 2012
    Assignee: Dolby Laboratories Licensing Corporation
    Inventor: Alan Seefeldt
  • Patent number: 8214204
    Abstract: A method for compressing data, the data being represented by an input vector having Q features, wherein Q is an integer higher than 1, including the steps of 1) providing a vector codebook of sub-sets of indexed Q-feature reference vectors and threshold values associated with the sub-sets for a prefixed feature; 2) identifying a sub-set of reference vectors among the sub-sets by progressively comparing the value of a feature of the input vector which corresponds to the prefixed feature, with the threshold values associated with the sub-sets; and 3) identifying the reference vector which, within the sub-set identified in step 2), provides the lowest distortion with respect to the input vector.
    Type: Grant
    Filed: July 23, 2004
    Date of Patent: July 3, 2012
    Assignee: Telecom Italia S.p.A.
    Inventors: Maurizio Fodrini, Donato Ettorre, Gianmario Bollano
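The three steps of the abstract above (partition the codebook by thresholds on one feature, pick a sub-set, search only that sub-set) can be sketched as below. This is an illustration under stated assumptions: squared error stands in for the distortion measure, a binary search via `bisect_right` stands in for the progressive threshold comparison, and all function names are hypothetical.

```python
from bisect import bisect_right


def build_subsets(reference_vectors, feature, thresholds):
    """Step 1: split indexed reference vectors into sub-sets by one feature.

    `thresholds` must be sorted ascending. Sub-set k holds the vectors whose
    value for `feature` falls between thresholds[k-1] and thresholds[k].
    """
    subsets = [[] for _ in range(len(thresholds) + 1)]
    for index, vec in enumerate(reference_vectors):
        subsets[bisect_right(thresholds, vec[feature])].append((index, vec))
    return subsets


def distortion(a, b):
    """Squared Euclidean distance (an assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(input_vector, subsets, feature, thresholds):
    """Steps 2 and 3: locate the sub-set, then search only inside it."""
    k = bisect_right(thresholds, input_vector[feature])
    candidates = subsets[k]
    if not candidates:
        raise ValueError("selected sub-set is empty")
    index, _ = min(candidates, key=lambda item: distortion(input_vector, item[1]))
    return index


refs = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (6.0, 4.0)]
thresholds = [3.0]
subsets = build_subsets(refs, 0, thresholds)
best = quantize((5.6, 4.3), subsets, 0, thresholds)  # searches 2 of the 4 vectors
```

The saving comes from searching one sub-set rather than the whole codebook. In this simple form the result is the best match within the chosen sub-set, which is not guaranteed to be the best match in the full codebook.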
  • Patent number: 8214175
    Abstract: A method and system for monitoring and analyzing at least one signal are disclosed. An abstract of at least one reference signal is generated and stored in a reference database. An abstract of a query signal to be analyzed is then generated so that the abstract of the query signal can be compared to the abstracts stored in the reference database for a match. The method and system may optionally be used to record information about the query signals, the number of matches recorded, and other useful information about the query signals. Moreover, the method by which abstracts are generated can be programmable based upon selectable criteria. The system can also be programmed with error control software so as to avoid the re-occurrence of a query signal that matches more than one signal stored in the reference database.
    Type: Grant
    Filed: February 26, 2011
    Date of Patent: July 3, 2012
    Assignee: Blue Spike, Inc.
    Inventors: Scott Moskowitz, Mike W. Berry
  • Patent number: 8214200
    Abstract: Methods and apparatus are disclosed for approximating an MDCT coefficient of a block of windowed sinusoid having a defined frequency, the block being multiplied by a window sequence and having a block length and a block index. A finite trigonometric series is employed to approximate the window sequence. A window summation table is pre-computed using the finite trigonometric series and the defined frequency of the sinusoid. A block phase is computed for each block with the defined frequency, the block length and the block index. An MDCT coefficient is approximated by the dot product of a phase vector computed using the block phase with a corresponding row of the window summation table.
    Type: Grant
    Filed: March 14, 2007
    Date of Patent: July 3, 2012
    Assignee: XFRM, Inc.
    Inventors: Richard C. Cabot, Matthew S. Ashman
  • Publication number: 20120166184
    Abstract: Systems and methods that provide for voice command devices that receive sound but do not transfer the voice data beyond the system unless certain voice-filtering criteria have been met are described herein. In addition, embodiments provide devices that support voice command operation while external voice data transmission is in mute operation mode. As such, devices according to embodiments may process voice data locally responsive to the voice data matching voice-filtering criteria. Furthermore, systems and methods are described herein involving voice command devices that capture sound and analyze it in real-time on a word-by-word basis and decide whether to handle the voice data locally, transmit it externally, or both.
    Type: Application
    Filed: December 23, 2010
    Publication date: June 28, 2012
    Applicant: Lenovo (Singapore) Pte. Ltd.
    Inventors: Howard Locker, Daryl Cromer, Scott Edwards Kelso, Aaron Michael Stewart
  • Publication number: 20120166185
    Abstract: A method and system for accomplishing closed-loop transaction processing in conjunction with interactive, real-time, voice transmission of information to a user is disclosed. A voice-based communication between a user and a first system is established and a report is transmitted to the user. The report might comprise information and at least one request for user input based on said information. In response to the report, the user can request a transaction based on said information. The requested transaction is completed automatically by connecting to a second system for processing.
    Type: Application
    Filed: March 6, 2012
    Publication date: June 28, 2012
    Applicant: MicroStrategy, Incorporated
    Inventors: Michael Zirngibl, Anurag Patnaik, Bodo Maass, Christopher S. Leon
  • Publication number: 20120163565
    Abstract: An automated method and system are described for obtaining and sharing references and testimonials for individuals and companies. A voice sharing system is provided which obtains and shares voice reference recordings from reference granters for a reference requester. Reference receivers can then listen to a voice reference recording by selecting an icon on a web page. A reference requester can be an individual who needs references as part of an employment search. A reference requester may alternatively be a company needing a testimonial about their products or services.
    Type: Application
    Filed: December 17, 2011
    Publication date: June 28, 2012
    Inventors: Weihui Li, Tao Zhang
  • Patent number: 8209190
    Abstract: During operation an input signal to be coded is received and coded to produce a coded audio signal. The coded audio signal is then scaled with a plurality of gain values to produce a plurality of scaled coded audio signals, each having an associated gain value and a plurality of error values are determined existing between the input signal and each of the plurality of scaled coded audio signals. A gain value is then chosen that is associated with a scaled coded audio signal resulting in a low error value existing between the input signal and the scaled coded audio signal. Finally, the low error value is transmitted along with the gain value as part of an enhancement layer to the coded audio signal.
    Type: Grant
    Filed: August 7, 2008
    Date of Patent: June 26, 2012
    Assignee: Motorola Mobility, Inc.
    Inventors: James P. Ashley, Jonathan A. Gibbs, Udar Mittal
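The gain search described in the abstract above amounts to trying each candidate gain and keeping the one with the smallest error. The sketch below assumes a squared-error measure and a small hypothetical gain table; neither is specified in the abstract.

```python
def choose_gain(input_signal, coded_signal, gain_candidates):
    """Return (gain, error) for the candidate gain with the lowest error.

    The error is the summed squared difference between the input signal and
    the scaled coded signal (an assumed error measure).
    """
    best_gain, best_error = None, float("inf")
    for gain in gain_candidates:
        error = sum((x - gain * c) ** 2 for x, c in zip(input_signal, coded_signal))
        if error < best_error:
            best_gain, best_error = gain, error
    return best_gain, best_error


gain, error = choose_gain([2.0, -4.0, 6.0], [1.0, -2.0, 3.0], [0.5, 1.0, 2.0, 4.0])
```

Here the coded signal is exactly half the input, so the candidate 2.0 reproduces the input with zero error.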
  • Patent number: 8209188
    Abstract: A down-sampler 101 down-samples the sampling rate of an input signal from sampling rate FH to sampling rate FL. A base layer coder 102 encodes the sampling rate FL acoustic signal. A local decoder 103 decodes coding information output from base layer coder 102. An up-sampler 104 raises the sampling rate of the decoded signal to FH. A subtracter 106 subtracts the decoded signal from the sampling rate FH acoustic signal. An enhancement layer coder 107 encodes the signal output from subtracter 106 using a decoding result parameter output from local decoder 103.
    Type: Grant
    Filed: May 6, 2010
    Date of Patent: June 26, 2012
    Assignee: Panasonic Corporation
    Inventor: Masahiro Oshikiri
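The layered structure in the abstract above (down-sample, base layer, local decode, up-sample, subtract, enhancement layer) can be sketched with toy components. Everything concrete here is an assumption made for illustration: the uniform quantizers stand in for the real coders, decimation and sample repetition stand in for proper resampling filters, and the enhancement coder does not use the decoding result parameter mentioned in the abstract. The `decode` function is added only to show that the two layers combine.

```python
def downsample(signal, factor=2):
    """Keep every `factor`-th sample (no anti-alias filter in this sketch)."""
    return signal[::factor]


def upsample(signal, factor=2):
    """Repeat each sample `factor` times (crude stand-in for interpolation)."""
    return [s for s in signal for _ in range(factor)]


def quantize(signal, step):
    return [round(s / step) for s in signal]


def dequantize(codes, step):
    return [c * step for c in codes]


def encode(signal, base_step=0.5, enh_step=0.05):
    low_rate = downsample(signal)                                  # down-sampler
    base_codes = quantize(low_rate, base_step)                     # base layer coder
    local_decoded = upsample(dequantize(base_codes, base_step))    # local decoder + up-sampler
    residual = [x - d for x, d in zip(signal, local_decoded)]      # subtracter
    enh_codes = quantize(residual, enh_step)                       # enhancement layer coder
    return base_codes, enh_codes


def decode(base_codes, enh_codes, base_step=0.5, enh_step=0.05):
    base = upsample(dequantize(base_codes, base_step))
    return [b + r for b, r in zip(base, dequantize(enh_codes, enh_step))]


signal = [0.0, 0.3, 0.6, 0.2, -0.4, -0.7]
base_codes, enh_codes = encode(signal)
restored = decode(base_codes, enh_codes)
```

A receiver that gets only `base_codes` can still decode a coarse signal; adding `enh_codes` brings each sample to within half an enhancement step of the input.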
  • Patent number: 8204198
    Abstract: An active stream is selected from one of a plurality of audio streams generated in a common acoustic environment by obtaining, for each stream, at a series of measurement instants tn, where n=1 . . . N, a final performance metric that is representative of the stream's goodness for representing near-end speech at each measurement instant tn. The final performance metrics for each stream are accumulated to determine an overall performance score. A switch to a new stream as the active stream occurs when the stream with the best overall performance score exceeds the overall performance score of the currently active stream by a threshold amount.
    Type: Grant
    Filed: June 19, 2009
    Date of Patent: June 19, 2012
    Assignee: Magor Communications Corporation
    Inventor: Kathryn Adeney
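The switching rule in the abstract above is a form of hysteresis: the active stream changes only when a rival's accumulated score leads by more than a threshold. The sketch below assumes that a higher metric means a better stream and that accumulation is a plain sum; the function name and data layout are hypothetical.

```python
def select_active_stream(metrics_by_stream, current_active, threshold):
    """Return the stream that should be active after this measurement window.

    metrics_by_stream maps a stream id to its per-instant performance metrics
    (assumed: higher is better). The overall score is the sum of the metrics.
    """
    scores = {sid: sum(values) for sid, values in metrics_by_stream.items()}
    best = max(scores, key=scores.get)
    # Switch only when the best stream leads the active one by more than the threshold.
    if best != current_active and scores[best] - scores[current_active] > threshold:
        return best
    return current_active


metrics = {"mic_a": [1.0, 1.0, 1.0], "mic_b": [2.0, 2.0, 2.0]}
after_small_threshold = select_active_stream(metrics, "mic_a", threshold=2.0)
after_large_threshold = select_active_stream(metrics, "mic_a", threshold=5.0)
```

With a lead of 3.0, the first call switches to `mic_b` and the second stays on `mic_a`, which is how the threshold suppresses rapid back-and-forth switching.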
  • Patent number: 8204740
    Abstract: An encoding/decoding method, a coder/decoder (codec) and a radio communication device utilize a variable offset coding technique. In accordance with the technique, the start of processing of a first frame is time offset in relation to the end of the processing of the frame that precedes the first frame, the time offset bringing about a time gap between the end of the preceding frame and the start of processing the first frame. A substitution signal is inserted in the time gap.
    Type: Grant
    Filed: February 6, 2006
    Date of Patent: June 19, 2012
    Assignee: Telefonaktiebolaget LM Ericsson (Publ)
    Inventor: Stefan Bruhn
  • Patent number: 8200500
    Abstract: Generic and specific C-to-E binaural cue coding (BCC) schemes are described, including those in which one or more of the input channels are transmitted as unmodified channels that are not downmixed at the BCC encoder and not upmixed at the BCC decoder. The specific BCC schemes described include 5-to-2, 6-to-5, 7-to-5, 6.1-to-5.1, 7.1-to-5.1, and 6.2-to-5.1, where “0.1” indicates a single low-frequency effects (LFE) channel and “0.2” indicates two LFE channels.
    Type: Grant
    Filed: March 14, 2011
    Date of Patent: June 12, 2012
    Assignee: Agere Systems Inc.
    Inventors: Frank Baumgarte, Jiashu Chen, Christof Faller
  • Patent number: 8200497
    Abstract: Synthesizing a set of digital speech samples corresponding to a selected voicing state includes dividing speech model parameters into frames, with a frame of speech model parameters including pitch information, voicing information determining the voicing state in one or more frequency regions, and spectral information. First and second digital filters are computed using, respectively, first and second frames of speech model parameters, with the frequency responses of the digital filters corresponding to the spectral information in frequency regions for which the voicing state equals the selected voicing state. A set of pulse locations are determined, and sets of first and second signal samples are produced using the pulse locations and, respectively, the first and second digital filters. Finally, the sets of first and second signal samples are combined to produce a set of digital speech samples corresponding to the selected voicing state.
    Type: Grant
    Filed: August 21, 2009
    Date of Patent: June 12, 2012
    Assignee: Digital Voice Systems, Inc.
    Inventor: John C. Hardwick
  • Patent number: 8200479
    Abstract: Methods and mobile devices are provided for asymmetric independent processing of audio streams in a system on a chip (SOC). More specifically, independent audio paths are provided for processors performing audio processing on the SOC and mixing of decoded audio samples from the processors is performed digitally on the SOC by a hardware digital mixer.
    Type: Grant
    Filed: December 23, 2008
    Date of Patent: June 12, 2012
    Assignee: Texas Instruments Incorporated
    Inventors: Stephane Sintes, Franck Seigneret, Christophe Favergeon-Borgialli
  • Patent number: 8200480
    Abstract: A method including: obtaining, via a plurality of communication devices, a plurality of speech signals respectively associated with human speakers, the speech signals including verbal components and non-verbal components; identifying a plurality of geographical locations, each geographic location associated with a respective one of the plurality of the communication devices; extracting the non-verbal components from the obtained speech signals; deducing physiological or psychological conditions of the human speakers by analyzing, over a specified period, the extracted non-verbal components, using predefined relations between characteristics of the non-verbal components and physiological or psychological conditions of the human speakers; and providing a geographical distribution of the deduced physiological or psychological conditions of the human speakers by associating the deduced physiological or psychological conditions of the human speakers with geographical locations thereof.
    Type: Grant
    Filed: September 30, 2009
    Date of Patent: June 12, 2012
    Assignee: International Business Machines Corporation
    Inventors: Slava Shechtman, Raphael Steinberg
  • Publication number: 20120143602
    Abstract: A method for decoding segmented speech frames includes: generating parameters of a segmented current speech frame by using parameters of a segmented previous speech frame; and decoding a speech frame by using the parameters of the current speech frame, which are generated in the generating of the parameters of the segmented current speech frame.
    Type: Application
    Filed: July 26, 2011
    Publication date: June 7, 2012
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Kyung Jin BYUN, Nak Woong EUM, Hee-Bum JUNG
  • Patent number: 8195449
    Abstract: A non-intrusive signal quality assessment apparatus includes a feature vector calculator that determines parameters representing frames of a signal and extracts a collection of per-frame feature vectors representing structural information of the signal from the parameters. A frame selector preferably selects only frames with a feature vector lying within a predetermined multi-dimensional window. Means determine a global feature set over the collection of feature vectors from statistical moments of selected feature vector components. A quality predictor predicts a signal quality measure from the global feature set.
    Type: Grant
    Filed: January 30, 2007
    Date of Patent: June 5, 2012
    Assignee: Telefonaktiebolaget L M Ericsson (Publ)
    Inventors: Stefan Bruhn, Volodya Grancharov, Willem Bastiaan Kleijn
  • Patent number: 8195450
    Abstract: There is provided a method for use by a speech encoder to encode an input speech signal.
    Type: Grant
    Filed: September 8, 2011
    Date of Patent: June 5, 2012
    Assignee: Mindspeed Technologies, Inc.
    Inventors: Eyal Shlomot, Yang Gao, Adil Benyassine
  • Publication number: 20120130709
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for building an automatic speech recognition system through an Internet API. A network-based automatic speech recognition server configured to practice the method receives feature streams, transcriptions, and parameter values as inputs from a network client independent of knowledge of internal operations of the server. The server processes the inputs to train an acoustic model and a language model, and transmits the acoustic model and the language model to the network client. The server can also generate a log describing the processing and transmit the log to the client. On the server side, a human expert can intervene to modify how the server processes the inputs. The inputs can include an additional feature stream generated from speech by algorithms in the client's proprietary feature extraction.
    Type: Application
    Filed: November 23, 2010
    Publication date: May 24, 2012
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Enrico BOCCHIERI, Dimitrios Dimitriadis, Horst J. Schroeter
  • Publication number: 20120120218
    Abstract: A system and method providing semi-private conversation using an area microphone between one local user in a group of local users and a remote user. The local and remote users may be in different physical environments, using devices coupled by a network. A conversational relationship is defined between a local user and a remote user. The local user's voice is isolated from other voices in the environment, and transmitted to the remote user. Directional output technology may be used to direct the local user's utterances to the remote user in the remote environment.
    Type: Application
    Filed: November 15, 2010
    Publication date: May 17, 2012
    Inventors: Jason S. Flaks, Avi Bar-Zeev
  • Publication number: 20120116752
    Abstract: An audio data processing method and an audio data processing system are described. The audio data processing system includes an audio collect module, a processing module, a virtual play module, a virtual collect module, and a buffer memory. The virtual play module and the virtual collect module are registered in an application interface layer of third-party software. The third-party software chooses the virtual play module and the virtual collect module. The virtual play module is configured for receiving audio data processed by the processing module and storing the processed audio data in the buffer memory. The virtual collect module is configured for collecting the processed audio data from the buffer memory and transmitting the processed audio data to the third-party software. The invention provides a universal solution suitable for any chatting tool by installing the virtual speaker and the virtual microphone.
    Type: Application
    Filed: December 24, 2011
    Publication date: May 10, 2012
    Inventor: Hong CAO
  • Patent number: 8175867
    Abstract: A voice communication apparatus includes a communication portion that receives a plurality of frames including at least a first frame having first voice data and a second frame having second voice data subsequent to the first frame, the first voice data and the second voice data being encoded by a predetermined encoding system, a decoding portion that decodes the first voice data and the second voice data received by the communication portion, a buffer that retains the first voice data and the second voice data decoded by the decoding portion, a calculation portion that calculates an amplitude envelope based on the first voice data decoded by the decoding portion, and a controlling portion that judges whether or not the second voice data decoded by the decoding portion exceeds the amplitude envelope and corrects the second voice data that exceeds the amplitude envelope.
    Type: Grant
    Filed: August 5, 2008
    Date of Patent: May 8, 2012
    Assignee: Panasonic Corporation
    Inventors: Shinji Ikegami, Jyunichi Maehara, Noriaki Fukuoka, Toshihiro Tsukamoto
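The envelope check in the abstract above can be sketched as a clamp: derive a limit from the first decoded frame, then correct any sample of the second frame that exceeds it. The abstract does not say how the envelope is calculated or how samples are corrected, so the peak-times-margin envelope, the clamping, and the `margin` value below are assumptions.

```python
def amplitude_envelope(frame, margin=2.0):
    """Limit derived from the previous frame: its peak magnitude times a margin."""
    return margin * max(abs(s) for s in frame)


def correct_frame(previous_frame, current_frame, margin=2.0):
    """Clamp samples of the current frame that exceed the envelope."""
    limit = amplitude_envelope(previous_frame, margin)
    return [max(-limit, min(limit, s)) for s in current_frame]


previous = [0.1, -0.2, 0.15]
current = [0.1, 0.9, -0.25, -1.0]
corrected = correct_frame(previous, current)
```

Samples already inside the limit pass through unchanged, so only sudden jumps relative to the preceding frame are altered.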
  • Patent number: 8175871
    Abstract: Multiple microphone noise suppression apparatus and methods are described herein. The apparatus and methods implement a variety of noise suppression techniques and apparatus that can be selectively applied to signals received using multiple microphones. The microphone signals received at each of the multiple microphones can be independently processed to cancel echo signal components that can be generated from a local audio source. The echo cancelled signals may be processed by some or all modules within a signal separator that operates to separate or otherwise isolate a speech signal from noise signals. The signal separator can include a pre-processing de-correlator followed by a blind source separator. The output of the blind source separator can be post filtered to provide post separation de-correlation. The separated speech and noise signals can be non-linearly processed for further noise reduction, and additional post processing can be implemented following the non-linear processing.
    Type: Grant
    Filed: September 28, 2007
    Date of Patent: May 8, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Song Wang, Samir Kumar Gupta, Eddie L. T. Choy
  • Publication number: 20120109644
    Abstract: There is a need to enable decompression of a speech signal even if no network synchronizing signal is output from a baseband processing portion. For this purpose, an information processing device includes a first serial interface. The first serial interface includes a notification signal generation circuit that generates a notification signal each time compressed data incorporated from the baseband processing portion reaches a predetermined data quantity, and notifies a speech processing portion of this state using the notification signal. The speech processing portion includes a synchronizing signal generation circuit that generates a network synchronizing signal based on the notification signal. A clock signal for PCM communication is generated based on the network synchronizing signal. A speech signal can be decompressed even if no network synchronizing signal is output from the baseband processing portion.
    Type: Application
    Filed: October 31, 2011
    Publication date: May 3, 2012
    Inventors: Yutaka Uchimura, Takahiro Irita, Jiro Hara
  • Publication number: 20120109643
    Abstract: A system and method provide an audio/video coding system for adaptively transcoding audio streams based on content characteristics of the audio streams. An audio stream metadata extraction module of the system is configured to extract metadata of a source audio stream. An audio stream classification module of the system is configured to classify the source audio stream into one of the several audio content categories based on the metadata of the source audio stream. An adaptive audio encoder of the system is configured to determine one or more transcoding parameters including target bitrate and sampling rate based on the metadata and classification of the source audio stream. An adaptive audio transcoder of the system is configured to transcode the source audio stream into an output audio stream using the transcoding parameters.
    Type: Application
    Filed: November 2, 2010
    Publication date: May 3, 2012
    Applicant: GOOGLE INC.
    Inventors: Xiaoquan Yi, Huisheng Wang, Vijnan Shastri
  • Patent number: 8170885
    Abstract: Disclosed is a wideband audio signal coding/decoding device and method that may code a wideband audio signal while maintaining a low bit rate. The wideband audio signal coding device includes an enhancement layer that extracts a first spectrum parameter from an inputted wideband signal having a first bandwidth, quantizes the extracted first spectrum parameter, and converts the extracted first spectrum parameter into a second spectrum parameter; and a coding unit that extracts a narrowband signal from the inputted wideband signal and codes the narrowband signal based on the second spectrum parameter provided from the enhancement layer, wherein the narrowband signal has a second bandwidth smaller than the first bandwidth. The wideband audio signal coding/decoding device and method may code a wideband audio signal while maintaining a low bit rate.
    Type: Grant
    Filed: October 15, 2008
    Date of Patent: May 1, 2012
    Assignee: Gwangju Institute of Science and Technology
    Inventors: Hong Kook Kim, Young Han Lee
  • Publication number: 20120101814
    Abstract: Various techniques are disclosed for improving packet loss concealment to reduce artifacts by using audio character measures of the audio signal. These techniques include attenuation to a noise fill instead of attenuation to silence, varying how long to wait before attenuating the extrapolation, varying the rate of attenuation of the extrapolation, attenuating periodic extrapolation at a different rate than non-periodic extrapolation, performing periodic extrapolation on successively longer fill data based on the audio character measures, adjusting weighting between periodic and non-periodic extrapolation based on the audio character measures, and adjusting weighting between periodic extrapolation and non-periodic extrapolation non-linearly.
    Type: Application
    Filed: October 25, 2010
    Publication date: April 26, 2012
    Applicant: POLYCOM, INC.
    Inventor: Eric David Elias
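Three of the techniques listed in the abstract above (attenuating to a noise fill rather than to silence, waiting before attenuation starts, and controlling the attenuation rate) can be sketched as a crossfade. The linear fade and the `hold` and `fade` parameters are assumptions made for the example; the abstract says these would be varied based on audio character measures, which this sketch does not model.

```python
def conceal(extrapolation, noise_fill, hold, fade):
    """Crossfade an extrapolated signal into a noise fill during a loss.

    hold: number of samples to wait before attenuation starts.
    fade: number of samples over which the extrapolation fades out (must be > 0).
    """
    out = []
    for n, (e, v) in enumerate(zip(extrapolation, noise_fill)):
        if n < hold:
            gain = 1.0
        else:
            gain = max(0.0, 1.0 - (n - hold + 1) / fade)
        # As the extrapolation fades out, the noise fill fades in.
        out.append(gain * e + (1.0 - gain) * v)
    return out


concealed = conceal([1.0] * 6, [0.2] * 6, hold=2, fade=4)
```

The output settles at the noise-fill level instead of zero, which avoids the abrupt drop to silence during a long loss.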
  • Publication number: 20120101812
    Abstract: Techniques for generating, distributing, and using speech recognition models are described. A shared speech processing facility is used to support speech recognition for a wide variety of devices with limited capabilities including business computer systems, personal data assistants, etc., which are coupled to the speech processing facility via a communications channel, e.g., the Internet. Devices with audio capture capability record and transmit to the speech processing facility, via the Internet, digitized speech and receive speech processing services, e.g., speech recognition model generation and/or speech recognition services, in response. The Internet is used to return speech recognition models and/or information identifying recognized words or phrases. Thus, the speech processing facility can be used to provide speech recognition capabilities to devices without such capabilities and/or to augment a device's speech processing capability.
    Type: Application
    Filed: December 30, 2011
    Publication date: April 26, 2012
    Applicant: GOOGLE INC.
    Inventors: Craig Reding, Suzi Levas
  • Patent number: 8165889
    Abstract: Spatial information associated with an audio signal is encoded into a bitstream, which can be transmitted to a decoder or recorded to a storage media. The bitstream can include different syntax related to time, frequency and spatial domains. In some embodiments, the bitstream includes one or more data structures (e.g., frames) that contain ordered sets of slots for which parameters can be applied. The data structures can be fixed or variable. The data structure can include position information that can be used by a decoder to identify the correct slot for which a given parameter set is applied. The slot position information can be encoded with a fixed number of bits or a variable number of bits based on the data structure type.
    Type: Grant
    Filed: July 19, 2010
    Date of Patent: April 24, 2012
    Assignee: LG Electronics Inc.
    Inventors: Hee Suk Pang, Dong Soo Kim, Jae Hyun Lim, Hyen O Oh, Yang-Won Jung
  • Patent number: 8165882
    Abstract: Apparatus and method for generating high quality synthesized speech having smooth waveform concatenation. The apparatus includes a pitch frequency calculation section, a pitch synchronization position calculation section, a unit waveform storage, a unit waveform selection section, a unit waveform generation section, and a waveform synthesis section. The unit waveform generation section includes a conversion ratio calculation section, a sampling rate conversion section, and a unit waveform re-selection section. The conversion ratio calculation section calculates a sampling rate conversion ratio from the pitch information and the position of pitch synchronization, and the sampling rate conversion section converts the sampling rate of the unit waveform, delivered as input, based on the sampling rate conversion ratio.
    Type: Grant
    Filed: September 4, 2006
    Date of Patent: April 24, 2012
    Assignee: NEC Corporation
    Inventors: Masanori Kato, Satoshi Tsukada
  • Patent number: 8165871
    Abstract: Provided are an encoding method and apparatus for efficiently encoding a sinusoidal signal whose magnitude is less than a masking value according to a psychoacoustic model, a decoding method and apparatus for decoding an encoded sinusoidal signal, and a computer-readable recording medium having recorded thereon a program for executing the encoding method/the decoding method. By using a particular code indicating that the magnitude of a first sinusoidal signal is less than a masking value according to a psychoacoustic model to encode the first sinusoidal signal, difference coding for a third sinusoidal signal of a next frame, which is connected to the first sinusoidal signal, is performed using a sinusoidal signal or sinusoidal signals selected according to a method to use the particular code, and a decoding apparatus obtains a sum with a transmitted difference using the selected sinusoidal signal(s).
    Type: Grant
    Filed: June 2, 2008
    Date of Patent: April 24, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Nam-suk Lee, Geon-hyoung Lee, Chul-woo Lee, Han-gil Moon
  • Patent number: 8165873
    Abstract: A speech analysis apparatus analyzing prosodic characteristics of speech information and outputting a prosodic discrimination result includes an input unit inputting speech information, an acoustic analysis unit calculating relative pitch variation and a discrimination unit performing speech discrimination processing, in which the acoustic analysis unit calculates a current template relative pitch difference, determining whether a difference absolute value between the current template relative pitch difference and a previous template relative pitch difference is equal to or less than a predetermined threshold or not, when the value is not less than the threshold, calculating an adjacent relative pitch difference, and when the adjacent relative pitch difference is equal to or less than a previously set margin value, executing correction processing of adding or subtracting an octave of the current template relative pitch difference to calculate the relative pitch variation by applying the relative pitch differe
    Type: Grant
    Filed: July 21, 2008
    Date of Patent: April 24, 2012
    Assignee: Sony Corporation
    Inventor: Keiichi Yamada
  • Publication number: 20120095749
    Abstract: Audiovisual presentation methods, systems and apparatus for improving and enhancing the listening experience of attendees of audiovisual presentations. An exemplary audiovisual presentation system includes an audio processing and distribution unit (APDU) configured to generate and broadcast a wireless audio service containing audio of an audiovisual presentation (e.g., soundtrack and dialogue audio of a movie, in the case of a movie presentation) throughout an audiovisual presentation room or space (e.g., a movie theater, in the case of a movie presentation). The wireless audio service is received by mobile receiving devices (MRDs) having or comprising headsets, headphones or earbuds, through which MRD users listen to the audio of the audiovisual presentation provided by the wireless audio service while viewing images of the audiovisual presentation.
    Type: Application
    Filed: October 13, 2011
    Publication date: April 19, 2012
    Inventor: Antonio Capretta
  • Patent number: 8160873
    Abstract: In a noise suppression apparatus for suppressing noise contained in a speech signal, the speech signal is converted to a first vector of spectral speech components and a second vector of spectral speech components identical to the first vector. A vector of noise suppression coefficients is determined based on the first vector spectral speech components. A vector of estimated noise components is determined based on the first vector spectral speech components, and a speech section correction factor and a nonspeech section correction factor are calculated from the estimated noise components and the first-vector spectral speech components to produce a combined correction factor. The noise suppression coefficients are weighted by the combined correction factor to produce a vector of post-suppression coefficients. The second vector spectral speech components are weighted by the post-suppression coefficients to produce a vector of enhanced speech components.
    Type: Grant
    Filed: May 30, 2006
    Date of Patent: April 17, 2012
    Assignee: NEC Corporation
    Inventors: Masanori Kato, Akihiko Sugiyama
  • Patent number: 8160889
    Abstract: A bandwidth extension system extends the bandwidth of an acoustic signal. By shifting a portion of the signal by a frequency value, the system generates an upper bandwidth extension signal. An extended bandwidth acoustic signal may be generated from the acoustic signal, the upper bandwidth extension signal, and/or a lower bandwidth extension signal.
    Type: Grant
    Filed: January 17, 2008
    Date of Patent: April 17, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Bernd Iser, Gerhard Nüssle, Gerhard Uwe Schmidt
  • Patent number: 8155971
    Abstract: A method for decoding a multi-audio-object signal having audio signals of first and second types encoded therein, the multi-audio-object signal having a downmix signal and side information having level information of the audio signals of the first and second types in a first predetermined time/frequency resolution, the method including computing a prediction coefficient matrix C based on the level information; and up-mixing the downmix signal based on the prediction coefficients to obtain a first and/or a second up-mix audio signal approximating the audio signals of the first and second types, respectively, wherein up-mixing yields the first and/or second up-mix signals S1 and S2 from the downmix signal d according to a computation representable by $\begin{pmatrix} S_1 \\ S_2 \end{pmatrix} = D^{-1} \left\{ \begin{pmatrix} 1 \\ C \end{pmatrix} d + H \right\}$, with “1” denoting, depending on the number of channels of d, a scalar or an identity matrix, and $D^{-1}$ being a matrix uniquely determined by a downmix prescription according
    Type: Grant
    Filed: October 17, 2008
    Date of Patent: April 10, 2012
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung e.V.
    Inventors: Oliver Hellmuth, Johannes Hilpert, Leonid Terentiev, Cornelia Falch, Andreas Hoelzer, Juergen Herre
  • Publication number: 20120084079
    Abstract: A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.
    Type: Application
    Filed: November 2, 2011
    Publication date: April 5, 2012
    Applicant: Google Inc.
    Inventors: Alexander Gruenstein, William J. Byrne
  • Publication number: 20120084078
    Abstract: A scalable voice signature authentication capability is provided herein. The scalable voice signature authentication capability enables authentication of varied services such as speaker identification (e.g. private banking and access to healthcare account records), voice signature as a password (e.g. secure access for remote services and document retrieval) and the Internet and its various services (e.g.
    Type: Application
    Filed: September 30, 2010
    Publication date: April 5, 2012
    Applicant: Alcatel-Lucent USA Inc.
    Inventors: Madhav Moganti, Anish Sankalia
  • Patent number: 8150703
    Abstract: Systems are disclosed for operating a communications network. The system includes a module to buffer frames of a signal, and a module to determine an access delay. The system also includes a module to compress a portion of the signal based on the access delay by removing a first portion of a frame of the signal and generating an overlap-added segment from a first segment and a second segment of the frame. In another embodiment, the system includes a module to buffer frames of a signal, a module to establish a communication channel with a handset, and a module to determine an access delay. The system also includes a module to compress a portion of the signal based on the access delay by removing a first portion of a frame of the signal and generating an overlap-added segment from a first segment and a second segment of the frame.
    Type: Grant
    Filed: August 11, 2009
    Date of Patent: April 3, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Richard Vandervoort Cox, David A. Kapilow
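    The compression step in the abstract above (drop part of a frame, then overlap-add two segments of it) can be sketched roughly as follows. This is a minimal illustration, not the patented method: the function name `compress_frame`, its parameters `start`, `cut`, and `overlap`, and the linear cross-fade weights are all assumptions made for the example.

```python
def compress_frame(frame, start, cut, overlap):
    """Shorten a frame (list of samples) by `cut` samples.

    Assumed scheme: the segment beginning at `start` is cross-faded into the
    segment beginning `cut` samples later, and the samples in between are
    dropped. The linear fade is an illustrative choice.
    """
    if start < 0 or cut < 0 or overlap < 1:
        raise ValueError("start and cut must be >= 0 and overlap >= 1")
    if start + cut + overlap > len(frame):
        raise ValueError("frame is too short for the requested cut and overlap")

    first = frame[start:start + overlap]                # segment that fades out
    second = frame[start + cut:start + cut + overlap]   # segment that fades in

    blended = []
    for i, (a, b) in enumerate(zip(first, second)):
        w = (i + 1) / (overlap + 1)                     # fade-in weight for `second`
        blended.append((1.0 - w) * a + w * b)

    # Output is exactly `cut` samples shorter than the input frame.
    return frame[:start] + blended + frame[start + cut + overlap:]


if __name__ == "__main__":
    ramp = [float(i) for i in range(10)]
    print(compress_frame(ramp, start=2, cut=3, overlap=2))
```

    In a real system the amount removed would presumably be derived from the measured access delay; here it is simply passed in.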
  • Patent number: 8145476
    Abstract: A disclosed received voice playback apparatus includes a characteristic acquiring unit configured to acquire first frequency characteristic values obtained by resolving digital vocal signals that are based on received vocal signals into predetermined frequency bands, wherein each first frequency characteristic value corresponds to one of the predetermined frequency bands; a setting unit configured to obtain second frequency characteristic values, wherein each second frequency characteristic value is set for one of the predetermined frequency bands; a computing unit configured to compute a gain for each of the predetermined frequency bands based on a difference between the first frequency characteristic value and the second frequency characteristic value; and a characteristic changing unit configured to change the first frequency characteristic values of the digital vocal signals by multiplying the digital vocal signals by each of the gains corresponding to one of the predetermined frequency bands of the digit
    Type: Grant
    Filed: January 10, 2008
    Date of Patent: March 27, 2012
    Assignee: Ricoh Company, Ltd.
    Inventor: Yukihiro Imai
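    The per-band gain computation described in the abstract above can be illustrated with a short sketch. It assumes, purely for the example, that both sets of frequency characteristic values are levels in decibels, so that the difference maps to a linear gain of 10^(difference/20); the abstract does not state the units. The names `band_gains` and `apply_gains` are hypothetical.

```python
def band_gains(measured_db, target_db):
    """Return one linear gain per band from measured and target levels.

    Assumption: levels are in dB, so gain = 10 ** ((target - measured) / 20).
    """
    if len(measured_db) != len(target_db):
        raise ValueError("both lists need one value per band")
    return [10.0 ** ((t - m) / 20.0) for m, t in zip(measured_db, target_db)]


def apply_gains(band_signals, gains):
    """Multiply every sample of each band signal by that band's gain."""
    return [[g * x for x in band] for band, g in zip(band_signals, gains)]


if __name__ == "__main__":
    gains = band_gains([60.0, 50.0], [60.0, 70.0])
    print(gains)
    print(apply_gains([[1.0, -1.0], [0.5, 0.25]], gains))
```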
  • Patent number: 8145274
    Abstract: Systems and methods for automatically setting reminders. A method for automatically setting reminders includes receiving utterances, determining whether the utterances match a stored phrase, and in response to determining that there is a match, automatically setting a reminder in a mobile communication device. Various filters can be applied to determine whether or not to set a reminder. Examples of suitable filters include location, date/time, callee's phone number, etc.
    Type: Grant
    Filed: May 14, 2009
    Date of Patent: March 27, 2012
    Assignee: International Business Machines Corporation
    Inventors: Salil P. Gandhi, Saidas T. Kottawar, Mike V. Macias, Sandip D. Mahajan
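    A rough sketch of the decision flow in the abstract above: match an utterance against stored phrases, then let filters veto the reminder. The sketch assumes the utterance has already been transcribed to text and uses case-insensitive substring matching, which the abstract does not specify. The function `should_set_reminder`, the `context` dictionary, and the sample phone numbers are hypothetical.

```python
def should_set_reminder(utterance, stored_phrases, context, filters=()):
    """Return True when the utterance matches a stored phrase and every filter passes.

    `context` is a hypothetical dict of call details (location, time, callee
    number, ...); each filter is a callable taking it and returning a bool.
    """
    text = utterance.lower()
    matched = any(phrase.lower() in text for phrase in stored_phrases)
    return matched and all(f(context) for f in filters)


if __name__ == "__main__":
    phrases = ["call you back", "remind me"]
    known_callees = {"555-0100"}

    def callee_filter(ctx):
        return ctx["callee"] in known_callees

    print(should_set_reminder("I'll call you back tomorrow", phrases,
                              {"callee": "555-0100"}, [callee_filter]))
```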
  • Patent number: 8144862
    Abstract: A method and apparatus for use in suppressing acoustic echo in a target speech signal being transmitted through a packet-based communications network uses frame energy estimation applied to the target speech signal and to a reference speech signal. The method or apparatus estimates one or more reference speech energy levels in one or more reference packets based on one or more of the speech parameters generated by the speech encoding of the reference signal; estimates a target speech energy level in a target packet based on one or more of the speech parameters generated by the speech encoding of the target signal; compares the target speech energy level to one or more reference speech energy levels; and detects an echo in the target speech signal based on the comparison of the target speech energy level to the one or more reference speech energy levels.
    Type: Grant
    Filed: September 4, 2008
    Date of Patent: March 27, 2012
    Assignee: Alcatel Lucent
    Inventors: Binshi Cao, Doh-Suk Kim, Ahmed A Tarraf
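    The final comparison step in the abstract above can be sketched as follows. The abstract estimates energies from encoded speech parameters; this example skips that step and takes per-packet energy estimates in dB as given. The decision rule (flag an echo when the target is at least some margin quieter than a recent reference packet) and the names `detect_echo` and `min_loss_db` are assumptions for illustration only.

```python
def detect_echo(target_db, reference_db_history, min_loss_db=6.0):
    """Flag a target packet as a likely echo of recent reference packets.

    Assumed rule: an echo is an attenuated copy of the reference, so the
    target energy should sit at least `min_loss_db` below the energy of
    some reference packet in the recent history.
    """
    return any(ref - target_db >= min_loss_db for ref in reference_db_history)


if __name__ == "__main__":
    # Target is 10 dB below one recent reference packet.
    print(detect_echo(-30.0, [-20.0, -50.0]))
    # Target is louder than every reference packet (near-end speech).
    print(detect_echo(-10.0, [-12.0, -14.0]))
```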
  • Publication number: 20120072206
    Abstract: A terminal apparatus configured to obtain positional information indicating a position of another apparatus; to obtain positional information indicating a position of the terminal apparatus; to obtain a first direction, which is a direction to the obtained position of the another apparatus and calculated using the obtained position of the terminal apparatus; to obtain a second direction, which is a direction in which the terminal apparatus is oriented; to obtain inclination information indicating whether the terminal apparatus is inclined to the right or to the left; to switch an amount of correction for a relative angle between the first direction and the second direction in accordance with whether the obtained inclination information indicates an inclination to the right or an inclination to the left; and to determine an attribute of speech output from a speech output unit in accordance with the relative angle corrected by the amount of correction.
    Type: Application
    Filed: July 27, 2011
    Publication date: March 22, 2012
    Applicant: FUJITSU LIMITED
    Inventors: Yoshiteru Tsuchinaga, Kaori Endo
  • Publication number: 20120072207
    Abstract: Provided are a down-mixing method and an encoder, wherein a high quantization performance can be realized when a balance adjustment operation due to a balance weight coefficient and a removal operation of a main component are combined. In the encoder (100), a down-mixing unit (101) generates a mono signal by multiplying an L-signal and an R-signal by coefficients α and β, respectively, and summing the results. A first encoding target signal, corresponding to the L-signal, is generated by multiplying the mono signal by a balance weight coefficient wL and subtracting the product from the L-signal, using a multiplier (107) and an adder (109). A second encoding target signal, corresponding to the R-signal, is generated by multiplying the mono signal by a balance weight coefficient wR and subtracting the product from the R-signal, using a multiplier (108) and an adder (110).
    Type: Application
    Filed: June 1, 2010
    Publication date: March 22, 2012
    Applicant: PANASONIC CORPORATION
    Inventor: Toshiyuki Morii
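    The arithmetic in the abstract above is simple enough to sketch directly: a weighted sum produces the mono signal, and each encoding target is the original channel minus a weighted copy of the mono signal. The function names and the default coefficient values are assumptions for illustration; `alpha` and `beta` stand for the two down-mix coefficients, and `w_l` and `w_r` for the balance weight coefficients wL and wR.

```python
def downmix(left, right, alpha=0.5, beta=0.5):
    """Mono signal as the weighted sum of the L and R samples."""
    return [alpha * l + beta * r for l, r in zip(left, right)]


def encoding_targets(left, right, w_l, w_r, alpha=0.5, beta=0.5):
    """Return (mono, target_l, target_r).

    Each target is the channel minus the mono signal scaled by that
    channel's balance weight coefficient.
    """
    mono = downmix(left, right, alpha, beta)
    target_l = [l - w_l * m for l, m in zip(left, mono)]
    target_r = [r - w_r * m for r, m in zip(right, mono)]
    return mono, target_l, target_r


if __name__ == "__main__":
    mono, t_l, t_r = encoding_targets([1.0, 2.0], [3.0, 4.0], w_l=1.0, w_r=1.0)
    print(mono, t_l, t_r)
```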