Time Patents (Class 704/211)
  • Patent number: 8340943
    Abstract: Provided is an apparatus of separating a musical sound source, which may re-construct mixed signals into target sound sources and other sound sources directly using sound source information performed using a predetermined musical instrument when the sound source information is present, thereby more effectively separating sound sources included in the mixed signal. The apparatus may include a Nonnegative Matrix Partial Co-Factorization (NMPCF) analysis unit to perform an NMPCF analysis on a mixed signal and a predetermined sound source signal using a sound source separation model, and to obtain a plurality of entity matrices based on the analysis result, and a target instrument signal separating unit to separate, from the mixed signal, a target instrument signal corresponding to the predetermined sound source signal by calculating an inner product between the plurality of entity matrices.
    Type: Grant
    Filed: August 12, 2010
    Date of Patent: December 25, 2012
    Assignees: Electronics and Telecommunications Research Institute, Postech Academy-Industry Foundation
    Inventors: Min Je Kim, Seungjin Choi, Jiho Yoo, Kyeongok Kang, Inseon Jang, Jin-Woo Hong
  • Patent number: 8340972
    Abstract: The use of SOLA speech time compression/expansion in the present invention method as a means to alter a speaker's talking rate by adjusting the speech rate at which people hear their own voice. A person speaks at a certain comfort rate, which is established and maintained by their own auditory system's capability to hear their own voice as they speak i.e., it is a self-auditory feedback mechanism. Changing the rate (112) at which a talker hears their own voice (130, 2012, 2024) will accordingly change their talking rate. This effect is achieved in this invention by employing a real time processing method (110, 402-416, FIG. 10) that temporarily adjusts the speech rate in an effort to impose this psychoacoustic condition which coerces the speaker into changing their talking rate. This invention permits users to adjust the comfort rate at which they normally speak (124) or to adjust the rate at which others speak to them through the use of a speech processing device or system.
    Type: Grant
    Filed: June 27, 2003
    Date of Patent: December 25, 2012
    Assignee: Motorola Mobility LLC
    Inventors: Marc Andre Boillot, John Gregory Harris, Thomas Lawrence Reinke
  • Patent number: 8332212
    Abstract: A method and system for improving the efficiency of real-time and non-real-time speech transcription by machine speech recognizers, human dictation typists, and human voicewriters using speech recognizers. In particular, the pacing with which recorded speech is presented to transcriptionists is automatically adjusted by monitoring the transcriptionists' output by comparing the output acoustically or phonetically to the presented recorded speech as well as monitoring the resulting transcription, and accordingly adjusting the pacing.
    Type: Grant
    Filed: June 17, 2009
    Date of Patent: December 11, 2012
    Assignee: Cogi, Inc.
    Inventors: Andreas Wittenstein, Mark Cromack
  • Patent number: 8326613
    Abstract: The present invention relates to a method of synthesizing a signal comprising the steps of determining required pitch bell locations, mapping the required pitch bell locations onto the signal to provide first pitch bell locations, randomizing the first pitch bell locations to provide second pitch bell locations, windowing the signal on the second pitch bell locations to provide a pitch bell, repeating the aforementioned steps for all required pitch bell locations and performing an overlap and add operation with respect to the pitch bells in order to synthesize the signal.
    Type: Grant
    Filed: August 25, 2010
    Date of Patent: December 4, 2012
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Ercan Ferit Gigi
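
    The steps named in the abstract of 8326613 (map, randomize, window, overlap-add) can be illustrated with a minimal sketch. It is not the patented implementation: the Hann window, the uniform integer jitter, the linear time-scale mapping, and every name here (`synthesize`, `half_width`, `jitter`, `time_scale`) are assumptions made for illustration only.

```python
import math
import random


def hann(length):
    """Symmetric Hann window of the given length."""
    if length < 2:
        return [1.0] * length
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (length - 1))
            for i in range(length)]


def synthesize(signal, required_locs, time_scale, half_width, jitter, rng):
    """Overlap-add pitch bells cut from `signal` at jittered locations.

    required_locs: bell centres in the OUTPUT signal (assumed given).
    time_scale:    output length / input length (assumed linear mapping).
    jitter:        maximum random offset, in samples (assumed uniform).
    """
    window = hann(2 * half_width + 1)
    out = [0.0] * int(len(signal) * time_scale)
    for loc in required_locs:
        # Map the required location onto the source signal (first location).
        first = int(round(loc / time_scale))
        # Randomize it (second location), clamped to the signal bounds.
        second = first + rng.randint(-jitter, jitter)
        second = min(max(second, 0), len(signal) - 1)
        # Window the signal around the second location and overlap-add
        # the resulting pitch bell at the required output location.
        for k, w in enumerate(window):
            src = second - half_width + k
            dst = loc - half_width + k
            if 0 <= src < len(signal) and 0 <= dst < len(out):
                out[dst] += w * signal[src]
    return out
```

    With a hop equal to `half_width`, the Hann bells sum to one, so a constant input is reproduced away from the edges regardless of the jitter.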
  • Patent number: 8326621
    Abstract: A system improves the perceptual quality of a speech signal by dampening undesired repetitive transient noises. The system includes a repetitive transient noise detector adapted to detect repetitive transient noise in a received signal. The received signal may include a harmonic and a noise spectrum. The system further includes a repetitive transient noise attenuator that substantially removes or dampens repetitive transient noises from the received signal. The method of dampening the repetitive transient noises includes modeling characteristics of repetitive transient noises; detecting characteristics in the received signal that correspond to the modeled characteristics of the repetitive transient noises; and substantially removing components of the repetitive transient noises from the received signal that correspond to some or all of the modeled characteristics of the repetitive transient noises.
    Type: Grant
    Filed: November 30, 2011
    Date of Patent: December 4, 2012
    Assignee: QNX Software Systems Limited
    Inventors: Phillip A. Hetherington, Shreyas A. Paranjpe
  • Patent number: 8320530
    Abstract: A method for realizing a multimedia call includes the following steps. A call request initiated by a calling terminal is received. An indication of the multimedia negotiation capability of the calling terminal is acquired, in which the indication of the multimedia negotiation capability identifies whether the terminal has the capability of supporting multiple multimedia negotiations or not. It is determined, according to the indication of the multimedia negotiation capability, whether the calling terminal has the capability of supporting multiple multimedia negotiations or not. A multimedia call connection is performed according to the multimedia negotiation capability of the calling terminal. It is determined, according to the factor whether the calling terminal has the capability of supporting multiple multimedia negotiations or not, how to perform the multimedia call connection, so as to flexibly realize the multimedia call connection accordingly.
    Type: Grant
    Filed: May 4, 2009
    Date of Patent: November 27, 2012
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Sichen Wang, Hui Huang, Peng Wang
  • Patent number: 8321210
    Abstract: An apparatus for encoding includes a first domain converter, a switchable bypass, a second domain converter, a first processor and a second processor to obtain an encoded audio signal having different signal portions represented by coded data in different domains, which have been coded by different coding algorithms. Corresponding decoding stages in the decoder together with a bypass for bypassing a domain converter allow the generation of a decoded audio signal with high quality and low bit rate.
    Type: Grant
    Filed: January 14, 2011
    Date of Patent: November 27, 2012
    Assignees: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V., Voiceage Corporation
    Inventors: Bernhard Grill, Stefan Bayer, Guillaume Fuchs, Stefan Geyersberger, Ralf Geiger, Johannes Hilpert, Ulrich Kraemer, Jeremie Lecomte, Markus Multrus, Max Neuendorf, Harald Popp, Nikolaus Rettelbach, Roch Lefebvre, Bruno Bessette, Jimmy Lapierre, Philippe Gournay, Redwan Salami
  • Publication number: 20120296642
    Abstract: A method and apparatus for speech analysis, comprising detecting an at least one temporal characteristic of an at least one speech of an at least one speaker, and deducing an at least one quantitative score from the at least one temporal characteristic, where the at least one quantitative score indicates an at least one extent of an at least one behavioral aspect of the at least one speaker.
    Type: Application
    Filed: May 19, 2011
    Publication date: November 22, 2012
    Applicant: Nice Systems Ltd.
    Inventors: Sherrie SHAMMASS, Moshe Wasserblat, Oren Lewkowicz, Liron Aichel, Oded Kalchiem, Ishay Levi, Ronit Ephrat, Adee Lavi, Lior Hadaya
  • Patent number: 8315862
    Abstract: An audio signal quality enhancement apparatus and method. The apparatus includes a pitch calculating unit to extract a pitch period of an audio signal, a frequency domain transforming unit to transform the audio signal to a frequency domain, a frequency band dividing unit to classify the transformed audio signal into audio signals for each of the plurality of frequency bands based on the extracted pitch period, and a pitch enhancement unit to determine a gain based on a volume of the transformed audio signal, and to generate an output signal by multiplying each of the classified audio signals with respect to each of the plurality of frequency bands by the gain, thereby enhancing quality of the audio signal.
    Type: Grant
    Filed: June 5, 2009
    Date of Patent: November 20, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jung Hoe Kim, Ho Chong Park, Eun Mi Oh
  • Patent number: 8315857
    Abstract: Systems and methods for modification of an audio input signal are provided. In exemplary embodiments, an adaptive multiple-model optimizer is configured to generate at least one source model parameter for facilitating modification of an analyzed signal. The adaptive multiple-model optimizer comprises a segment grouping engine and a source grouping engine. The segment grouping engine is configured to group simultaneous feature segments to generate at least one segment model. The at least one segment model is used by the source grouping engine to generate at least one source model, which comprises the at least one source model parameter. Control signals for modification of the analyzed signal may then be generated based on the at least one source model parameter.
    Type: Grant
    Filed: May 30, 2006
    Date of Patent: November 20, 2012
    Assignee: Audience, Inc.
    Inventors: David Klein, Stephen Malinowski, Lloyd Watts, Bernard Mont-Reynaud
  • Publication number: 20120263302
    Abstract: A voice transmission apparatus is provided. The voice transmission apparatus includes a voice input unit, an Analog-to-Digital (AD) converter, a transmission synchronization unit, a channel status determination unit, and a transmission unit. The voice input unit receives an analog voice signal corresponding to a communication session. The AD converter converts the analog voice signal into a digital signal. The transmission synchronization unit generates a transmission target signal by respectively assigning a start identifier and an end identifier to the first frame and final frame of the digital signal corresponding to the communication session. The channel status determination unit determines the status of a channel. The transmission unit transmits the transmission target signal over the channel in such a way that the transmission rate of the transmission target signal is set on a communication session basis depending on the status of the channel.
    Type: Application
    Filed: April 6, 2012
    Publication date: October 18, 2012
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Young-Ho SON, Jang-Hong YOON
  • Publication number: 20120265524
    Abstract: A method and apparatus are provided for visualizing the latency in a conversation between a local speaker and at least one remote speaker separated from the local speaker by a communication medium. A latency estimate is obtained. A timing indication of at least the end of a conversational turn by the local speaker is obtained, and an outbound graphic is displayed, indicating the progress of at least the end-of-turn across the communication medium toward the remote speaker. The outbound graphical indication is displayed with a transit time across the medium that is derived from the latency estimate. An inbound graphic is displayed, indicating the progress across the communication medium toward the local speaker, of a start of a conversational turn by the remote speaker, which is imputed to begin when the remote speaker receives the local speaker's end-of-turn. The inbound graphical indication is displayed with a transit time across the medium that is derived from the latency estimate.
    Type: Application
    Filed: April 12, 2011
    Publication date: October 18, 2012
    Applicant: Alcatel-Lucent USA Inc.
    Inventor: James W. McGowan
  • Publication number: 20120265525
    Abstract: In encoding, pitch periods for time series signals in a predetermined time interval are calculated, and a code corresponding thereto is output. In that encoding, the resolutions for expressing the pitch periods and/or a pitch period encoding mode are switched according to whether an index indicating a periodicity and/or stationarity level of the time series signals satisfies a condition indicating high or low in periodicity and/or stationarity. In that decoding, according to whether an index indicating a periodicity and/or stationarity level, the index being included in or obtained from an input code corresponding to the predetermined time interval, satisfies a condition indicating high periodicity and/or stationarity, a decoding mode for a code, included in the input code, corresponding to pitch periods is switched to decode the code corresponding to the pitch periods to obtain the pitch periods corresponding to the predetermined time interval.
    Type: Application
    Filed: January 7, 2011
    Publication date: October 18, 2012
    Applicant: Nippon Telegraph and Telephone Corporation
    Inventors: Takehiro Moriya, Noboru Harada, Yutaka Kamamoto
  • Patent number: 8290770
    Abstract: Provided are a method and apparatus for sinusoidal audio coding, which employs a tracking method for further effective coding of sinusoids extracted in the process of a sinusoidal analysis of parametric coding. The sinusoidal audio coding method includes: extracting sinusoids of a current frame by performing a sinusoidal analysis on an input audio signal; with respect to each of the extracted sinusoids, setting a mode selected from a birth mode in which a sinusoid is newly generated irrespective of sinusoids of a previous frame, a continuation mode in which the sinusoid is only one sinusoid continued from one of the sinusoids of the previous frame, and a branch mode in which the sinusoid is one of a plurality of sinusoids continued from one of the sinusoids of the previous frame; and coding the extracted sinusoids according to the selected mode. Accordingly, a plurality of sinusoids that can be continued from one previous track component are set to the continuation mode or the branch mode.
    Type: Grant
    Filed: February 5, 2008
    Date of Patent: October 16, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Nam-suk Lee, Geon-hyoung Lee, Jae-one Oh, Chul-woo Lee, Jong-hoon Jeong
  • Patent number: 8280729
    Abstract: Methods, and corresponding codec-containing devices are provided that have source coding schemes for encoding a component of an excitation. In some cases, the source coding scheme is an enumerative source coding scheme, while in other cases the source coding scheme is an arithmetic source coding scheme. In some cases, the source coding schemes are applied to encode a fixed codebook component of the excitation for a codec employing codebook excited linear prediction, for example an AMR-WB (Adaptive Multi-Rate-Wideband) speech codec.
    Type: Grant
    Filed: January 22, 2010
    Date of Patent: October 2, 2012
    Assignee: Research In Motion Limited
    Inventors: Xiang Yu, Dake He, En-hui Yang
  • Publication number: 20120245929
    Abstract: In an audio output terminal device, a buffer control unit adjusts the buffer size of a jitter buffer in accordance with the setting of a sound output mode instructed in an instruction receiving unit. If the instruction receiving unit acknowledges an instruction for setting an audio output mode that requires low delay in outputting sound, the buffer control unit reduces the buffer size of the jitter buffer. Further, the buffer control unit controls, in accordance with the instructed setting of the sound output mode, timing for allowing a media buffer to transmit one or more voice packets to the jitter buffer.
    Type: Application
    Filed: September 16, 2010
    Publication date: September 27, 2012
    Applicant: SONY COMPUTER ENTERTAINMENT INC.
    Inventors: Kiyoto Shibuya, Jin Nakamura, Katsuhiko Shibata, Kazuhiro Yanase, Akitoshi Yamaguchi, Akiyoshi Morita, Kouichi Kazama
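
    The buffer control described in 20120245929 can be sketched roughly as follows. The mode names, the buffer sizes, and the classes `JitterBuffer` and `BufferControl` are hypothetical; the abstract states only that the jitter buffer is reduced for a low-delay mode and that transfers from the media buffer are timed according to the mode.

```python
from collections import deque


class JitterBuffer:
    """Bounded packet queue whose capacity can change at run time."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.packets = deque()

    def has_room(self):
        return len(self.packets) < self.capacity


class BufferControl:
    # Hypothetical sizes in packets; the application gives no numbers.
    SIZES = {"normal": 8, "low_delay": 2}

    def __init__(self):
        self.media = deque()  # media buffer feeding the jitter buffer
        self.jitter = JitterBuffer(self.SIZES["normal"])

    def set_mode(self, mode):
        """Resize the jitter buffer for the requested sound output mode."""
        self.jitter.capacity = self.SIZES[mode]

    def pump(self):
        """Move packets from the media buffer while the jitter buffer has room.

        Returns the number of packets transferred.
        """
        moved = 0
        while self.media and self.jitter.has_room():
            self.jitter.packets.append(self.media.popleft())
            moved += 1
        return moved
```

    In this sketch a smaller capacity means fewer packets are queued ahead of playback, which is one plausible way to trade jitter tolerance for lower output delay.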
  • Patent number: 8275603
    Abstract: A speech translating apparatus includes an input unit, a speech recognizing unit, a translating unit, a first dividing unit, a second dividing unit, an associating unit, and an outputting unit. The input unit inputs a speech in a first language. The speech recognizing unit generates a first text from the speech. The translating unit translates the first text into a second language and generates a second text. The first dividing unit divides the first text and generates first phrases. The second dividing unit divides the second text and generates second phrases. The associating unit associates semantically equivalent phrases within each group of phrases. The outputting unit sequentially outputs the associated phrases in a phrase order within the second text.
    Type: Grant
    Filed: September 4, 2007
    Date of Patent: September 25, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kentaro Furihata, Tetsuro Chino, Satoshi Kamatani
  • Patent number: 8271279
    Abstract: A speech enhancement system improves the perceptual quality of a processed voice signal. The system improves the perceptual quality of a voice signal by removing unwanted noise components from a voice signal. The system removes undesirable signals that may result in the loss of information. The system receives and analyzes signals to determine whether an undesired random or persistent signal corresponds to one or more modeled noises. When one or more noise components are detected, the noise components are substantially removed or dampened from the signal to provide a less noisy voice signal.
    Type: Grant
    Filed: November 30, 2006
    Date of Patent: September 18, 2012
    Assignee: QNX Software Systems Limited
    Inventors: Phillip A. Hetherington, Shreyas A. Paranjpe
  • Patent number: 8271293
    Abstract: Provided are, among other things, systems, methods and techniques for decoding an audio signal from a frame-based bit stream. Each frame includes processing information pertaining to the frame and entropy-encoded quantization indexes representing audio data within the frame. The processing information includes: (i) code book indexes, (ii) code book application information specifying ranges of entropy-encoded quantization indexes to which the code books are to be applied, and (iii) window information. The entropy-encoded quantization indexes are decoded by applying the identified code books to the corresponding ranges of entropy-encoded quantization indexes. Subband samples are then generated by dequantizing the decoded quantization indexes, and a sequence of different window functions that were applied within a single frame of the audio data is identified based on the window information.
    Type: Grant
    Filed: March 28, 2011
    Date of Patent: September 18, 2012
    Assignee: Digital Rise Technology Co., Ltd.
    Inventor: Yuli You
  • Patent number: 8271275
    Abstract: A scalable encoding device capable of reducing an encoding rate to reduce a circuit scale while preventing sound quality deterioration of a decoded signal. An extension layer is coarsely divided into a system for processing a first channel and a system for processing a second channel. A sound source predictor for processing the first channel predicts a drive sound source signal of the first channel from a drive sound source signal of a monaural signal, and outputs the predicted drive sound source signal through a multiplier to a first CELP encoder. A sound source predictor for processing the second channel predicts the drive sound source signal of the second channel from the drive sound source signal of the monaural signal and the output from the first CELP encoder, and outputs the predicted drive sound source signal through a multiplier to a second CELP encoder.
    Type: Grant
    Filed: May 29, 2006
    Date of Patent: September 18, 2012
    Assignee: Panasonic Corporation
    Inventors: Michiyo Goto, Koji Yoshida
  • Publication number: 20120221327
    Abstract: A method, a device and a system for voice encoding/decoding are disclosed in the present invention. The method includes: assembling an input pulse code modulation signal into one signal according to a designated time slot and assembly manner; and encoding the assembled signal according to a designated encoding manner to output an encoded voice signal. In the present invention, because a process of assembling or splitting the signal may be implemented through software, in the case that hardware in a current network does not need to be replaced, an effect of encoding/decoding voice with a 7 K spectrum may be achieved in the current network.
    Type: Application
    Filed: May 4, 2012
    Publication date: August 30, 2012
    Applicant: Huawei Technologies Co., Ltd.
    Inventors: Xiaoshuang Li, Xingguo Gao
  • Patent number: 8255206
    Abstract: The voice mixing method includes a first step for selecting voice information from a plurality of voice information, a second step for adding up all the selected voice information, a third step for obtaining a voice signal totaling the voice signals other than one voice signal, of the selected voice signals, a fourth step for encoding the voice information obtained in the second step, a fifth step for encoding the voice signal obtained in the third step, and a sixth step for copying the encoded information obtained in the fourth step into the encoded information in the fifth step.
    Type: Grant
    Filed: August 28, 2007
    Date of Patent: August 28, 2012
    Assignee: NEC Corporation
    Inventors: Hironori Ito, Kazunori Ozawa
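
    The second and third steps of 8255206 (a total mix, plus one mix per selected speaker that excludes that speaker's own voice) can be sketched as below. The function name `mix_minus`, the sample representation, and the subtraction shortcut are assumptions; the encoding and copying steps (four to six) are omitted.

```python
def mix_minus(signals, selected):
    """Return the total mix and one 'everyone but me' mix per selected speaker.

    signals:  dict mapping speaker id -> list of samples (equal lengths).
    selected: speaker ids chosen in the first step (selection rule not shown).
    """
    length = len(next(iter(signals.values())))
    # Second step: add up all the selected voice signals.
    total = [0.0] * length
    for speaker in selected:
        for i, sample in enumerate(signals[speaker]):
            total[i] += sample
    # Third step: for each selected speaker, the sum of the other selected
    # signals, computed here as total minus the speaker's own signal.
    others = {
        speaker: [t - s for t, s in zip(total, signals[speaker])]
        for speaker in selected
    }
    return total, others
```

    A non-selected participant would receive `total`, while each selected speaker would receive the entry in `others` that leaves out their own voice.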
  • Publication number: 20120215528
    Abstract: Provided is a speech recognition system, including: a first information processing device including a speech recognition processing unit for receiving data to be used for speech recognition transmitted via a network, carrying out speech recognition processing, and returning resultant data; and a second information processing device connected to the first information processing device via the network. The second information processing device performs conversion of the data into data having a format that disables a content thereof from being perceived and also enables the speech recognition processing unit to perform the speech recognition processing. Thereafter, the second information processing device transmits the data to be used for the speech recognition by the speech recognition processing unit and constructs resultant data returned from the first information processing device into a content of a valid and perceivable recognition result.
    Type: Application
    Filed: October 12, 2010
    Publication date: August 23, 2012
    Applicant: NEC CORPORATION
    Inventor: Kentaro Nagatomo
  • Patent number: 8244538
    Abstract: A system evaluates a hands free communication system. The system automatically selects a consonant-vowel-consonant (CVC), vowel-consonant-vowel (VCV), or other combination of sounds from an intelligent database. The selection is transmitted with another communication stream that temporally overlaps the selection. The quality of the communication system is evaluated through an automatic speech recognition engine. The evaluation occurs at a location remote from the transmitted selection.
    Type: Grant
    Filed: April 29, 2009
    Date of Patent: August 14, 2012
    Assignee: QNX Software Systems Limited
    Inventors: Shreyas Paranjpe, Mark Fallat
  • Patent number: 8244537
    Abstract: Video signal relative to an imaged audience and audio signal according to voices from the audience are generated in an input unit. A characteristic amount detection unit detects information on a movement amount, movement periodicity, a volume, voice periodicity of the audience, and a frequency component of voices from the audience based on the video signal or the audio signal. An estimation unit estimates an audience state based on the detected result. An output unit outputs the estimated result of the audience state. The audience state can be easily estimated without observing the audience state by a person.
    Type: Grant
    Filed: May 13, 2008
    Date of Patent: August 14, 2012
    Assignee: Sony Corporation
    Inventors: Tetsujiro Kondo, Yuji Okumura, Koichi Fujishima, Tomoyuki Ohtsuki
  • Patent number: 8239190
    Abstract: A method of communicating speech comprising time-warping a residual low band speech signal to an expanded or compressed version of the residual low band speech signal, time-warping a high band speech signal to an expanded or compressed version of the high band speech signal, and merging the time-warped low band and high band speech signals to give an entire time-warped speech signal. In the low band, the residual low band speech signal is synthesized after time-warping of the residual low band signal while in the high band, an unwarped high band signal is synthesized before time-warping of the high band speech signal. The method may further comprise classifying speech segments and encoding the speech segments. The encoding of the speech segments may be one of code-excited linear prediction, noise-excited linear prediction or 1/8 frame (silence) coding.
    Type: Grant
    Filed: August 22, 2006
    Date of Patent: August 7, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Rohit Kapoor, Serafin Diaz Spindola
  • Patent number: 8234411
    Abstract: Methods, systems, computer readable media, and apparatuses for providing enhanced content are presented. Data including a first program, a first caption stream associated with the first program, and a second caption stream associated with the first program may be received. The second caption stream may be extracted from the data, and a second program may be encoded with the second caption stream. The first program may be transmitted with the first caption stream including first captions and may include first content configured to be played back at a first speed. In response to receiving an instruction to change play back speed, the second program may be transmitted with the second caption stream. The second program may include the first content configured to be played back at a second speed different from the first speed, and the second caption stream may include second captions different from the first captions.
    Type: Grant
    Filed: September 2, 2010
    Date of Patent: July 31, 2012
    Assignee: Comcast Cable Communications, LLC
    Inventor: Ross Gilson
  • Patent number: 8223269
    Abstract: In a closed caption production device, video recognition processing of an input video signal is performed by a video recognizer. This causes a working object in video to be recognized. In addition, a sound recognizer performs sound recognition processing of an input sound signal. This causes a position of a sound source to be estimated. A controller performs linking processing by comparing information of the working object recognized by the video recognition processing with positional information of the sound source estimated by the sound recognition processing. This causes a position of a closed caption produced based on the sound signal to be set in the vicinity of the working object in the video.
    Type: Grant
    Filed: September 19, 2007
    Date of Patent: July 17, 2012
    Assignee: Panasonic Corporation
    Inventor: Isao Ikegami
  • Patent number: 8223136
    Abstract: A stream of raw acoustic data can be received at a client device. The client device can frame the stream of raw acoustic data at particular intervals with alignment information to create framed acoustic data, and buffer the framed acoustic data while waiting for a data request from a host device. In response to receiving the data request, the client device can provide the framed acoustic data to the host device.
    Type: Grant
    Filed: June 7, 2005
    Date of Patent: July 17, 2012
    Assignee: Intel Corporation
    Inventors: Yongge Hu, Ying Jia
  • Publication number: 20120179460
    Abstract: A method for testing an automated interactive media system. The method can include establishing a communication session with the automated interactive media system. In response to receiving control and/or media information from the automated interactive media system, pre-recorded control and/or media information can be propagated to the automated interactive media system. The pre-recorded control and/or media information can be recorded in real time.
    Type: Application
    Filed: March 17, 2012
    Publication date: July 12, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: WILLIAM V. DA PALMA, BRIEN H. MUSCHETT
  • Patent number: 8219409
    Abstract: An encoder/decoder for multi-channel audio data, and in particular for audio reproduction through wave field synthesis. The encoder comprises a two-dimensional filter-bank to the multi-channel signal, in which the channel index is treated as an independent variable as well as time, and the resulting spectral coefficients are quantized according to a two-dimensional psychoacoustic model, including masking effect in the spatial frequency as well as in the temporal frequency. The coded spectral data are organized in a bitstream together with side information containing scale factors and Huffman codebook identifiers.
    Type: Grant
    Filed: March 31, 2008
    Date of Patent: July 10, 2012
    Assignee: Ecole Polytechnique Federale De Lausanne
    Inventors: Martin Vetterli, Francisco Pereira Correia Pinto
  • Patent number: 8214202
    Abstract: An audio/speech sender and an audio/speech receiver and methods thereof. The audio/speech sender comprising a core encoder adapted to encode a core frequency band of an input audio/speech signal having a first sampling frequency, wherein the core frequency band comprises frequencies up to a cut-off frequency. The audio/speech sender further comprises a segmentation device adapted to perform a segmentation of the input audio/speech signal into a plurality of segments, a cut-off frequency estimator adapted to estimate a cut-off frequency for each segment and adapted to transmit information about the estimated cut-off frequency to a decoder, a low-pass filter adapted to filter each segment at said estimated cut-off frequency, and a re-sampler adapted to resample the filtered segments with a second sampling frequency that is related to said cut-off frequency in order to generate an audio/speech frame to be encoded by said core encoder.
    Type: Grant
    Filed: September 13, 2006
    Date of Patent: July 3, 2012
    Assignee: Telefonaktiebolaget L M Ericsson (publ)
    Inventor: Stefan Bruhn
  • Patent number: 8212922
    Abstract: An information display apparatus includes a display device configured to display a video, a speech detection unit configured to detect a playback state of a playback speech, a closed caption display unit configured to generate character information associated with the playback speech and display it on the display device together with the video, and a closed caption display unit configured to carry out a changing control for changing according to the detected playback state a display state of the character information that is displayed on the display device by the closed caption display unit.
    Type: Grant
    Filed: September 26, 2007
    Date of Patent: July 3, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kohei Momosaki, Kazuhiko Abe, Yasuyuki Masai, Makoto Yajima, Koichi Yamamoto, Munehiko Sasajima
  • Patent number: 8214219
    Abstract: A speech communications system for a vehicle includes a microphone system provided in the vehicle interior in order to detect audio information. An interaction manager provides grammar information to a speech recognizer. The speech recognizer provides speech recognition results to the interaction manager. An acoustic echo canceller eliminates portions of the audio information detected by the microphone system. A sound localizer determines a sound source location in the vehicle interior. A method of operating a speech communications system in a vehicle is also provided. An interruptible text-to-speech operation provides a speech output to a user. Voice information is requested from the user for a maximum number of times if insufficient voice information or no voice information is provided in response to the speech output provided by the interruptible text-to-speech operation. The dialog context of an unfinished speech interaction is saved.
    Type: Grant
    Filed: September 15, 2006
    Date of Patent: July 3, 2012
    Assignee: Volkswagen of America, Inc.
    Inventors: Ramon Prieto, Rohit Mishra
  • Publication number: 20120162512
    Abstract: Systems, methods, and processor-readable media are disclosed for encoding and transmitting first media content and second media content using a digital radio broadcast system, such that the second media content can be rendered in synchronization with the first media content by a digital radio broadcast receiver. The disclosed systems, methods, and processor-readable media determine when a receiver will render audio and data content that is transmitted at a given time by the digital radio broadcast transmitter, and adjust the media content accordingly to provide synchronized rendering. In exemplary embodiments, these adjustments can be provided by: 1) inserting timing instructions specifying playback time in the secondary content based on calculated delays; or 2) controlling the timing of sending the primary or secondary content to the transmitter so that it will be rendered in synchronization by the receiver.
    Type: Application
    Filed: February 21, 2012
    Publication date: June 28, 2012
    Applicant: iBiquity Digital Corporation
    Inventors: Steven Andrew Johnson, Muthu Gopal Balasubramanian, Harvey Chalmers, Jeffrey Ranken Detweiler, Albert John Gambardella, Russell Iannuzzelli, Stephen Douglas Mattson
  • Patent number: 8209182
    Abstract: An emotion recognition system for assessing human emotional behavior from communication by a speaker includes a processing system configured to receive signals representative of the verbal and/or non-verbal communication. The processing system derives signal features from the received signals. The processing system is further configured to implement at least one intermediate mapping between the signal features and one or more elements of an emotional ontology in order to perform an emotion recognition decision. The emotional ontology provides a gradient representation of the human emotional behavior.
    Type: Grant
    Filed: November 30, 2006
    Date of Patent: June 26, 2012
    Assignee: University of Southern California
    Inventor: Shrikanth S. Narayanan
  • Patent number: 8209168
    Abstract: An audio data transmitting/receiving apparatus for realizing a high-quality frame compensation in audio communications. In an audio data transmitting apparatus (10), a delay part (104) subjects multi-channel audio data to a delay process that delays the L-ch encoded data relative to the R-ch encoded data by a predetermined delay amount. A multiplexing part (106) multiplexes the audio data as subjected to the delay process. A transmitting part (108) transmits the audio data as multiplexed. In an audio data receiving apparatus (20), a separating part (114) separates, for each channel, the audio data received from the audio data transmitting apparatus (10). A decoding part (118) decodes, for each channel, the audio data as separated. If there has occurred a loss or error in the audio data as separated, then a frame compensating part (120) uses one of the L-ch and R-ch encoded data to compensate for the loss or error in the other encoded data.
    Type: Grant
    Filed: May 20, 2005
    Date of Patent: June 26, 2012
    Assignee: Panasonic Corporation
    Inventor: Koji Yoshida
  • Publication number: 20120158402
    Abstract: A system and method for automatically adjusting floor controls based on conversational characteristics is provided. Audio streams are received, which each originate from an audio source. Floor controls for a current configuration including at least a portion of the audio streams are maintained. Conversational characteristics shared by two or more of the audio sources are determined. Possible configurations for the audio streams are identified based on the conversational characteristics. An analysis of the current configuration and the possible configurations is performed. A change threshold comprising a minimum number of timeslices for at least one of the current configuration and one of the possible configurations is applied to the analysis. When the analysis satisfies the change threshold, the floor controls are automatically adjusted. The audio streams are mixed into one or more outputs based on the adjusted floor controls.
    Type: Application
    Filed: February 27, 2012
    Publication date: June 21, 2012
    Applicant: Palo Alto Research Center Incorporated
    Inventors: Paul M. Aoki, Margaret H. Szymanski, James D. Thornton, Daniel H. Wilson, Allison G. Woodruff
  • Publication number: 20120158403
    Abstract: A voice reproduction apparatus includes an ambient sound analysis unit to analyze a characteristic of an ambient sound, a characteristic analysis unit to analyze an acoustic characteristic of a signal for reproduction, a reproduction timing adjusting unit to record the signal for reproduction and to read the signal for reproduction at a reproduction timing of follow-up reproduction, a reproduction speed changing unit to change a reproduction speed of the read signal for reproduction, and a control unit to control the reproduction timing adjusting unit so that the signal for reproduction is reproduced at the reproduction timing corresponding to an analysis result of the ambient sound analysis unit and to control the reproduction speed changing unit so that the signal for reproduction is reproduced at the reproduction speed corresponding to the analysis result of the ambient sound analysis unit and the acoustic characteristic obtained by the characteristic analysis unit.
    Type: Application
    Filed: March 1, 2012
    Publication date: June 21, 2012
    Applicant: Fujitsu Limited
    Inventors: Taro Togawa, Takeshi Otani, Kaori Endo, Yasuji Ota
  • Patent number: 8200479
    Abstract: Methods and mobile devices are provided for asymmetric independent processing of audio streams in a system on a chip (SOC). More specifically, independent audio paths are provided for processors performing audio processing on the SOC and mixing of decoded audio samples from the processors is performed digitally on the SOC by a hardware digital mixer.
    Type: Grant
    Filed: December 23, 2008
    Date of Patent: June 12, 2012
    Assignee: Texas Instruments Incorporated
    Inventors: Stephane Sintes, Franck Seigneret, Christophe Favergeon-Borgialli
  • Publication number: 20120143600
    Abstract: In a speech synthesis information editing apparatus, a phoneme storage unit stores phoneme information that designates a duration of each phoneme of speech to be synthesized. A feature storage unit stores feature information that designates a time variation in a feature of the speech. An edition processing unit changes a duration of each phoneme designated by the phoneme information with an expansion/compression degree depending on a feature designated by the feature information in correspondence to the phoneme.
    Type: Application
    Filed: December 1, 2011
    Publication date: June 7, 2012
    Applicant: Yamaha Corporation
    Inventor: Tatsuya Iriyama
  • Patent number: 8195451
    Abstract: In an information detecting apparatus (1), a speech kind discrimination unit (11) discriminates and classifies an audio signal from an information source into kinds (categories), such as music or speech, on a predetermined time-unit basis, and a memory unit/recording medium (13) records the resulting discrimination information. A discrimination frequency calculating unit (15) calculates, on a predetermined time basis, the discrimination frequency of each kind over a predetermined time period longer than the time unit.
    Type: Grant
    Filed: February 10, 2004
    Date of Patent: June 5, 2012
    Assignee: Sony Corporation
    Inventor: Yasuhiro Toguri
  • Patent number: 8195449
    Abstract: A non-intrusive signal quality assessment apparatus includes a feature vector calculator that determines parameters representing frames of a signal and extracts a collection of per-frame feature vectors representing structural information of the signal from the parameters. A frame selector preferably selects only frames with a feature vector lying within a predetermined multi-dimensional window. Means determine a global feature set over the collection of feature vectors from statistical moments of selected feature vector components. A quality predictor predicts a signal quality measure from the global feature set.
    Type: Grant
    Filed: January 30, 2007
    Date of Patent: June 5, 2012
    Assignee: Telefonaktiebolaget L M Ericsson (Publ)
    Inventors: Stefan Bruhn, Volodya Grancharov, Willem Bastiaan Kleijn
  • Patent number: 8165888
    Abstract: Disclosed is a reproducing apparatus comprising: a reproduction section to reproduce reproduction data comprising sound data and/or image data; a selection section to calculate evaluation values between a link source set for the reproduction data and each of a plurality of link destinations corresponding to the link source by a predetermined arithmetic expression based on link information of the plurality of link destinations, and to select a link destination having a highest evaluation among the evaluation values out of the plurality of link destinations; and a reproduction control section to move a reproduction point of the reproduction data reproduced by the reproduction section to a position corresponding to the link destination by linking the link source with the link destination when the reproduction point reaches a given point with respect to a position corresponding to the link source, and to instruct the reproduction section to reproduce the reproduction data.
    Type: Grant
    Filed: March 14, 2008
    Date of Patent: April 24, 2012
    Assignees: The University of Electro-Communications, Funai Electric Co., Ltd.
    Inventors: Kota Takahashi, Yasuo Masaki
  • Patent number: 8165882
    Abstract: Apparatus and method for generating high quality synthesized speech having smooth waveform concatenation. The apparatus includes a pitch frequency calculation section, a pitch synchronization position calculation section, a unit waveform storage, a unit waveform selection section, a unit waveform generation section, and a waveform synthesis section. The unit waveform generation section includes a conversion ratio calculation section, a sampling rate conversion section, and a unit waveform re-selection section. The conversion ratio calculation section calculates a sampling rate conversion ratio from the pitch information and the position of pitch synchronization, and the sampling rate conversion section converts the sampling rate of the unit waveform, delivered as input, based on the sampling rate conversion ratio.
    Type: Grant
    Filed: September 4, 2006
    Date of Patent: April 24, 2012
    Assignee: NEC Corporation
    Inventors: Masanori Kato, Satoshi Tsukada
  • Patent number: 8160890
    Abstract: It is possible not only to reduce a delay, but also to enhance the coding efficiency and reduce audio artifacts upon coding.
    Type: Grant
    Filed: December 5, 2007
    Date of Patent: April 17, 2012
    Assignee: Panasonic Corporation
    Inventors: Mineo Tsushima, Akihisa Kawamura
  • Patent number: 8155965
    Abstract: In one embodiment, the present invention comprises a vocoder having at least one input and at least one output, an encoder comprising a filter having at least one input operably connected to the input of the vocoder and at least one output, a decoder comprising a synthesizer having at least one input operably connected to the at least one output of the encoder, and at least one output operably connected to the at least one output of the vocoder, wherein the encoder comprises a memory and the encoder is adapted to execute instructions stored in the memory comprising classifying speech segments and encoding speech segments, and the decoder comprises a memory and the decoder is adapted to execute instructions stored in the memory comprising time-warping a residual speech signal to an expanded or compressed version of the residual speech signal.
    Type: Grant
    Filed: May 5, 2005
    Date of Patent: April 10, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Rohit Kapoor, Serafin Diaz Spindola
  • Patent number: 8155972
    Abstract: This invention involves time-scale modification of audio signals. The invention describes overlap-and-add time-scale modification with variable input and output buffer sizes. Seamless speed change is achieved by keeping track of previously processed data to avoid discontinuities during playback speed transitions.
    Type: Grant
    Filed: October 5, 2005
    Date of Patent: April 10, 2012
    Assignee: Texas Instruments Incorporated
    Inventors: Atsuhiro Sakurai, Yoshihide Iwata
  • Patent number: 8155954
    Abstract: A filter bank device for generating a complex spectral representation of a discrete-time signal includes a generator for generating a block-wise real spectral representation, which, for example, implements an MDCT, to obtain temporally successive blocks of real spectral coefficients. The output values of this spectral conversion device are fed to a post-processor for post-processing the block-wise real spectral representation to obtain an approximated complex spectral representation having successive blocks, each block having a set of complex approximated spectral coefficients, wherein a complex approximated spectral coefficient can be represented by a first partial spectral coefficient and by a second partial spectral coefficient, wherein at least one of the first and second partial spectral coefficients is determined by combining at least two real spectral coefficients.
    Type: Grant
    Filed: March 4, 2010
    Date of Patent: April 10, 2012
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Bernd Edler, Stefan Geyersberger
  • Publication number: 20120084081
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for performing trend analysis of speech. A system practicing the method receives a speech trend analysis request having candidate feature constraints, an objective function with respect to a speech trend to be analyzed, and a set of speech record constraints. The system selects a subset of speech records from the group of speech records based on the set of speech record constraints to yield selected speech records, identifies features in the selected speech records based on the set of candidate feature constraints to yield identified features, and assigns a weight to each of the identified features based on the objective function. Then the system ranks the identified features by their respective weights to yield ranked identified features, and outputs at least one of the ranked identified features associated with a speech-based trend in response to the speech trend analysis request.
    Type: Application
    Filed: September 30, 2010
    Publication date: April 5, 2012
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Ilya Dan Melamed, Mazin Gilbert