Time Patents (Class 704/211)

Pulse code modulation (pcm) (Class 704/212)

Zero crossing (Class 704/213)

Voiced or unvoiced (Class 704/214)

Silence decision (Class 704/215)

Correlation function (Class 704/216)

Method and system for separating musical sound source

Patent number: 8340943

Abstract: Provided is an apparatus of separating a musical sound source, which may re-construct mixed signals into target sound sources and other sound sources directly using sound source information performed using a predetermined musical instrument when the sound source information is present, thereby more effectively separating sound sources included in the mixed signal. The apparatus may include a Nonnegative Matrix Partial Co-Factorization (NMPCF) analysis unit to perform an NMPCF analysis on a mixed signal and a predetermined sound source signal using a sound source separation model, and to obtain a plurality of entity matrices based on the analysis result, and a target instrument signal separating unit to separate, from the mixed signal, a target instrument signal corresponding to the predetermined sound source signal by calculating an inner product between the plurality of entity matrices.

Type: Grant

Filed: August 12, 2010

Date of Patent: December 25, 2012

Assignees: Electronics and Telecommunications Research Institute, Postech Academy-Industry Foundation

Inventors: Min Je Kim, Seungjin Choi, Jiho Yoo, Kyeongok Kang, Inseon Jang, Jin-Woo Hong

Psychoacoustic method and system to impose a preferred talking rate through auditory feedback rate adjustment

Patent number: 8340972

Abstract: The present method uses SOLA speech time compression/expansion to alter a speaker's talking rate by adjusting the speech rate at which people hear their own voice. A person speaks at a certain comfort rate, which is established and maintained by their own auditory system's capability to hear their own voice as they speak, i.e., it is a self-auditory feedback mechanism. Changing the rate (112) at which a talker hears their own voice (130, 2012, 2024) will accordingly change their talking rate. This effect is achieved in this invention by employing a real time processing method (110, 402-416, FIG. 10) that temporarily adjusts the speech rate in an effort to impose this psychoacoustic condition, which coerces the speaker into changing their talking rate. This invention permits users to adjust the comfort rate at which they normally speak (124) or to adjust the rate at which others speak to them through the use of a speech processing device or system.

Type: Grant

Filed: June 27, 2003

Date of Patent: December 25, 2012

Assignee: Motorola Mobility LLC

Inventors: Marc Andre Boillot, John Gregory Harris, Thomas Lawrence Reinke

Method and system for efficient pacing of speech for transcription

Patent number: 8332212

Abstract: A method and system for improving the efficiency of real-time and non-real-time speech transcription by machine speech recognizers, human dictation typists, and human voicewriters using speech recognizers. In particular, the pacing with which recorded speech is presented to transcriptionists is automatically adjusted by monitoring the transcriptionists' output by comparing the output acoustically or phonetically to the presented recorded speech as well as monitoring the resulting transcription, and accordingly adjusting the pacing.

Type: Grant

Filed: June 17, 2009

Date of Patent: December 11, 2012

Assignee: Cogi, Inc.

Inventors: Andreas Wittenstein, Mark Cromack

Method of synthesizing of an unvoiced speech signal

Patent number: 8326613

Abstract: The present invention relates to a method of synthesizing a signal comprising the steps of determining required pitch bell locations, mapping the required pitch bell locations onto the signal to provide first pitch bell locations, randomizing the first pitch bell locations to provide second pitch bell locations, windowing the signal on the second pitch bell locations to provide a pitch bell, repeating the aforementioned steps for all required pitch bell locations and performing an overlap and add operation with respect to the pitch bells in order to synthesize the signal.

Type: Grant

Filed: August 25, 2010

Date of Patent: December 4, 2012

Assignee: Koninklijke Philips Electronics N.V.

Inventor: Ercan Ferit Gigi

Repetitive transient noise removal

Patent number: 8326621

Abstract: A system improves the perceptual quality of a speech signal by dampening undesired repetitive transient noises. The system includes a repetitive transient noise detector adapted to detect repetitive transient noise in a received signal. The received signal may include a harmonic and a noise spectrum. The system further includes a repetitive transient noise attenuator that substantially removes or dampens repetitive transient noises from the received signal. The method of dampening the repetitive transient noises includes modeling characteristics of repetitive transient noises; detecting characteristics in the received signal that correspond to the modeled characteristics of the repetitive transient noises; and substantially removing components of the repetitive transient noises from the received signal that correspond to some or all of the modeled characteristics of the repetitive transient noises.

Type: Grant

Filed: November 30, 2011

Date of Patent: December 4, 2012

Assignee: QNX Software Systems Limited

Inventors: Phillip A. Hetherington, Shreyas A. Paranjpe

Method, apparatus and system for realizing a multimedia call

Patent number: 8320530

Abstract: A method for realizing a multimedia call includes the following steps. A call request initiated by a calling terminal is received. An indication of the multimedia negotiation capability of the calling terminal is acquired, in which the indication of the multimedia negotiation capability identifies whether the terminal has the capability of supporting multiple multimedia negotiations or not. It is determined, according to the indication of the multimedia negotiation capability, whether the calling terminal has the capability of supporting multiple multimedia negotiations or not. A multimedia call connection is performed according to the multimedia negotiation capability of the calling terminal. It is determined, according to the factor whether the calling terminal has the capability of supporting multiple multimedia negotiations or not, how to perform the multimedia call connection, so as to flexibly realize the multimedia call connection accordingly.

Type: Grant

Filed: May 4, 2009

Date of Patent: November 27, 2012

Assignee: Huawei Technologies Co., Ltd.

Inventors: Sichen Wang, Hui Huang, Peng Wang

Audio encoding/decoding scheme having a switchable bypass

Patent number: 8321210

Abstract: An apparatus for encoding includes a first domain converter, a switchable bypass, a second domain converter, a first processor and a second processor to obtain an encoded audio signal having different signal portions represented by coded data in different domains, which have been coded by different coding algorithms. Corresponding decoding stages in the decoder together with a bypass for bypassing a domain converter allow the generation of a decoded audio signal with high quality and low bit rate.

Type: Grant

Filed: January 14, 2011

Date of Patent: November 27, 2012

Assignees: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V., Voiceage Corporation

Inventors: Bernhard Grill, Stefan Bayer, Guillaume Fuchs, Stefan Geyersberger, Ralf Geiger, Johannes Hilpert, Ulrich Kraemer, Jeremie Lecomte, Markus Multrus, Max Neuendorf, Harald Popp, Nikolaus Rettelbach, Roch Lefebvre, Bruno Bessette, Jimmy Lapierre, Philippe Gournay, Redwan Salami

METHOD AND APPRATUS FOR TEMPORAL SPEECH SCORING

Publication number: 20120296642

Abstract: A method and apparatus for speech analysis, comprising detecting an at least one temporal characteristic of an at least one speech of an at least one speaker, and deducing an at least one quantitative score from the at least one temporal characteristic, where the at least one quantitative score indicates an at least one extent of an at least one behavioral aspect of the at least one speaker.

Type: Application

Filed: May 19, 2011

Publication date: November 22, 2012

Applicant: Nice Systems Ltd.

Inventors: Sherrie SHAMMASS, Moshe Wasserblat, Oren Lewkowicz, Liron Aichel, Oded Kalchiem, Ishay Levi, Ronit Ephrat, Adee Lavi, Lior Hadaya

Audio signal quality enhancement apparatus and method

Patent number: 8315862

Abstract: An audio signal quality enhancement apparatus and method. The apparatus includes a pitch calculating unit to extract a pitch period of an audio signal, a frequency domain transforming unit to transform the audio signal to a frequency domain, a frequency band dividing unit to classify the transformed audio signal into audio signals for each of the plurality of frequency bands based on the extracted pitch period, and a pitch enhancement unit to determine a gain based on a volume of the transformed audio signal, and to generate an output signal by multiplying each of the classified audio signals with respect to each of the plurality of frequency bands by the gain, thereby enhancing quality of the audio signal.

Type: Grant

Filed: June 5, 2009

Date of Patent: November 20, 2012

Assignee: Samsung Electronics Co., Ltd.

Inventors: Jung Hoe Kim, Ho Chong Park, Eun Mi Oh

Systems and methods for audio signal analysis and modification

Patent number: 8315857

Abstract: Systems and methods for modification of an audio input signal are provided. In exemplary embodiments, an adaptive multiple-model optimizer is configured to generate at least one source model parameter for facilitating modification of an analyzed signal. The adaptive multiple-model optimizer comprises a segment grouping engine and a source grouping engine. The segment grouping engine is configured to group simultaneous feature segments to generate at least one segment model. The at least one segment model is used by the source grouping engine to generate at least one source model, which comprises the at least one source model parameter. Control signals for modification of the analyzed signal may then be generated based on the at least one source model parameter.

Type: Grant

Filed: May 30, 2006

Date of Patent: November 20, 2012

Assignee: Audience, Inc.

Inventors: David Klein, Stephen Malinowski, Lloyd Watts, Bernard Mont-Reynaud

VOICE COMMUNICATION APPARATUS AND METHOD

Publication number: 20120263302

Abstract: A voice transmission apparatus is provided. The voice transmission apparatus includes a voice input unit, an Analog-to-Digital (AD) converter, a transmission synchronization unit, a channel status determination unit, and a transmission unit. The voice input unit receives an analog voice signal corresponding to a communication session. The AD converter converts the analog voice signal into a digital signal. The transmission synchronization unit generates a transmission target signal by respectively assigning a start identifier and an end identifier to the first frame and final frame of the digital signal corresponding to the communication session. The channel status determination unit determines the status of a channel. The transmission unit transmits the transmission target signal over the channel in such a way that the transmission rate of the transmission target signal is set on a communication session basis depending on the status of the channel.

Type: Application

Filed: April 6, 2012

Publication date: October 18, 2012

Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Inventors: Young-Ho SON, Jang-Hong YOON

Method And Apparatus Of Visual Feedback For Latency In Communication Media

Publication number: 20120265524

Abstract: A method and apparatus are provided for visualizing the latency in a conversation between a local speaker and at least one remote speaker separated from the local speaker by a communication medium. A latency estimate is obtained. A timing indication of at least the end of a conversational turn by the local speaker is obtained, and an outbound graphic is displayed, indicating the progress of at least the end-of-turn across the communication medium toward the remote speaker. The outbound graphical indication is displayed with a transit time across the medium that is derived from the latency estimate. An inbound graphic is displayed, indicating the progress across the communication medium toward the local speaker, of a start of a conversational turn by the remote speaker, which is imputed to begin when the remote speaker receives the local speaker's end-of-turn. The inbound graphical indication is displayed with a transit time across the medium that is derived from the latency estimate.

Type: Application

Filed: April 12, 2011

Publication date: October 18, 2012

Applicant: Alcatel-Lucent USA Inc.

Inventor: James W. McGowan

ENCODING METHOD, DECODING METHOD, ENCODER APPARATUS, DECODER APPARATUS, PROGRAM AND RECORDING MEDIUM

Publication number: 20120265525

Abstract: In encoding, pitch periods for time series signals in a predetermined time interval are calculated, and a code corresponding thereto is output. In that encoding, the resolutions for expressing the pitch periods and/or a pitch period encoding mode are switched according to whether an index indicating a periodicity and/or stationarity level of the time series signals satisfies a condition indicating high or low in periodicity and/or stationarity. In that decoding, according to whether an index indicating a periodicity and/or stationarity level, the index being included in or obtained from an input code corresponding to the predetermined time interval, satisfies a condition indicating high periodicity and/or stationarity, a decoding mode for a code, included in the input code, corresponding to pitch periods is switched to decode the code corresponding to the pitch periods to obtain the pitch periods corresponding to the predetermined time interval.

Type: Application

Filed: January 7, 2011

Publication date: October 18, 2012

Applicant: Nippon Telegraph and Telephone Corporation

Inventors: Takehiro Moriya, Noboru Harada, Yutaka Kamamoto

Method and apparatus for sinusoidal audio coding

Patent number: 8290770

Abstract: Provided are a method and apparatus for sinusoidal audio coding, which employs a tracking method for further effective coding of sinusoids extracted in the process of a sinusoidal analysis of parametric coding. The sinusoidal audio coding method includes: extracting sinusoids of a current frame by performing a sinusoidal analysis on an input audio signal; with respect to each of the extracted sinusoids, setting a mode selected from a birth mode in which a sinusoid is newly generated irrespective of sinusoids of a previous frame, a continuation mode in which the sinusoid is only one sinusoid continued from one of the sinusoids of the previous frame, and a branch mode in which the sinusoid is one of a plurality of sinusoids continued from one of the sinusoids of the previous frame; and coding the extracted sinusoids according to the selected mode. Accordingly, a plurality of sinusoids that can be continued from one previous track component are set to the continuation mode or the branch mode.
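The three tracking modes named in this abstract can be illustrated with a minimal sketch. It is not the patented method: the nearest-frequency linking rule, the `max_delta_hz` tolerance, and the function name `assign_modes` are all assumptions made for illustration.

```python
def assign_modes(prev_freqs, cur_freqs, max_delta_hz=50.0):
    """Label each current-frame sinusoid as birth, continuation, or branch.

    Assumption: a sinusoid "continues" the previous-frame sinusoid that is
    nearest in frequency, provided the gap is within max_delta_hz.
    """
    parents = []
    for f in cur_freqs:
        best = None
        for j, p in enumerate(prev_freqs):
            gap = abs(f - p)
            if gap <= max_delta_hz and (best is None or gap < abs(f - prev_freqs[best])):
                best = j
        parents.append(best)

    # Count how many current sinusoids continue each previous sinusoid.
    children = {}
    for p in parents:
        if p is not None:
            children[p] = children.get(p, 0) + 1

    modes = []
    for p in parents:
        if p is None:
            modes.append("birth")          # no predecessor in the previous frame
        elif children[p] == 1:
            modes.append("continuation")   # the only sinusoid continued from p
        else:
            modes.append("branch")         # one of several continued from p
    return modes


modes = assign_modes([440.0, 1000.0], [445.0, 980.0, 1020.0, 3000.0])
```

Here the two sinusoids near 1000 Hz share one predecessor and are labeled as branches, while the 3000 Hz sinusoid has no predecessor and is labeled as a birth.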

Type: Grant

Filed: February 5, 2008

Date of Patent: October 16, 2012

Assignee: Samsung Electronics Co., Ltd.

Inventors: Nam-suk Lee, Geon-hyoung Lee, Jae-one Oh, Chul-woo Lee, Jong-hoon Jeong

System and method for encoding and decoding pulse indices

Patent number: 8280729

Abstract: Methods, and corresponding codec-containing devices are provided that have source coding schemes for encoding a component of an excitation. In some cases, the source coding scheme is an enumerative source coding scheme, while in other cases the source coding scheme is an arithmetic source coding scheme. In some cases, the source coding schemes are applied to encode a fixed codebook component of the excitation for a codec employing codebook excited linear prediction, for example an AMR-WB (Adaptive Multi-Rate-Wideband) speech codec.

Type: Grant

Filed: January 22, 2010

Date of Patent: October 2, 2012

Assignee: Research In Motion Limited

Inventors: Xiang Yu, Dake He, En-hui Yang

TERMINAL DEVICE, AUDIO OUTPUT METHOD, AND INFORMATION PROCESSING SYSTEM

Publication number: 20120245929

Abstract: In an audio output terminal device, a buffer control unit adjusts the buffer size of a jitter buffer in accordance with the setting of a sound output mode instructed in an instruction receiving unit. If the instruction receiving unit acknowledges an instruction for setting an audio output mode that requires low delay in outputting sound, the buffer control unit reduces the buffer size of the jitter buffer. Further, the buffer control unit controls, in accordance with the instructed setting of the sound output mode, timing for allowing a media buffer to transmit one or more voice packets to the jitter buffer.

Type: Application

Filed: September 16, 2010

Publication date: September 27, 2012

Applicant: SONY COMPUTER ENTERTAINMENT INC.

Inventors: Kiyoto Shibuya, Jin Nakamura, Katsuhiko Shibata, Kazuhiro Yanase, Akitoshi Yamaguchi, Akiyoshi Morita, Kouichi Kazama

Apparatus performing translation process from inputted speech

Patent number: 8275603

Abstract: A speech translating apparatus includes an input unit, a speech recognizing unit, a translating unit, a first dividing unit, a second dividing unit, an associating unit, and an outputting unit. The input unit inputs a speech in a first language. The speech recognizing unit generates a first text from the speech. The translating unit translates the first text into a second language and generates a second text. The first dividing unit divides the first text and generates first phrases. The second dividing unit divides the second text and generates second phrases. The associating unit associates semantically equivalent phrases within each group of phrases. The outputting unit sequentially outputs the associated phrases in a phrase order within the second text.

Type: Grant

Filed: September 4, 2007

Date of Patent: September 25, 2012

Assignee: Kabushiki Kaisha Toshiba

Inventors: Kentaro Furihata, Tetsuro Chino, Satoshi Kamatani

Signature noise removal

Patent number: 8271279

Abstract: A speech enhancement system improves the perceptual quality of a processed voice signal. The system improves the perceptual quality of a voice signal by removing unwanted noise components from a voice signal. The system removes undesirable signals that may result in the loss of information. The system receives and analyzes signals to determine whether an undesired random or persistent signal corresponds to one or more modeled noises. When one or more noise components are detected, the noise components are substantially removed or dampened from the signal to provide a less noisy voice signal.

Type: Grant

Filed: November 30, 2006

Date of Patent: September 18, 2012

Assignee: QNX Software Systems Limited

Inventors: Phillip A. Hetherington, Shreyas A. Paranjpe

Audio decoding using variable-length codebook application ranges

Patent number: 8271293

Abstract: Provided are, among other things, systems, methods and techniques for decoding an audio signal from a frame-based bit stream. Each frame includes processing information pertaining to the frame and entropy-encoded quantization indexes representing audio data within the frame. The processing information includes: (i) code book indexes, (ii) code book application information specifying ranges of entropy-encoded quantization indexes to which the code books are to be applied, and (iii) window information. The entropy-encoded quantization indexes are decoded by applying the identified code books to the corresponding ranges of entropy-encoded quantization indexes. Subband samples are then generated by dequantizing the decoded quantization indexes, and a sequence of different window functions that were applied within a single frame of the audio data is identified based on the window information.

Type: Grant

Filed: March 28, 2011

Date of Patent: September 18, 2012

Assignee: Digital Rise Technology Co., Ltd.

Inventor: Yuli You

Scalable encoding device, and scalable encoding method

Patent number: 8271275

Abstract: A scalable encoding device capable of reducing an encoding rate to reduce a circuit scale while preventing sound quality deterioration of a decoded signal. An extension layer is coarsely divided into a system for processing a first channel and a system for processing a second channel. A sound source predictor for processing the first channel predicts a drive sound source signal of the first channel from a drive sound source signal of a monaural signal, and outputs the predicted drive sound source signal through a multiplier to a first CELP encoder. A sound source predictor for processing the second channel predicts the drive sound source signal of the second channel from the drive sound source signal of the monaural signal and the output from the first CELP encoder, and outputs the predicted drive sound source signal through a multiplier to a second CELP encoder.

Type: Grant

Filed: May 29, 2006

Date of Patent: September 18, 2012

Assignee: Panasonic Corporation

Inventors: Michiyo Goto, Koji Yoshida

METHOD, DEVICE AND SYSTEM FOR VOICE ENCODING/DECODING

Publication number: 20120221327

Abstract: A method, a device and a system for voice encoding/decoding are disclosed in the present invention. The method includes: assembling an input pulse code modulation signal into one signal according to a designated time slot and assembly manner; and encoding the assembled signal according to a designated encoding manner to output an encoded voice signal. In the present invention, because a process of assembling or splitting the signal may be implemented through software, in the case that hardware in a current network does not need to be replaced, an effect of encoding/decoding voice with a 7 K spectrum may be achieved in the current network.

Type: Application

Filed: May 4, 2012

Publication date: August 30, 2012

Applicant: Huawei Technologies Co., Ltd.

Inventors: Xiaoshuang Li, Xingguo Gao

Voice mixing method and multipoint conference server and program using the same method

Patent number: 8255206

Abstract: The voice mixing method includes a first step for selecting voice information from a plurality of voice information, a second step for adding up all the selected voice information, a third step for obtaining a voice signal totaling the voice signals other than one voice signal, of the selected voice signals, a fourth step for encoding the voice information obtained in the second step, a fifth step for encoding the voice signal obtained in the third step, and a sixth step for copying the encoded information obtained in the fourth step into the encoded information in the fifth step.
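The second and third steps of this abstract (a total mix, plus a per-speaker mix that leaves out that speaker's own voice) can be sketched as follows. This is only an illustration under assumptions: the function name `mix_minus` and the sample data are hypothetical, the selection step is taken as given, and the encoding and copy steps are omitted.

```python
def mix_minus(signals, selected):
    """Return the total mix and, per selected speaker, the mix without them.

    signals:  dict mapping speaker id -> list of samples (equal lengths)
    selected: speaker ids chosen by the selection step (assumed given here)
    """
    n = len(next(iter(signals.values())))

    # Second step: add up all the selected voice signals.
    total = [sum(signals[s][i] for s in selected) for i in range(n)]

    # Third step: for each selected speaker, the total minus their own signal,
    # so that a speaker does not hear their own voice in the mix.
    per_speaker = {
        s: [total[i] - signals[s][i] for i in range(n)] for s in selected
    }
    return total, per_speaker


signals = {"a": [1, 2, 3], "b": [10, 20, 30], "c": [100, 200, 300]}
total, minus = mix_minus(signals, ["a", "b"])
```

In this sketch, participants who were not selected (here `"c"`) would simply receive `total`, which is why the total only needs to be encoded once.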

Type: Grant

Filed: August 28, 2007

Date of Patent: August 28, 2012

Assignee: NEC Corporation

Inventors: Hironori Ito, Kazunori Ozawa

SPEECH RECOGNITION SYSTEM, SPEECH RECOGNITION REQUEST DEVICE, SPEECH RECOGNITION METHOD, SPEECH RECOGNITION PROGRAM, AND RECORDING MEDIUM

Publication number: 20120215528

Abstract: Provided is a speech recognition system, including: a first information processing device including a speech recognition processing unit for receiving data to be used for speech recognition transmitted via a network, carrying out speech recognition processing, and returning resultant data; and a second information processing device connected to the first information processing device via the network. The second information processing device performs conversion of the data into data having a format that disables a content thereof from being perceived and also enables the speech recognition processing unit to perform the speech recognition processing. Thereafter, the second information processing device transmits the data to be used for the speech recognition by the speech recognition processing unit and constructs resultant data returned from the first information processing device into a content of a valid and perceivable recognition result.

Type: Application

Filed: October 12, 2010

Publication date: August 23, 2012

Applicant: NEC CORPORATION

Inventor: Kentaro Nagatomo

Measuring double talk performance

Patent number: 8244538

Abstract: A system evaluates a hands free communication system. The system automatically selects a consonant-vowel-consonant (CVC), vowel-consonant-vowel (VCV), or other combination of sounds from an intelligent database. The selection is transmitted with another communication stream that temporally overlaps the selection. The quality of the communication system is evaluated through an automatic speech recognition engine. The evaluation occurs at a location remote from the transmitted selection.

Type: Grant

Filed: April 29, 2009

Date of Patent: August 14, 2012

Assignee: QNX Software Systems Limited

Inventors: Shreyas Paranjpe, Mark Fallat

Audience state estimation system, audience state estimation method, and audience state estimation program

Patent number: 8244537

Abstract: Video signal relative to an imaged audience and audio signal according to voices from the audience are generated in an input unit. A characteristic amount detection unit detects information on a movement amount, movement periodicity, a volume, voice periodicity of the audience, and a frequency component of voices from the audience based on the video signal or the audio signal. An estimation unit estimates an audience state based on the detected result. An output unit outputs the estimated result of the audience state. The audience state can be easily estimated without observing the audience state by a person.

Type: Grant

Filed: May 13, 2008

Date of Patent: August 14, 2012

Assignee: Sony Corporation

Inventors: Tetsujiro Kondo, Yuji Okumura, Koichi Fujishima, Tomoyuki Ohtsuki

Time-warping frames of wideband vocoder

Patent number: 8239190

Abstract: A method of communicating speech comprising time-warping a residual low band speech signal to an expanded or compressed version of the residual low band speech signal, time-warping a high band speech signal to an expanded or compressed version of the high band speech signal, and merging the time-warped low band and high band speech signals to give an entire time-warped speech signal. In the low band, the residual low band speech signal is synthesized after time-warping of the residual low band signal while in the high band, an unwarped high band signal is synthesized before time-warping of the high band speech signal. The method may further comprise classifying speech segments and encoding the speech segments. The encoding of the speech segments may be one of code-excited linear prediction, noise-excited linear prediction or 1/8 frame (silence) coding.

Type: Grant

Filed: August 22, 2006

Date of Patent: August 7, 2012

Assignee: QUALCOMM Incorporated

Inventors: Rohit Kapoor, Serafin Diaz Spindola

Providing enhanced content

Patent number: 8234411

Abstract: Methods, systems, computer readable media, and apparatuses for providing enhanced content are presented. Data including a first program, a first caption stream associated with the first program, and a second caption stream associated with the first program may be received. The second caption stream may be extracted from the data, and a second program may be encoded with the second caption stream. The first program may be transmitted with the first caption stream including first captions and may include first content configured to be played back at a first speed. In response to receiving an instruction to change play back speed, the second program may be transmitted with the second caption stream. The second program may include the first content configured to be played back at a second speed different from the first speed, and the second caption stream may include second captions different from the first captions.

Type: Grant

Filed: September 2, 2010

Date of Patent: July 31, 2012

Assignee: Comcast Cable Communications, LLC

Inventor: Ross Gilson

Closed caption production device, method and program for synthesizing video, sound and text

Patent number: 8223269

Abstract: In a closed caption production device, video recognition processing of an input video signal is performed by a video recognizer. This causes a working object in video to be recognized. In addition, a sound recognizer performs sound recognition processing of an input sound signal. This causes a position of a sound source to be estimated. A controller performs linking processing by comparing information of the working object recognized by the video recognition processing with positional information of the sound source estimated by the sound recognition processing. This causes a position of a closed caption produced based on the sound signal to be set in the vicinity of the working object in the video.

Type: Grant

Filed: September 19, 2007

Date of Patent: July 17, 2012

Assignee: Panasonic Corporation

Inventor: Isao Ikegami

Error detection and prevention in acoustic data

Patent number: 8223136

Abstract: A stream of raw acoustic data can be received at a client device. The client device can frame the stream of raw acoustic data at particular intervals with alignment information to create framed acoustic data, and buffer the framed acoustic data while waiting for a data request from a host device. In response to receiving the data request, the client device can provide the framed acoustic data to the host device.
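The framing and buffering flow in this abstract might look roughly like the sketch below. The frame layout (a sync byte, a sequence number, and a payload length) is an assumption chosen for illustration, as are the names `SYNC_BYTE`, `frame_stream`, and `ClientBuffer`; the abstract does not specify the form of the alignment information.

```python
from collections import deque

SYNC_BYTE = 0xA5  # hypothetical alignment marker, not taken from the patent


def frame_stream(raw, payload_size):
    """Split raw bytes into frames of [sync, sequence, length] + payload.

    payload_size must be at most 255 so the length fits in one header byte.
    """
    frames = []
    for seq, start in enumerate(range(0, len(raw), payload_size)):
        payload = raw[start:start + payload_size]
        header = bytes([SYNC_BYTE, seq & 0xFF, len(payload)])
        frames.append(header + payload)
    return frames


class ClientBuffer:
    """Holds framed data on the client until the host asks for it."""

    def __init__(self):
        self._frames = deque()

    def push(self, raw, payload_size):
        self._frames.extend(frame_stream(raw, payload_size))

    def on_data_request(self, max_frames):
        # Hand over the oldest frames first, up to the requested count.
        out = []
        while self._frames and len(out) < max_frames:
            out.append(self._frames.popleft())
        return out


frames = frame_stream(b"abcdefg", 3)
```

With a header like this, a host that receives a frame whose first byte is not the sync marker, or whose sequence number skips, can tell that data was lost or misaligned.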

Type: Grant

Filed: June 7, 2005

Date of Patent: July 17, 2012

Assignee: Intel Corporation

Inventors: Yongge Hu, Ying Jia

CREATION AND USE OF TEST CASES FOR AUTOMATED TESTING OF MEDIA-BASED APPLICATIONS

Publication number: 20120179460

Abstract: A method for testing an automated interactive media system. The method can include establishing a communication session with the automated interactive media system. In response to receiving control and/or media information from the automated interactive media system, pre-recorded control and/or media information can be propagated to the automated interactive media system. The pre-recorded control and/or media information can be recorded in real time.

Type: Application

Filed: March 17, 2012

Publication date: July 12, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: WILLIAM V. DA PALMA, BRIEN H. MUSCHETT

Audio wave field encoding

Patent number: 8219409

Abstract: An encoder/decoder for multi-channel audio data, and in particular for audio reproduction through wave field synthesis. The encoder applies a two-dimensional filter-bank to the multi-channel signal, in which the channel index is treated as an independent variable as well as time, and the resulting spectral coefficients are quantized according to a two-dimensional psychoacoustic model, including masking effect in the spatial frequency as well as in the temporal frequency. The coded spectral data are organized in a bitstream together with side information containing scale factors and Huffman codebook identifiers.

Type: Grant

Filed: March 31, 2008

Date of Patent: July 10, 2012

Assignee: Ecole Polytechnique Federale De Lausanne

Inventors: Martin Vetterli, Francisco Pereira Correia Pinto

Methods and arrangements for a speech/audio sender and receiver

Patent number: 8214202

Abstract: An audio/speech sender and an audio/speech receiver and methods thereof. The audio/speech sender comprising a core encoder adapted to encode a core frequency band of an input audio/speech signal having a first sampling frequency, wherein the core frequency band comprises frequencies up to a cut-off frequency. The audio/speech sender further comprises a segmentation device adapted to perform a segmentation of the input audio/speech signal into a plurality of segments, a cut-off frequency estimator adapted to estimate a cut-off frequency for each segment and adapted to transmit information about the estimated cut-off frequency to a decoder, a low-pass filter adapted to filter each segment at said estimated cut-off frequency, and a re-sampler adapted to resample the filtered segments with a second sampling frequency that is related to said cut-off frequency in order to generate an audio/speech frame to be encoded by said core encoder.
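The per-segment cut-off estimate and the choice of a second sampling frequency can be sketched as below. This is an illustration under assumptions, not the patented estimator: the 99% cumulative-energy criterion, the naive DFT, the rule of resampling at twice the cut-off (the Nyquist rate), and the function names are all hypothetical. The low-pass filter and re-sampler themselves are omitted.

```python
import cmath
import math


def estimate_cutoff_hz(segment, sample_rate, energy_fraction=0.99):
    """Lowest frequency below which energy_fraction of the energy lies.

    Uses a naive O(n^2) DFT, which is fine for short illustrative segments.
    """
    n = len(segment)
    energies = []
    for k in range(n // 2 + 1):
        coeff = sum(
            segment[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        energies.append(abs(coeff) ** 2)

    total = sum(energies)
    accumulated = 0.0
    for k, energy in enumerate(energies):
        accumulated += energy
        if accumulated >= energy_fraction * total:
            return k * sample_rate / n
    return sample_rate / 2


def second_sampling_rate(cutoff_hz):
    # Assumption: resample at the Nyquist rate for the estimated cut-off.
    return 2 * cutoff_hz


# A 1 kHz tone sampled at 16 kHz; 64 samples put the tone exactly on bin 4.
segment = [math.sin(2 * math.pi * 1000 * t / 16000) for t in range(64)]
cutoff = estimate_cutoff_hz(segment, 16000)
new_rate = second_sampling_rate(cutoff)
```

For this tone the estimate lands on 1 kHz, so the segment could be resampled at about 2 kHz before core encoding instead of being carried at the full 16 kHz.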

Type: Grant

Filed: September 13, 2006

Date of Patent: July 3, 2012

Assignee: Telefonaktiebolaget L M Ericsson (publ)

Inventor: Stefan Bruhn
Information display apparatus, information display method and program therefor

Patent number: 8212922

Abstract: An information display apparatus includes a display device configured to display a video, a speech detection unit configured to detect a playback state of a playback speech, a closed caption display unit configured to generate character information associated with the playback speech and display it on the display device together with the video, and a closed caption display unit configured to carry out a changing control for changing according to the detected playback state a display state of the character information that is displayed on the display device by the closed caption display unit.

Type: Grant

Filed: September 26, 2007

Date of Patent: July 3, 2012

Assignee: Kabushiki Kaisha Toshiba

Inventors: Kohei Momosaki, Kazuhiko Abe, Yasuyuki Masai, Makoto Yajima, Koichi Yamamoto, Munehiko Sasajima
Speech communications system for a vehicle and method of operating a speech communications system for a vehicle

Patent number: 8214219

Abstract: A speech communications system for a vehicle includes a microphone system provided in the vehicle interior in order to detect audio information. An interaction manager provides grammar information to a speech recognizer. The speech recognizer provides speech recognition results to the interaction manager. An acoustic echo canceller eliminates portions of the audio information detected by the microphone system. A sound localizer determines a sound source location in the vehicle interior. A method of operating a speech communications system in a vehicle is also provided. An interruptible text-to-speech operation provides a speech output to a user. Voice information is requested from the user for a maximum number of times if insufficient voice information or no voice information is provided in response to the speech output provided by the interruptible text-to-speech operation. The dialog context of an unfinished speech interaction is saved.

Type: Grant

Filed: September 15, 2006

Date of Patent: July 3, 2012

Assignee: Volkswagen of America, Inc.

Inventors: Ramon Prieto, Rohit Mishra
Systems and Methods for Transmitting Media Content via Digital Radio Broadcast Transmission for Synchronized Rendering by a Receiver

Publication number: 20120162512

Abstract: Systems, methods, and processor-readable media are disclosed for encoding and transmitting first media content and second media content using a digital radio broadcast system, such that the second media content can be rendered in synchronization with the first media content by a digital radio broadcast receiver. The disclosed systems, methods, and processor-readable media determine when a receiver will render audio and data content that is transmitted at a given time by the digital radio broadcast transmitter, and adjust the media content accordingly to provide synchronized rendering. In exemplary embodiments, these adjustments can be provided by: 1) inserting timing instructions specifying playback time in the secondary content based on calculated delays; or 2) controlling the timing of sending the primary or secondary content to the transmitter so that it will be rendered in synchronization by the receiver.

Type: Application

Filed: February 21, 2012

Publication date: June 28, 2012

Applicant: iBiquity Digital Corporation

Inventors: Steven Andrew Johnson, Muthu Gopal Balasubramanian, Harvey Chalmers, Jeffrey Ranken Detweiler, Albert John Gambardella, Russell Iannuzzelli, Stephen Douglas Mattson
Emotion recognition system

Patent number: 8209182

Abstract: An emotion recognition system for assessing human emotional behavior from communication by a speaker includes a processing system configured to receive signals representative of the verbal and/or non-verbal communication. The processing system derives signal features from the received signals. The processing system is further configured to implement at least one intermediate mapping between the signal features and one or more elements of an emotional ontology in order to perform an emotion recognition decision. The emotional ontology provides a gradient representation of the human emotional behavior.

Type: Grant

Filed: November 30, 2006

Date of Patent: June 26, 2012

Assignee: University of Southern California

Inventor: Shrikanth S. Narayanan
Stereo decoder that conceals a lost frame in one channel using data from another channel

Patent number: 8209168

Abstract: An audio data transmitting/receiving apparatus for realizing a high-quality frame compensation in audio communications. In an audio data transmitting apparatus (10), a delay part (104) subjects multi-channel audio data to a delay process that delays the L-ch encoded data relative to the R-ch encoded data by a predetermined delay amount. A multiplexing part (106) multiplexes the audio data as subjected to the delay process. A transmitting part (108) transmits the audio data as multiplexed. In an audio data receiving apparatus (20), a separating part (114) separates, for each channel, the audio data received from the audio data transmitting apparatus (10). A decoding part (118) decodes, for each channel, the audio data as separated. If there has occurred a loss or error in the audio data as separated, then a frame compensating part (120) uses one of the L-ch and R-ch encoded data to compensate for the loss or error in the other encoded data.

Type: Grant

Filed: May 20, 2005

Date of Patent: June 26, 2012

Assignee: Panasonic Corporation

Inventor: Koji Yoshida
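The delay-and-compensate idea in the abstract of patent 8209168 can be illustrated with a minimal sketch. It is not the patented implementation: the function names `multiplex` and `receive`, the packet layout, and the choice to conceal a lost frame by simply copying the other channel's frame for the same time index are all assumptions made for illustration. Frames are opaque values here, and no real codec is involved.

```python
def multiplex(left, right, delay):
    """Pair R-ch frame n with L-ch frame n - delay in one packet.

    `delay` (in frames) is a hypothetical stand-in for the abstract's
    "predetermined delay amount".
    """
    packets = []
    for n in range(len(right) + delay):
        r = right[n] if n < len(right) else None
        l = left[n - delay] if 0 <= n - delay < len(left) else None
        packets.append((l, r))
    return packets


def receive(packets, delay, lost):
    """Separate the channels, then conceal losses from the other channel.

    `lost` is the set of packet indices that never arrived.
    """
    n_frames = len(packets) - delay
    left = [None] * n_frames
    right = [None] * n_frames
    for n, (l, r) in enumerate(packets):
        if n in lost:
            continue  # both frames carried by this packet are gone
        if r is not None:
            right[n] = r
        if l is not None:
            left[n - delay] = l
    # Because the channels are offset, a single lost packet removes
    # frames with different time indices, so the other channel's frame
    # for the same time index is usually still available.
    for t in range(n_frames):
        if left[t] is None and right[t] is not None:
            left[t] = right[t]
        elif right[t] is None and left[t] is not None:
            right[t] = left[t]
    return left, right


if __name__ == "__main__":
    L = ["L0", "L1", "L2", "L3"]
    R = ["R0", "R1", "R2", "R3"]
    print(receive(multiplex(L, R, delay=1), delay=1, lost={2}))
```

With a delay of one frame, losing packet 2 drops L-ch frame 1 and R-ch frame 2, and each is filled in from the surviving channel. A real decoder would presumably derive a substitute signal rather than copy the frame verbatim.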
System And Method For Adjusting Floor Controls Based On Conversational Characteristics Of Participants

Publication number: 20120158402

Abstract: A system and method for automatically adjusting floor controls based on conversational characteristics is provided. Audio streams are received, which each originate from an audio source. Floor controls for a current configuration including at least a portion of the audio streams are maintained. Conversational characteristics shared by two or more of the audio sources are determined. Possible configurations for the audio streams are identified based on the conversational characteristics. An analysis of the current configuration and the possible configurations is performed. A change threshold comprising a minimum number of timeslices for at least one of the current configuration and one of the possible configurations is applied to the analysis. When the analysis satisfies the change threshold, the floor controls are automatically adjusted. The audio streams are mixed into one or more outputs based on the adjusted floor controls.

Type: Application

Filed: February 27, 2012

Publication date: June 21, 2012

Applicant: PALO ALTO RESEARCH CENTER INCORPORATED

Inventors: Paul M. Aoki, Margaret H. Szymanski, James D. Thornton, Daniel H. Wilson, Allison G. Woodruff
VOICE REPRODUCTION APPARATUS AND VOICE REPRODUCTION METHOD

Publication number: 20120158403

Abstract: A voice reproduction apparatus includes an ambient sound analysis unit to analyze a characteristic of an ambient sound, a characteristic analysis unit to analyze an acoustic characteristic of a signal for reproduction, a reproduction timing adjusting unit to record the signal for reproduction and to read the signal for reproduction at a reproduction timing of follow-up reproduction, a reproduction speed changing unit to change a reproduction speed of the read signal for reproduction, and a control unit to control the reproduction timing adjusting unit so that the signal for reproduction is reproduced at the reproduction timing corresponding to an analysis result of the ambient sound analysis unit and to control the reproduction speed changing unit so that the signal for reproduction is reproduced at the reproduction speed corresponding to the analysis result of the ambient sound analysis unit and the acoustic characteristic obtained by the characteristic analysis unit.

Type: Application

Filed: March 1, 2012

Publication date: June 21, 2012

Applicant: FUJITSU LIMITED

Inventors: Taro TOGAWA, Takeshi Otani, Kaori Endo, Yasuji Ota
Method and system for asymmetric independent audio rendering

Patent number: 8200479

Abstract: Methods and mobile devices are provided for asymmetric independent processing of audio streams in a system on a chip (SOC). More specifically, independent audio paths are provided for processors performing audio processing on the SOC and mixing of decoded audio samples from the processors is performed digitally on the SOC by a hardware digital mixer.

Type: Grant

Filed: December 23, 2008

Date of Patent: June 12, 2012

Assignee: Texas Instruments Incorporated

Inventors: Stephane Sintes, Franck Seigneret, Christophe Favergeon-Borgialli
Speech Synthesis Information Editing Apparatus

Publication number: 20120143600

Abstract: In a speech synthesis information editing apparatus, a phoneme storage unit stores phoneme information that designates a duration of each phoneme of speech to be synthesized. A feature storage unit stores feature information that designates a time variation in a feature of the speech. An edition processing unit changes a duration of each phoneme designated by the phoneme information with an expansion/compression degree depending on a feature designated by the feature information in correspondence to the phoneme.

Type: Application

Filed: December 1, 2011

Publication date: June 7, 2012

Applicant: Yamaha Corporation

Inventor: Tatsuya IRIYAMA
Apparatus and method for detecting speech and music portions of an audio signal

Patent number: 8195451

Abstract: In an information detecting apparatus (1), a speech kind discrimination unit (11) discriminates and classifies an audio signal from an information source into kinds (categories) such as music or speech on a predetermined time basis, and a memory unit/recording medium (13) records the resulting discrimination information. A discrimination frequency calculating unit (15) calculates, on a predetermined time basis, the discrimination frequency of each kind over a predetermined time period longer than the time unit.

Type: Grant

Filed: February 10, 2004

Date of Patent: June 5, 2012

Assignee: Sony Corporation

Inventor: Yasuhiro Toguri
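As a rough illustration of the discrimination frequency described in the abstract of patent 8195451, the sketch below takes one label per short time unit and reports the share of each kind over a longer, non-overlapping period. The function name `discrimination_frequency`, the label strings, and the use of non-overlapping periods are assumptions. The abstract does not say how the periods are laid out or how the per-unit labels are produced.

```python
from collections import Counter


def discrimination_frequency(labels, period):
    """Share of each kind within consecutive periods of `period` labels.

    `labels` holds one classification result per short time unit, e.g.
    "speech" or "music". Trailing labels that do not fill a complete
    period are ignored.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    result = []
    for start in range(0, len(labels) - period + 1, period):
        counts = Counter(labels[start:start + period])
        result.append({kind: c / period for kind, c in counts.items()})
    return result


if __name__ == "__main__":
    labels = ["speech", "speech", "music", "speech",
              "music", "music", "music", "music"]
    for i, freq in enumerate(discrimination_frequency(labels, period=4)):
        print(i, freq)
```

Aggregating over a longer period smooths out isolated misclassifications: a single "music" label inside a mostly-speech period lowers the speech share only slightly instead of flipping the decision.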
Low-complexity, non-intrusive speech quality assessment

Patent number: 8195449

Abstract: A non-intrusive signal quality assessment apparatus includes a feature vector calculator that determines parameters representing frames of a signal and extracts a collection of per-frame feature vectors representing structural information of the signal from the parameters. A frame selector preferably selects only frames with a feature vector lying within a predetermined multi-dimensional window. Means determine a global feature set over the collection of feature vectors from statistical moments of selected feature vector components. A quality predictor predicts a signal quality measure from the global feature set.

Type: Grant

Filed: January 30, 2007

Date of Patent: June 5, 2012

Assignee: Telefonaktiebolaget L M Ericsson (Publ)

Inventors: Stefan Bruhn, Volodya Grancharov, Willem Bastiaan Kleijn
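The frame selection and global feature set in the abstract of patent 8195449 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: the window is modeled as per-component (low, high) bounds, and the moments used (mean, variance, skewness) are chosen for the example because the abstract only says "statistical moments". The function names are hypothetical, and the quality predictor itself is left out.

```python
def select_frames(features, window):
    """Keep frames whose every component lies inside its (low, high) bounds."""
    return [
        f for f in features
        if all(lo <= x <= hi for x, (lo, hi) in zip(f, window))
    ]


def global_feature_set(features):
    """Concatenate mean, variance and skewness of each component.

    `features` is a non-empty list of equal-length per-frame vectors.
    """
    n = len(features)
    dims = len(features[0])
    result = []
    for d in range(dims):
        column = [f[d] for f in features]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        if var > 0:
            skew = (sum((x - mean) ** 3 for x in column) / n) / var ** 1.5
        else:
            skew = 0.0  # a constant component has no asymmetry
        result.extend([mean, var, skew])
    return result


if __name__ == "__main__":
    frames = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [100.0, 20.0]]
    window = [(0.0, 5.0), (0.0, 50.0)]   # hypothetical bounds
    kept = select_frames(frames, window)
    print(len(kept), global_feature_set(kept))
```

The resulting fixed-length vector summarizes a signal of any duration, which is what allows a single predictor to map it to one quality score.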
Reproducing apparatus

Patent number: 8165888

Abstract: Disclosed is a reproducing apparatus comprising: a reproduction section to reproduce reproduction data comprising sound data and/or image data; a selection section to calculate evaluation values between a link source set for the reproduction data and each of a plurality of link destinations corresponding to the link source by a predetermined arithmetic expression based on link information of the plurality of link destinations, and to select a link destination having a highest evaluation among the evaluation values out of the plurality of link destinations; and a reproduction control section to move a reproduction point of the reproduction data reproduced by the reproduction section to a position corresponding to the link destination by linking the link source with the link destination when the reproduction point reaches a given point with respect to a position corresponding to the link source, and to instruct the reproduction section to reproduce the reproduction data.

Type: Grant

Filed: March 14, 2008

Date of Patent: April 24, 2012

Assignees: The University of Electro-Communications, Funai Electric Co., Ltd.

Inventors: Kota Takahashi, Yasuo Masaki
Method, apparatus and program for speech synthesis

Patent number: 8165882

Abstract: Apparatus and method for generating high quality synthesized speech having smooth waveform concatenation. The apparatus includes a pitch frequency calculation section, a pitch synchronization position calculation section, a unit waveform storage, a unit waveform selection section, a unit waveform generation section, and a waveform synthesis section. The unit waveform generation section includes a conversion ratio calculation section, a sampling rate conversion section, and a unit waveform re-selection section. The conversion ratio calculation section calculates a sampling rate conversion ratio from the pitch information and the position of pitch synchronization, and the sampling rate conversion section converts the sampling rate of the unit waveform, delivered as input, based on the sampling rate conversion ratio.

Type: Grant

Filed: September 4, 2006

Date of Patent: April 24, 2012

Assignee: NEC Corporation

Inventors: Masanori Kato, Satoshi Tsukada
Audio signal coding method and decoding method

Patent number: 8160890

Abstract: It is possible not only to reduce delay, but also to enhance coding efficiency and reduce audio artifacts upon coding.

Type: Grant

Filed: December 5, 2007

Date of Patent: April 17, 2012

Assignee: Panasonic Corporation

Inventors: Mineo Tsushima, Akihisa Kawamura
Time warping frames inside the vocoder by modifying the residual

Patent number: 8155965

Abstract: In one embodiment, the present invention comprises a vocoder having at least one input and at least one output, an encoder comprising a filter having at least one input operably connected to the input of the vocoder and at least one output, a decoder comprising a synthesizer having at least one input operably connected to the at least one output of the encoder, and at least one output operably connected to the at least one output of the vocoder, wherein the encoder comprises a memory and the encoder is adapted to execute instructions stored in the memory comprising classifying speech segments and encoding speech segments, and the decoder comprises a memory and the decoder is adapted to execute instructions stored in the memory comprising time-warping a residual speech signal to an expanded or compressed version of the residual speech signal.

Type: Grant

Filed: May 5, 2005

Date of Patent: April 10, 2012

Assignee: QUALCOMM Incorporated

Inventors: Rohit Kapoor, Serafin Diaz Spindola
Seamless audio speed change based on time scale modification

Patent number: 8155972

Abstract: This invention involves time-scale modification of audio signals. The invention describes overlap and add time scale modification with variable input and output buffer sizes. Seamless speed change is achieved by keeping track of previously processed data to avoid discontinuities during playback speed transitions.

Type: Grant

Filed: October 5, 2005

Date of Patent: April 10, 2012

Assignee: Texas Instruments Incorporated

Inventors: Atsuhiro Sakurai, Yoshihide Iwata
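For readers unfamiliar with overlap-and-add time-scale modification, as named in the abstract of patent 8155972, here is a minimal sketch of the plain technique. It assumes a fixed frame length, a Hann window with 50% overlap on the output side, and an input hop that scales with the playback speed. The function names and parameters are hypothetical. It does not implement the patent's variable buffer sizes or its bookkeeping for seamless speed transitions.

```python
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def ola_time_stretch(samples, speed, frame_len=256):
    """Plain overlap-add time-scale modification.

    speed > 1 shortens the signal (faster playback); speed < 1 lengthens it.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    synthesis_hop = frame_len // 2           # fixed output hop (50% overlap)
    analysis_hop = synthesis_hop * speed     # input hop scales with speed
    window = hann(frame_len)
    n_frames = max(1, int((len(samples) - frame_len) / analysis_hop) + 1)
    out_len = (n_frames - 1) * synthesis_hop + frame_len
    out = [0.0] * out_len
    norm = [0.0] * out_len
    for k in range(n_frames):
        src = int(round(k * analysis_hop))   # where the frame is read
        dst = k * synthesis_hop              # where the frame is written
        for i in range(frame_len):
            if src + i >= len(samples):
                break
            out[dst + i] += samples[src + i] * window[i]
            norm[dst + i] += window[i]
    # Divide by the accumulated window weight to keep the gain constant.
    return [o / w if w > 1e-9 else 0.0 for o, w in zip(out, norm)]


if __name__ == "__main__":
    tone = [math.sin(2.0 * math.pi * 440.0 * t / 8000.0) for t in range(4096)]
    print(len(tone), len(ola_time_stretch(tone, 2.0)), len(ola_time_stretch(tone, 0.5)))
```

Plain overlap-add ignores waveform alignment between frames, so it can produce audible phase artifacts on periodic signals. Practical systems usually add a similarity search or similar alignment step before overlapping frames.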
Device and method for generating a complex spectral representation of a discrete-time signal

Patent number: 8155954

Abstract: A filter bank device for generating a complex spectral representation of a discrete-time signal includes a generator for generating a block-wise real spectral representation, which, for example, implements an MDCT, to obtain temporally successive blocks of real spectral coefficients. The output values of this spectral conversion device are fed to a post-processor for post-processing the block-wise real spectral representation to obtain an approximated complex spectral representation having successive blocks, each block having a set of complex approximated spectral coefficients, wherein a complex approximated spectral coefficient can be represented by a first partial spectral coefficient and by a second partial spectral coefficient, wherein at least one of the first and second partial spectral coefficients is determined by combining at least two real spectral coefficients.

Type: Grant

Filed: March 4, 2010

Date of Patent: April 10, 2012

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Bernd Edler, Stefan Geyersberger
SYSTEM AND METHOD FOR PERFORMING SPEECH ANALYTICS

Publication number: 20120084081

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for performing trend analysis of speech. A system practicing the method receives a speech trend analysis request having candidate feature constraints, an objective function with respect to a speech trend to be analyzed, and a set of speech record constraints. The system selects a subset of speech records from the group of speech records based on the set of speech record constraints to yield selected speech records, identifies features in the selected speech records based on the set of candidate feature constraints to yield identified features, and assigns a weight to each of the identified features based on the objective function. Then the system ranks the identified features by their respective weights to yield ranked identified features, and outputs at least one of the ranked identified features associated with a speech-based trend in response to the speech trend analysis request.

Type: Application

Filed: September 30, 2010

Publication date: April 5, 2012

Applicant: AT&T Intellectual Property I, L.P.

Inventors: ILYA Dan MELAMED, Mazin Gilbert
