Modification Of At Least One Characteristic Of Speech Waves (EPO) Patents (Class 704/E21.001)

E Subclasses

Speech enhancement, e.g., noise reduction, echo cancellation, etc. (EPO) (Class 704/E21.002)

Time compression or expansion (EPO) (Class 704/E21.017)

Suppression or repetition of time signal segments (EPO) (Class 704/E21.018)

Transformation of speech into a nonaudible representation, e.g., speech visualization, speech processing for tactile aids, etc. (EPO) (Class 704/E21.019)

Synchronization of speech with image or synthesis of the lips movement from speech, e.g., for "talking heads," etc. (EPO) (Class 704/E21.02)

VOICE INPUT DEVICE, METHOD FOR MANUFACTURING THE SAME, AND INFORMATION PROCESSING SYSTEM

Publication number: 20110172996

Abstract: A voice input device, a method for manufacturing the same, and an information processing system are provided. The voice input device has a function of removing a noise component and includes a first microphone 710-1 that includes a first vibrating membrane, a second microphone 710-2 that includes a second vibrating membrane, and a differential signal generation section 720 that generates a differential signal that represents a difference between a first voltage signal and a second voltage signal. The first and second vibrating membranes are disposed so that a noise intensity ratio is smaller than an input voice intensity ratio that represents the ratio to intensity of an input voice component.

Type: Application

Filed: May 20, 2009

Publication date: July 14, 2011

Applicants: FUNAI ELECTRIC CO., LTD., FUNAI ELECTRIC ADVANCED APPLIED TECHNOLOGY RESEARCH INSTITUTE INC.

Inventors: Rikuo Takano, Kiyoshi Sugiyama, Toshimi Fukuoka, Masatoshi Ono, Ryusuke Horibe, Fuminori Tanaka, Takeshi Inoda
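
Illustrative note on publication 20110172996: the differential arrangement in the abstract can be sketched in a few lines. This is a minimal sketch, not the applicants' implementation. It assumes each source reaches the two membranes as a scaled copy of itself, and reads "intensity ratio" as the level of a source in the differential signal relative to its level at the first microphone. The gain values and the helper names `rms`, `differential_signal`, and `intensity_ratio` are hypothetical.

```python
import math

def rms(signal):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))

def differential_signal(first_voltage, second_voltage):
    """Sample-by-sample difference between the two microphone signals."""
    return [a - b for a, b in zip(first_voltage, second_voltage)]

def intensity_ratio(source, gain_mic1, gain_mic2):
    """Level of a source in the differential signal relative to its level at microphone 1."""
    mic1 = [gain_mic1 * s for s in source]
    mic2 = [gain_mic2 * s for s in source]
    return rms(differential_signal(mic1, mic2)) / rms(mic1)

# Hypothetical gains: a nearby voice falls off sharply between the two
# membranes, while distant noise arrives at almost the same level on both.
n = 200
voice = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
noise = [math.sin(2 * math.pi * 23 * i / n) for i in range(n)]
voice_ratio = intensity_ratio(voice, 1.0, 0.4)
noise_ratio = intensity_ratio(noise, 0.5, 0.48)
print(noise_ratio < voice_ratio)
```

With these assumed gains the noise is attenuated far more than the voice in the differential signal, which is the condition the abstract places on the membrane arrangement.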
AUTOMATIC VIDEO STREAM SELECTION

Publication number: 20110164105

Abstract: A handheld communication device is used to capture video streams and generate a multiplexed video stream. The handheld communication device has at least two cameras facing in two opposite directions. The handheld communication device receives a first video stream and a second video stream simultaneously from the two cameras. The handheld communication device detects a speech activity of a person captured in the video streams. The speech activity may be detected from direction of sound or lip movement of the person. Based on the detection, the handheld communication device automatically switches between the first video stream and the second video stream to generate a multiplexed video stream. The multiplexed video stream interleaves segments of the first video stream and segments of the second video stream. Other embodiments are also described and claimed.

Type: Application

Filed: January 6, 2010

Publication date: July 7, 2011

Applicant: Apple Inc.

Inventors: Jae Han Lee, E-Cheng Chang
Embedded Speech and Audio Coding Using a Switchable Model Core

Publication number: 20110161087

Abstract: A method for processing an audio signal including classifying an input frame as either a speech frame or a generic audio frame, producing an encoded bitstream and a corresponding processed frame based on the input frame, producing an enhancement layer encoded bitstream based on a difference between the input frame and the processed frame, and multiplexing the enhancement layer encoded bitstream, a codeword, and either a speech encoded bitstream or a generic audio encoded bitstream into a combined bitstream based on whether the codeword indicates that the input frame is classified as a speech frame or as a generic audio frame, wherein the encoded bitstream is either a speech encoded bitstream or a generic audio encoded bitstream.

Type: Application

Filed: December 31, 2009

Publication date: June 30, 2011

Applicant: Motorola, Inc.

Inventors: James P. ASHLEY, Jonathan A. Gibbs, Udar Mittal
SYSTEMS AND METHODS FOR IDENTIFYING SPEECH SOUND FEATURES

Publication number: 20110153321

Abstract: Systems and methods for detecting features in spoken speech and processing speech sounds based on the features are provided. One or more features may be identified in a speech sound. The speech sound may be modified to enhance or reduce the degree to which the feature affects the sound ultimately heard by a listener. Systems and methods according to embodiments of the invention may allow for automatic speech recognition devices that enhance detection and recognition of spoken sounds, such as by a user of a hearing aid or other device.

Type: Application

Filed: July 2, 2009

Publication date: June 23, 2011

Applicant: THE BOARD OF TRUSTEES OF THE UNIVERSITY OF ILLINOI

Inventors: Jont B. Allen, Feipeng LI
Device and Method for Booting Handheld Apparatus by Voice Control

Publication number: 20110153332

Abstract: A device for booting a handheld apparatus by voice control includes a base, a power-on device, a trigger switch, and an acoustic sensor. Upon the handheld apparatus being placed at the base to trigger the trigger switch, the trigger switch controls the power-on device to power on the handheld apparatus. After the handheld apparatus is powered on, the acoustic sensor detects a sound of the handheld apparatus and then controls a pressure head of the power-on device to move away. The device and its method for booting a handheld apparatus by voice control come with the advantages of a simple and easy operation and a high efficiency.

Type: Application

Filed: September 30, 2010

Publication date: June 23, 2011

Applicants: INVENTEC APPLIANCES (SHANGHAI) CO. LTD., INVENTEC APPLIANCES CORP.

Inventor: Shang-Fei Hu
AUDIO AND SPEECH PROCESSING WITH OPTIMAL BIT-ALLOCATION FOR CONSTANT BIT RATE APPLICATIONS

Publication number: 20110153315

Abstract: Methods and apparatus for audio and speech processing including generating a plurality of frames, each of the frames comprising a plurality of transform coefficients, and allocating bits to the transform coefficients in each of the frames such that at least two of the transform coefficients in the same frame have different bit allocations and the total number of the bits allocated to the transform coefficients in at least two of the frames is equal.

Type: Application

Filed: February 2, 2010

Publication date: June 23, 2011

Applicant: QUALCOMM Incorporated

Inventors: Somdeb Majumdar, Amin Fazeldehkordi, Harinath Garudadri
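
Illustrative note on publication 20110153315: the abstract requires unequal bit allocations within a frame but an equal bit total across frames. One way to picture that is a greedy allocator with a fixed per-frame budget. The greedy rule below (give the next bit to the coefficient with the largest remaining quantization error, assuming each bit halves that error) is a textbook stand-in, not the method claimed in the application, and `allocate_bits` is a hypothetical name.

```python
def allocate_bits(coefficients, budget):
    """Distribute exactly `budget` bits over the transform coefficients of one frame."""
    bits = [0] * len(coefficients)
    for _ in range(budget):
        # Assumed error model: every bit already spent halves the residual error.
        k = max(range(len(coefficients)),
                key=lambda i: abs(coefficients[i]) / (2 ** bits[i]))
        bits[k] += 1
    return bits

# Two hypothetical frames of transform coefficients, same budget for each.
frames = [[8.0, 1.0, 0.5, 0.1], [0.2, 3.0, 3.0, 0.1]]
budget = 8
allocations = [allocate_bits(frame, budget) for frame in frames]
for frame_bits in allocations:
    print(frame_bits, sum(frame_bits))
```

Every frame consumes the same number of bits, which suits a constant-bit-rate channel, while larger coefficients inside a frame receive more bits than smaller ones.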
SIGNAL PROCESSING METHOD AND APPARATUS

Publication number: 20110150227

Abstract: Provided is a signal processing method which calculates a correlation coefficient indicating the degree of relation in a stereo signal and extracts a speech signal from the stereo signal by using the correlation coefficient and the stereo signal.

Type: Application

Filed: October 28, 2010

Publication date: June 23, 2011

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventor: Sun-min KIM
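
Illustrative note on publication 20110150227: the abstract names only a correlation coefficient and the stereo signal as inputs. The sketch below assumes the Pearson correlation between the left and right channels, and assumes speech is mixed equally into both channels so that a correlation-weighted mid signal approximates it. Both assumptions, and the function names, are illustrative and do not come from the application.

```python
import math

def correlation_coefficient(left, right):
    """Pearson correlation between the two channels of a stereo signal."""
    n = len(left)
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    cov = sum((l - mean_l) * (r - mean_r) for l, r in zip(left, right))
    var_l = sum((l - mean_l) ** 2 for l in left)
    var_r = sum((r - mean_r) ** 2 for r in right)
    if var_l == 0 or var_r == 0:
        return 0.0  # a silent or constant channel carries no correlation information
    return cov / math.sqrt(var_l * var_r)

def extract_speech(left, right):
    """Weight the mid signal (L+R)/2 by the non-negative part of the correlation."""
    rho = max(0.0, correlation_coefficient(left, right))
    return [rho * 0.5 * (l + r) for l, r in zip(left, right)]

# Hypothetical test tone present identically in both channels.
tone = [math.sin(2 * math.pi * 3 * i / 64) for i in range(64)]
centered = extract_speech(tone, tone)
opposed = extract_speech(tone, [-s for s in tone])
```

Under these assumptions, fully correlated content passes through unchanged and anti-correlated content is suppressed. A practical system would more likely compute the coefficient per short frame or per frequency band.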
Method for Generating Voice Signal in E-Books and an E-Book Reader

Publication number: 20110153331

Abstract: The present invention provides a method for generating voice signal in electronic books (E-books). The method includes the steps of: receiving a voice signal in response to a triggering signal for placing a bookmark; and displaying a functional icon of the bookmark corresponding to the voice signal in a region of the E-book. The present invention also provides an E-book reader, including: a display unit, a receiver unit, and a processing unit, wherein the receiver unit receives a voice signal in response to a triggering signal for placing a bookmark, and the processing unit is used to display a functional icon of the bookmark corresponding to the voice signal in a region of the E-book.

Type: Application

Filed: May 13, 2010

Publication date: June 23, 2011

Applicant: INVENTEC APPLIANCES CORP.

Inventors: Yuan-Hua DONG, Liang HUANG, Shih-Kuang TSAI
EMBEDDER FOR EMBEDDING A WATERMARK INTO AN INFORMATION REPRESENTATION, DETECTOR FOR DETECTING A WATERMARK IN AN INFORMATION REPRESENTATION, METHOD AND COMPUTER PROGRAM

Publication number: 20110144998

Abstract: An embedder for embedding a watermark to be embedded into an input information representation comprises an embedding parameter determiner that is implemented to apply a derivation function once or several times to an initial value to obtain an embedding parameter for embedding the watermark into the input information representation. Further, the embedder comprises a watermark adder that is implemented to provide the input information representation with the watermark using the embedding parameter. The embedder is implemented to select how many times the derivation function is to be applied to the initial value.

Type: Application

Filed: March 3, 2009

Publication date: June 16, 2011

Inventors: Bernhard Grill, Ernst Eberlein, Stefan Kraegeloh, Joerg Pickel, Juliane Borsum
Decoding speech signals

Publication number: 20110137644

Abstract: A method, terminal and program for processing a speech signal, in which the speech signal is received over a network from a transmitting device, wherein the frequency components in the received speech signal are limited to a predetermined frequency range and the received speech signal has been filtered using a transmitter frequency response over the predetermined frequency range. The received speech signal is decoded. The decoded speech signal is filtered using a receiver frequency response which is complementary to the transmitter frequency response over the predetermined frequency range to thereby reduce distortion in the speech signal introduced over the predetermined frequency range by using said transmitter frequency response.

Type: Application

Filed: October 6, 2010

Publication date: June 9, 2011

Applicant: Skype Limited

Inventors: Mattias Nilsson, Stefan Strommer, Soren Vang Andersen
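
Illustrative note on publication 20110137644: a simple way to picture a "complementary" receiver response is one whose gain in each band is the reciprocal of the transmitter gain, so the cascade is flat over the limited frequency range. That reading of "complementary" is an assumption, as are the per-band linear gains and the function names below.

```python
def receiver_response(transmitter_gains):
    """Per-band gains that undo the transmitter gains (assumed: simple reciprocal)."""
    if any(g <= 0 for g in transmitter_gains):
        raise ValueError("transmitter gains must be positive to be invertible")
    return [1.0 / g for g in transmitter_gains]

def apply_response(band_levels, gains):
    """Scale each band level by the matching gain."""
    return [level * gain for level, gain in zip(band_levels, gains)]

# Hypothetical three-band example within the predetermined frequency range.
transmitter = [0.5, 1.0, 2.0]
original = [1.0, 0.8, 0.3]
sent = apply_response(original, transmitter)
restored = apply_response(sent, receiver_response(transmitter))
```

In this sketch the restored band levels match the original ones, which corresponds to the reduction in distortion that the abstract attributes to the complementary filter.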
REAL-TIME VOIP COMMUNICATIONS USING N-WAY SELECTIVE LANGUAGE PROCESSING

Publication number: 20110134910

Abstract: A computer-implemented method and system of enabling concurrent real-time multi-language communication between multiple participants using a selective broadcast protocol, the method including receiving at a first server a real-time communication from a first participant, the real-time communication being addressed to a second participant constructed in a first spoken language. A preferred spoken language of receipt of real-time communication is identified by the second participant. A determination is made whether the preferred spoken language of receipt is different than that of the first spoken language of the real-time communication.

Type: Application

Filed: December 8, 2009

Publication date: June 9, 2011

Applicant: International Business Machines Corporation

Inventors: Chi-Chuen Chao-Suren, Ezra D.B. Hall, Pascal A. Nsame, Ayidn Suren, Sebastien T. Ventrone
SYSTEM AND METHODS FOR FACILITATING COLLABORATION OF A GROUP

Publication number: 20110134204

Abstract: A system and method for facilitating collaboration of a group. The system and method provide a ubiquitous anytime/everywhere environment realized through fixed and mobile technologies and scaffolded by group support software. The system includes a collaboration engine having an architecture that supports both generic collaborative processes along with task specific team processes instantiated through a sophisticated suite of advanced modular technologies. The collaboration engine drives dynamic and real time collaborative problem solving and decision making by integrating sensor and human data from the field with group support software that efficiently and effectively manages team interaction.

Type: Application

Filed: December 5, 2008

Publication date: June 9, 2011

Applicant: FLORIDA GULF COAST UNIVERSITY

Inventors: Walter Rodriguez, Augusto Opdenbosch, Deborah S. Carstens, Brian Goldiez, Stephen M. Fiore, Veton Kepuska
MULTI-MODE SPEECH RECOGNITION

Publication number: 20110131040

Abstract: A method and an in-vehicle system having a speech recognition component are provided for improving speech recognition performance. The speech recognition component may have multiple vocabulary dictionaries, each of which may include phonetics associated with commands. When the in-vehicle system receives speech input, the speech recognition component may determine whether the received speech input includes a speech access command. If the received speech input is determined to include a speech access command, then a dictionary changing component may transition a currently-used dictionary of the speech recognition component to a vocabulary dictionary associated with the determined speech access command. Otherwise, the dictionary changing component may transition the currently-used dictionary to a first vocabulary dictionary. A command included in the received speech input may then be recognized by the speech recognition component using the transitioned currently-used dictionary.

Type: Application

Filed: December 1, 2009

Publication date: June 2, 2011

Applicant: HONDA MOTOR CO., LTD

Inventors: Ritchie Huang, Stuart M. Yamamoto, David M. Kirsch
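
Illustrative note on publication 20110131040: the dictionary-switching logic in the abstract can be sketched as a small state machine. The sketch assumes speech input has already been reduced to word tokens, which skips the phonetic matching the abstract mentions. The class name, the dictionary contents, and the access command "music" are hypothetical.

```python
class DictionarySwitcher:
    """Selects the active vocabulary dictionary before recognizing a command."""

    def __init__(self, dictionaries, access_commands, first_dictionary):
        self.dictionaries = dictionaries          # name -> set of command words
        self.access_commands = access_commands    # access word -> dictionary name
        self.first_dictionary = first_dictionary  # fallback named in the abstract
        self.current = first_dictionary

    def recognize(self, tokens):
        """Switch dictionaries as described, then return the first known command or None."""
        access = next((t for t in tokens if t in self.access_commands), None)
        if access is not None:
            self.current = self.access_commands[access]
        else:
            self.current = self.first_dictionary
        vocabulary = self.dictionaries[self.current]
        return next((t for t in tokens if t in vocabulary), None)

switcher = DictionarySwitcher(
    dictionaries={"general": {"call", "navigate"}, "audio": {"play", "pause"}},
    access_commands={"music": "audio"},
    first_dictionary="general",
)
print(switcher.recognize(["music", "play"]))
print(switcher.recognize(["call"]))
```

Restricting each lookup to one small vocabulary is what the abstract presents as the route to better recognition performance: fewer candidate commands compete for any one utterance.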
Systems And Methods For Synthesis Of Motion For Animation Of Virtual Heads/Characters Via Voice Processing In Portable Devices

Publication number: 20110131041

Abstract: Systems and methods consistent with the innovations herein relate to communication using a virtual humanoid animated during call processing. According to one exemplary implementation, the animation may be performed using a system of recognition of spoken vowels for animation of the lips, which may also be associated with the recognition of DTMF tones for animation of head movements and facial features. The innovations herein may be generally implemented in portable devices such as PDAs, cell phones and Smart Phones that have access to mobile telephony.

Type: Application

Filed: June 18, 2010

Publication date: June 2, 2011

Inventors: Paulo Cesar Cortez, Rodrigo Carvalho Souza Costa, Robson Da Silva Siqueira, Cincinato Furtado Leite Neto, Fabio Cisne Ribeiro, Francisco Jose Marques Anselmo, Raphael Torres Santos Carvalho, Antonio Carlos Da Silva Barros, Cesar Lincoln Cavalcante Mattos, Jose Marques Soares
COMPLEX ACOUSTIC RESONANCE SPEECH ANALYSIS SYSTEM

Publication number: 20110131039

Abstract: A method and apparatus are provided for determining an instantaneous frequency and an instantaneous bandwidth of a speech resonance of a speech signal. The method includes receiving a speech signal having a real component; filtering the speech signal so as to generate a plurality of filtered signals such that the real component and an imaginary component of the speech signal are reconstructed; and generating a first estimated frequency and a first estimated bandwidth of a speech resonance of the speech signal based on both a first filtered signal of the plurality of filtered signals and a single-lag delay of the first filtered signal.

Type: Application

Filed: December 1, 2009

Publication date: June 2, 2011

Inventors: John P. Kroeker, Janet Slifka, Richard S. McGowan
RATE-DISTORTION OPTIMIZATION FOR ADVANCED AUDIO CODING

Publication number: 20110125506

Abstract: A method for optimization of rate-distortion for Advanced Audio Coding (AAC). The method provides for the identification of quantized spectral coefficient sequences for optimization of rate-distortion. The method also provides joint optimization of scale factors, Huffman codebooks and quantized spectral coefficient sequences for minimization of a rate-distortion cost. The method provides an iterative rate-distortion optimization algorithm for AAC encoding. In each iteration, the method first finds the optimal scale factors and quantized spectral coefficients when Huffman codebooks are fixed, then updates Huffman codebooks and quantized spectral coefficients given the optimized scale factors. The iterations may be applied until a predetermined threshold is attained.

Type: Application

Filed: November 26, 2009

Publication date: May 26, 2011

Applicant: RESEARCH IN MOTION LIMITED

Inventors: Guixing WU, En-hui YANG, Longji WANG
Speech Processing and Learning

Publication number: 20110123965

Abstract: This invention relates to the field of tonal language speech signal processing. We describe a computer system for characterizing samples of a tonal language. These are analyzed to identify one or more vocal tract characterizing parameters of the user and synthesized speech data is generated by modifying a variation of fundamental frequency with time using a set of standard tones. The synthesized speech data represents the user speaking the tonal language with the modified fundamental frequency. Graphical feedback to guide the user can also be provided.

Type: Application

Filed: November 22, 2010

Publication date: May 26, 2011

Inventor: Kai Yu
Method of putting identification codes in a document

Publication number: 20110125502

Abstract: A method of putting identification codes in a document is disclosed. The method adds a speech-purpose print code in a document such that an OID pen can emit sound after the OID pen reads the speech-purpose print code. The software program first acquires the position of each word in the document and then automatically puts a speech-purpose print code corresponding to each word in the position of each word so that a user can rapidly generate a document with speech-purpose codes.

Type: Application

Filed: August 30, 2010

Publication date: May 26, 2011

Applicant: KUO-PING YANG

Inventors: Mardianto Soebagio Hadiputro, Kun-Yi Hua, Hwa-Pey Wang, Chih-Kang Yang, Kuo-Ping Yang
Voice-recognition/voice-activated vehicle signal system

Publication number: 20110119062

Abstract: A control system is operable within a host vehicle to control the operation of signaling apparatus indicative of a driver intent to execute right, left or U-turn actions. The control system includes a voice recognition circuit for activating turn signal devices within the vehicle. In some embodiments, a wireless link facilitates aftermarket applications while in other embodiments original equipment manufacture is accommodated.

Type: Application

Filed: August 24, 2010

Publication date: May 19, 2011

Inventor: Jewel L. Dohan
METHOD AND SYSTEM FOR DIALOG ENHANCEMENT

Publication number: 20110119061

Abstract: A method and system for enhancing dialog determined by an audio input signal. In some embodiments the input signal is a stereo signal, and the system includes an analysis subsystem configured to analyze the stereo signal to generate filter control values, and a filtering subsystem including upmixing circuitry configured to upmix the input signal to generate a speech channel and non-speech channels and a peaking filter configured to filter the speech channel to enhance dialog while being steered by at least one of the control values. The filtering subsystem also includes ducking circuitry for attenuating the non-speech channels while being steered by at least some of the control values, and downmixing circuitry configured to combine outputs of the peaking filter and ducking circuitry to generate a filtered stereo output.

Type: Application

Filed: November 15, 2010

Publication date: May 19, 2011

Applicant: DOLBY LABORATORIES LICENSING CORPORATION

Inventor: Charles Phillip Brown
INFORMATION PROCESSING DEVICE AND INFORMATION PROCESSING METHOD

Publication number: 20110119046

Abstract: An example sentence selection unit selects an example sentence from a template database based on an instruction received by an input unit. A translation output unit causes a display unit to display the example sentence selected by the example sentence selection unit and a translation of the example sentence. In addition, the translation output unit causes the display unit to display a designation sign designating a variable section in association with the variable section of the example sentence selected by the example sentence selection unit. Further, when the input unit receives input of a character corresponding to the designation sign, the translation output unit causes the display unit to display word candidates that can replace the variable section corresponding to the input character.

Type: Application

Filed: July 23, 2009

Publication date: May 19, 2011

Inventors: Naoko Shinozaki, Toshiyuki Okunishi, Koichi Sugiyama
System and Method for Enhanced Television and Delivery of Enhanced Television Content

Publication number: 20110115977

Abstract: Provided is an enhanced television system and method including a television receiver in communication with a broadcast reception tuner. The television receiver is configured to receive and display video data from a video stream and enhancement data from the reception tuner. The video stream includes embedded base programming identification metadata, and the television receiver is further configured to extract a base identification tag from the embedded base programming identification metadata, and combine enhancement data received from the reception tuner associated with the base identification tag with the video stream. The video stream may then be displayed.

Type: Application

Filed: November 13, 2009

Publication date: May 19, 2011

Applicant: Triveni Digital

Inventors: Mark Simpson, Richard Chernock
Noise suppression

Publication number: 20110112831

Abstract: A method and computing system for suppressing noise in an audio signal, comprising: receiving the audio signal at signal processing means; determining that another signal is input to the signal processing means, the input signal resulting from an activity which generates noise in the audio signal; and selectively suppressing noise in the audio signal in dependence on the determination that the input signal is input to the signal processing means to thereby suppress the generated noise in the audio signal.

Type: Application

Filed: June 23, 2010

Publication date: May 12, 2011

Applicant: Skype Limited

Inventors: Karsten Vandborg Sorensen, Jon Bergenheim, Koen Vos
Voice Actions on Computing Devices

Publication number: 20110106534

Abstract: A computer-implemented method includes receiving spoken input at a computing device from a user of the computing device, the spoken input including a carrier phrase and a subject to which the carrier phrase is directed, providing at least a portion of the spoken input to a server system in audio form for speech-to-text conversion by the server system, the portion including the subject to which the carrier phrase is directed, receiving from the server system instructions for automatically performing an operation on the computing device, the operation including an action defined by the carrier phrase using parameters defined by the subject, and automatically performing the operation on the computing device.

Type: Application

Filed: October 28, 2010

Publication date: May 5, 2011

Inventors: Michael J. LeBeau, John Nicholas Jitkoff
AUDIO SIGNAL COMPRESSION DEVICE, AUDIO SIGNAL COMPRESSION METHOD, AUDIO SIGNAL DEMODULATION DEVICE, AND AUDIO SIGNAL DEMODULATION METHOD

Publication number: 20110106547

Abstract: When encoding an audio signal, it is possible to efficiently encode the audio signal while maintaining high register signal components, and prevent deterioration of sound quality of decoded signal. A digital audio signal is divided into a plurality of frequency bands. The digital audio signal having been divided into each band is function-approximated for each divided band. Further, parameters of function having been function-approximated are encoded. When performing decoding process, parameters of the function of each band are used to perform function interpolation, synthesize the function-interpolated signal of each band interpolated, and decode the signal. Thus, when function-approximating each band, by suitably setting the function equation, it is possible to perform an encoding process while maintaining the high register components and perform a compression-coding process which enables reproduction with very good sound quality.

Type: Application

Filed: June 3, 2009

Publication date: May 5, 2011

Applicant: Japan Science and Technology Agency

Inventors: Kazuo Toraichi, Mitsuteru Nakamura, Yasuo Morooka
NETWORK/PEER ASSISTED SPEECH CODING

Publication number: 20110099009

Abstract: A communications network is used to transfer user attribute information about participants in a communication session to their respective communication terminals for storage and use thereon to configure a speech codec to operate in a speaker-dependent manner, thereby improving speech coding efficiency. In a network-assisted model, the user attribute information is stored on the communications network and selectively transmitted to the communication terminals while in a peer-assisted model, the user attribute information is derived by and transferred between communication terminals.

Type: Application

Filed: October 11, 2010

Publication date: April 28, 2011

Applicant: BROADCOM CORPORATION

Inventors: Robert W. Zopf, Kelly Hale
GESTURE-INITIATED REMOTE CONTROL PROGRAMMING

Publication number: 20110095873

Abstract: A method and system for configuring a universal remote control (URC) to control a remote-controlled device includes establishing a communication link between the URC and the remote-controlled device in response to detecting a gesture motion of the URC. Device information may be received from the remote-controlled device and used by the URC to program the URC to control the remote-controlled device. The URC may be configured to control a plurality of remote-controlled devices. The communication link may be a near field wireless communication link.

Type: Application

Filed: October 26, 2009

Publication date: April 28, 2011

Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.

Inventors: James Pratt, Marc Sullivan
USER ATTRIBUTE DERIVATION AND UPDATE FOR NETWORK/PEER ASSISTED SPEECH CODING

Publication number: 20110099015

Abstract: Systems, methods and apparatuses are described for deriving and updating user attribute information about users of a communications system. A communications network is then used to transfer the user attribute information to communication terminals, which use the user attribute information to configure a speech codec to operate in a speaker-dependent manner during a communication session, thereby improving speech coding efficiency. In a network-assisted model, the user attribute information is stored on the communications network and selectively transmitted to the communication terminals while in a peer-assisted model, the user attribute information is derived by and transferred between communication terminals.

Type: Application

Filed: September 21, 2010

Publication date: April 28, 2011

Applicant: BROADCOM CORPORATION

Inventor: Robert W. Zopf
Computer-to-Computer Communications

Publication number: 20110099157

Abstract: A computer-implemented method for information sharing between computers includes receiving at a computer system a search request from a first computer, generating with the computer system one or more search results that are responsive to the first computer, formatting the results for display on a second computer that is different than the first computer, and automatically providing the results for display on the second computer.

Type: Application

Filed: October 28, 2010

Publication date: April 28, 2011

Inventors: Michael J. LeBeau, John Nicholas Jitkoff
DETERMINING AN UPPERBAND SIGNAL FROM A NARROWBAND SIGNAL

Publication number: 20110099004

Abstract: A method for determining an upperband speech signal from a narrowband speech signal is disclosed. A list of narrowband line spectral frequencies (LSFs) is determined from the narrowband speech signal. A first pair of adjacent narrowband LSFs that have a lower difference between them than every other pair of adjacent narrowband LSFs in the list is determined. A first feature that is a mean of the first pair of adjacent narrowband LSFs is determined. Upperband LSFs are determined based on at least the first feature using codebook mapping.

Type: Application

Filed: October 22, 2010

Publication date: April 28, 2011

Applicant: QUALCOMM Incorporated

Inventors: Venkatesh Krishnan, Daniel J. Sinder, Ananthapadmanabhan Arasanipalai Kandhadai
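
Illustrative note on publication 20110099004: the first feature in the abstract (the mean of the closest pair of adjacent narrowband LSFs) is straightforward to compute. The sketch below also shows a nearest-entry lookup as one possible form of codebook mapping. The codebook layout, its values, and the function names are hypothetical, and the application may use more features than the single one shown here.

```python
def closest_adjacent_pair_mean(lsfs):
    """Mean of the adjacent LSF pair with the smallest spacing (LSFs sorted ascending)."""
    if len(lsfs) < 2:
        raise ValueError("at least two LSFs are required")
    i = min(range(len(lsfs) - 1), key=lambda k: lsfs[k + 1] - lsfs[k])
    return (lsfs[i] + lsfs[i + 1]) / 2

def map_to_upperband(feature, codebook):
    """Return the upperband LSFs of the codebook entry whose key is nearest the feature."""
    return min(codebook, key=lambda entry: abs(entry[0] - feature))[1]

# Hypothetical normalized narrowband LSFs and a two-entry toy codebook.
narrowband_lsfs = [0.10, 0.25, 0.30, 0.55, 0.80]
codebook = [(0.20, [0.60, 0.75]), (0.30, [0.65, 0.85])]
feature = closest_adjacent_pair_mean(narrowband_lsfs)
upperband_lsfs = map_to_upperband(feature, codebook)
```

Closely spaced LSFs tend to mark a spectral peak, so the mean of the tightest pair gives a compact cue to where the strongest narrowband resonance sits.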
REPLACING AN AUDIO PORTION

Publication number: 20110093270

Abstract: A method includes identifying a first syllable in a first audio of a first word and a second syllable in a second audio of a second word, the first syllable having a first set of properties and the second syllable having a second set of properties; detecting the first syllable in a first instance of the first word in an audio file, the first syllable in the first instance having a third set of properties; determining one or more transformations for transforming the first set of properties to the third set of properties; applying the one or more transformations to the second set of properties of the second syllable to yield a transformed second syllable; and replacing the first syllable in the first instance of the first word with the transformed second syllable in the audio file.

Type: Application

Filed: October 16, 2009

Publication date: April 21, 2011

Applicant: Yahoo! Inc.

Inventor: Narayan Lakshmi BHAMIDIPATI
Devices, Systems and Methods for Improving and Adjusting Communication

Publication number: 20110082698

Abstract: Devices, methods and systems for improving and adjusting voice volume and body movements during a performance are disclosed. Device embodiments may be configured with a processor, microphone, one or more movement sensors and at least a display or a speaker. The processor may include instructions configured to receive at least one of sound input from the microphone and movement data from the one or more accelerometers, generate one or more input levels corresponding to at least one of the sound input and movement data, compare the one or more generated input levels to one or more predefined input levels, associate the one or more predefined input levels with at least one of a color, text, graphic or audio file and present at least one of the color, text, graphic or audio file to a user of the device.

Type: Application

Filed: October 1, 2010

Publication date: April 7, 2011

Inventor: Zev Rosenthal
Enabling Spoken Tags

Publication number: 20110077941

Abstract: Techniques for assigning a spoken tag in a telecom web platform are provided. The techniques include receiving a spoken tag, comparing the spoken tag to a set of one or more template tags, if the spoken tag is a match to a template tag, assigning the spoken tag and updating frequency of the tag in the set of one or more template tags, and if the spoken tag is not a match to a template tag, assigning the spoken tag and registering the spoken tag as a new tag in the set of one or more template tags.

Type: Application

Filed: September 30, 2009

Publication date: March 31, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Kuntal Dey, Anupam Jain, Arun Kumar, Natwar Modani, Amit Anil Nanavati, Nitendra Rajput
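
Illustrative note on publication 20110077941: the match-or-register flow in the abstract can be sketched with a frequency table. Spoken tags are represented here as plain strings and compared by equality. That is a stand-in for whatever audio comparison the platform uses, and the `matches` parameter and function name are hypothetical.

```python
def assign_spoken_tag(spoken_tag, template_tags, matches=lambda a, b: a == b):
    """Assign a tag and keep the template set current.

    template_tags maps each known tag to its usage frequency and is updated in place.
    """
    for template in template_tags:
        if matches(spoken_tag, template):
            template_tags[template] += 1   # known tag: update its frequency
            return template
    template_tags[spoken_tag] = 1          # unknown tag: register it as a new template
    return spoken_tag

templates = {"weather": 3, "cricket": 1}
assign_spoken_tag("weather", templates)
assign_spoken_tag("traffic", templates)
print(templates)
```

Either branch assigns the tag. The only difference is whether the template set gains a new entry or an existing entry's count increases.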
SPEECH RECOGNITION MODULE AND APPLICATIONS THEREOF

Publication number: 20110077944

Abstract: A speech recognition module includes an acoustic front-end module, a sound detection module, and a word detection module. The acoustic front-end module generates a plurality of representations of frames from a digital audio signal and generates speech characteristic probabilities for the plurality of frames. The sound detection module determines a plurality of estimated utterances from the plurality of representations and the speech characteristic probabilities. The word detection module determines one or more words based on the plurality of estimated utterances and the speech characteristic probabilities.

Type: Application

Filed: November 30, 2009

Publication date: March 31, 2011

Applicant: BROADCOM CORPORATION

Inventor: Nambirajan Seshadri
DERIVING GEOGRAPHIC DISTRIBUTION OF PHYSIOLOGICAL OR PSYCHOLOGICAL CONDITIONS OF HUMAN SPEAKERS WHILE PRESERVING PERSONAL PRIVACY

Publication number: 20110077946

Abstract: A method including: obtaining, via a plurality of communication devices, a plurality of speech signals respectively associated with human speakers, the speech signals including verbal components and non-verbal components; identifying a plurality of geographical locations, each geographic location associated with a respective one of the plurality of the communication devices; extracting the non-verbal components from the obtained speech signals; deducing physiological or psychological conditions of the human speakers by analyzing, over a specified period, the extracted non-verbal components, using predefined relations between characteristics of the non-verbal components and physiological or psychological conditions of the human speakers; and providing a geographical distribution of the deduced physiological or psychological conditions of the human speakers by associating the deduced physiological or psychological conditions of the human speakers with geographical locations thereof.

Type: Application

Filed: September 30, 2009

Publication date: March 31, 2011

Applicant: International Business Machines Corporation

Inventors: Slava Shectman, Raphael Steinberg
Audio Signal Correction Apparatus and Audio Signal Correction Method

Publication number: 20110071837

Abstract: According to one embodiment, an audio signal correction apparatus has a characteristic extraction module configured to determine whether an input audio signal is a monaural signal or a stereo signal, on the basis of channel information, and to extract a plurality of characteristic parameters for determining whether the input audio signal is a speech signal or a music signal, a signal type determination module configured to calculate a speech/music discrimination score which indicates whether the input audio signal is close to the speech signal or the music signal, on the basis of the plurality of characteristic parameters and a level calculation module configured to calculate, with use of the speech/music discrimination score, output levels of a degree of speech and a degree of music.

Type: Application

Filed: May 3, 2010

Publication date: March 24, 2011

Inventors: Hiroshi Yonekubo, Hirokazu Takeuchi
RECEIVER INTELLIGIBILITY ENHANCEMENT SYSTEM

Publication number: 20110071821

Abstract: Embodiments of the invention provide a communication device and methods for enhancing audio signals. A first audio signal buffer and a second audio signal buffer are acquired. Thereafter, the second audio signal is processed based on the linear predictive coding coefficients and gains based on noise power of the first audio signal to generate an enhanced second audio signal.

Type: Application

Filed: November 15, 2010

Publication date: March 24, 2011

Inventors: Alon Konchitsky, Sandeep Kulakcherla, Alberto D. Berstein
Method for Speech Recognition on All Languages and for Inputing words using Speech Recognition

Publication number: 20110066434

Abstract: The invention can recognize all languages and input words. It needs m unknown voices to represent m categories of known words with similar pronunciations. Words can be pronounced in any language, dialect or accent. Each will be classified into one of m categories represented by its most similar unknown voice. When a user pronounces a word, the invention finds its F most similar unknown voices. All words in the F categories represented by the F unknown voices will be arranged according to their pronunciation similarity and alphabetic letters. The pronounced word should be among the top words. Since we only find the F most similar unknown voices from m (=500) unknown voices and since the same word can be classified into several categories, our recognition method is stable for all users and can quickly and accurately recognize all languages (English, Chinese, etc.) and input many more words without using samples.

Type: Application

Filed: September 29, 2009

Publication date: March 17, 2011

Inventors: Tze-Fen LI, Tai-Jan Lee Li, Shih-Tzung Li, Shih-Hon Li, Li-Chuan Liao
AUDIO SIGNAL ENCODING EMPLOYING INTERCHANNEL AND TEMPORAL REDUNDANCY REDUCTION

Publication number: 20110066440

Abstract: A method of encoding a time-domain audio signal is presented. A device transforms the time-domain signal into a frequency-domain signal including a sequence of sample blocks, wherein each block includes a coefficient for each of multiple frequencies. The coefficients of each block are grouped into frequency bands. For each frequency band of each block, a scale factor is estimated for the band, and the energy of the band for the block is compared with the energy of the band of an adjacent sample block, wherein the blocks may be adjacent to each other in either or both of an interchannel and a temporal sense. If the ratio of the band energy for the first block to the band energy for the adjacent block is less than some value, the scale factor of the band for the first block is increased. The coefficients of the band for each block are quantized based on the resulting scale factor. The encoded audio signal is generated based on the quantized coefficients and the scale factors.
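
A minimal sketch of the per-band comparison described above, under stated assumptions: the threshold of 0.5, the increment of 1, and the step-size rule `2 ** (sf / 4)` are illustrative choices that the abstract does not specify, and all function names are hypothetical.

```python
def band_energies(coefficients, band_edges):
    """Sum of squared coefficients for each (start, end) frequency band."""
    return [sum(c * c for c in coefficients[start:end]) for start, end in band_edges]

def adjust_scale_factors(block, adjacent, band_edges, base_scale_factors,
                         ratio_threshold=0.5, increment=1):
    """Raise a band's scale factor when its energy is small relative to the
    same band in an adjacent block (adjacent in channel or in time)."""
    adjusted = []
    pairs = zip(base_scale_factors,
                band_energies(block, band_edges),
                band_energies(adjacent, band_edges))
    for scale_factor, energy, neighbour_energy in pairs:
        if neighbour_energy > 0 and energy / neighbour_energy < ratio_threshold:
            scale_factor += increment
        adjusted.append(scale_factor)
    return adjusted

def quantize(block, band_edges, scale_factors):
    """Quantize each band with a step derived from its scale factor (assumed rule)."""
    quantized = []
    for (start, end), scale_factor in zip(band_edges, scale_factors):
        step = 2.0 ** (scale_factor / 4.0)
        quantized.extend(round(c / step) for c in block[start:end])
    return quantized

band_edges = [(0, 2), (2, 4)]
block = [0.1, 0.1, 4.0, 4.0]
adjacent = [1.0, 1.0, 4.0, 4.0]
scale_factors = adjust_scale_factors(block, adjacent, band_edges, [0, 0])
quantized = quantize(block, band_edges, scale_factors)
```

Here only the first band is much weaker than its neighbour, so only its scale factor is increased and it is quantized more coarsely.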

Type: Application

Filed: September 11, 2009

Publication date: March 17, 2011

Applicant: SLING MEDIA PVT LTD

Inventor: Nandury V. Kishore
MEDIA CONTROL

Publication number: 20110067059

Abstract: Systems and methods to control media are disclosed. A particular method includes receiving a speech input at a mobile communications device. The speech input is processed to generate audio data. The audio data is sent, via a mobile data network, to a first server. The first server processes the audio data to generate text based on the audio data. Data related to the text is received from the first server. One or more commands are sent to a second server via the mobile data network. In response to the one or more commands, the second server sends control signals based on the one or more commands to a media controller. The control signals cause the media controller to control multimedia content displayed via a display device.

Type: Application

Filed: December 22, 2009

Publication date: March 17, 2011

Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.

Inventors: Michael Johnston, Hisao M. Chang, Giuseppe Di Fabbrizio, Thomas Okken, Bernard S. Renger
AUTOMATIC TRANSLATION SYSTEM BASED ON STRUCTURED TRANSLATION MEMORY AND AUTOMATIC TRANSLATION METHOD USING THE SAME

Publication number: 20110060583

Abstract: Provided are an automatic translation system based on structured translation memory and an automatic translation method using the same. In the automatic translation system, a translation memory establishment module changes a predetermined language pattern into a part translation pattern and registers the changed part translation pattern in a structured translation memory. A sentence unit translation module performs a translation of the sentence unit on an input sentence on the basis of the translation memory. A part combination translation module analyzes a structure of a language pattern less than the sentence unit which is included in the input sentence, searches the registered part translation pattern which is matched with the analyzed language pattern on the basis of the translation memory, and combines the searched part translation pattern to output a translation corresponding to the input sentence.

Type: Application

Filed: December 23, 2009

Publication date: March 10, 2011

Applicant: Electronics and Telecommunications Research Institute

Inventors: Sung Kwon CHOI, Ki Young Lee, Yoon Hyung Roh, Oh Woog Kwon, Chang Hyun Kim, Young Ae Seo, Seong II Yang, Yun Jin, Jinxia Huang, Yingshun Wu, Changhao Yin, Eun Jin Park, Young Kil Kim, Sang Kyu Park
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING SYSTEM, IMAGE FORMING APPARATUS, AND INFORMATION PROCESSING METHOD

Publication number: 20110058189

Abstract: According to an aspect of the invention, an information processing apparatus includes a process instruction receiving module, a storage module, and a process state output module. The process instruction receiving module receives a process instruction from a user. The storage module stores the process instruction and language attribute information in association with each other, the language attribute information designating a language for outputting a process state of the process instruction. The process state output module outputs information of the process state in the language designated by the language attribute information associated with the process instruction.

Type: Application

Filed: February 25, 2010

Publication date: March 10, 2011

Applicant: FUJI XEROX CO., LTD.

Inventor: Michio KUWAMURA
ADAPTIVE GROUPING OF PARAMETERS FOR ENHANCED CODING EFFICIENCY

Publication number: 20110060598

Abstract: The present invention is based on the finding that parameters including: a first set of parameters of a representation of a first portion of an original signal and a second set of parameters of a representation of a second portion of the original signal can be efficiently encoded when the parameters are arranged in a first sequence of tuples and a second sequence of tuples. The first sequence of tuples includes tuples of parameters having two parameters from a single portion of the original signal and the second sequence of tuples includes tuples of parameters having one parameter from the first portion and one parameter from the second portion of the original signal. A bit estimator estimates the number of necessary bits to encode the first and the second sequence of tuples. Only the sequence of tuples, which results in the lower number of bits, is encoded.
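
The selection between the two groupings can be illustrated with a small sketch. The bit estimator below is a made-up cost model (magnitude of the first value plus magnitude of the difference within the tuple), standing in for whatever entropy code the application actually uses; the function names are likewise hypothetical.

```python
def within_portion_tuples(first, second):
    """Tuples whose two parameters come from a single portion (even lengths assumed)."""
    def pairs(params):
        return [(params[i], params[i + 1]) for i in range(0, len(params) - 1, 2)]
    return pairs(first) + pairs(second)

def across_portion_tuples(first, second):
    """Tuples with one parameter from the first portion and one from the second."""
    return list(zip(first, second))

def estimate_bits(tuples):
    """Hypothetical bit estimator: cheap when the two values of a tuple are close."""
    total = 0
    for a, b in tuples:
        total += abs(a).bit_length() + 1      # cost of the first value
        total += abs(a - b).bit_length() + 1  # cost of the difference
    return total

def choose_grouping(first, second):
    """Return (label, tuples, bits) for the grouping with the lower estimate."""
    candidates = [
        ("within", within_portion_tuples(first, second)),
        ("across", across_portion_tuples(first, second)),
    ]
    scored = [(label, tuples, estimate_bits(tuples)) for label, tuples in candidates]
    return min(scored, key=lambda item: item[2])

first_portion = [1, 9, 2, 7]
second_portion = [1, 9, 2, 8]
label, chosen, bits = choose_grouping(first_portion, second_portion)
within_bits = estimate_bits(within_portion_tuples(first_portion, second_portion))
```

Because the two portions are nearly identical in this example, pairing parameters across portions gives the lower estimate, and only that sequence would be encoded.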

Type: Application

Filed: November 17, 2010

Publication date: March 10, 2011

Applicant: FRAUNHOFER-GESELLSCHAFT ZUR FORDERUNG DER ANGEWANDTEN FORSCHUNG E.V.

Inventors: RALPH SPERSCHNEIDER, JÜRGEN HERRE, KARSTEN LINZMEIER, JOHANNES HILPERT
METHOD AND APPARATUS FOR PROCESSING AUDIO SIGNALS

Publication number: 20110060599

Abstract: Methods and apparatuses for encoding and decoding an audio signal are provided, a method of encoding an audio signal including: receiving the audio signal including information about a moving sound source; receiving position information about the moving sound source; generating dynamic track information indicating motion of the moving sound source by using the position information; and encoding the audio signal and the dynamic track information.

Type: Application

Filed: April 16, 2009

Publication date: March 10, 2011

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Hyun-Wook Kim, Chul-Woo Lee, Jong-Hoon Jeong, Nam-Suk Lee, Han-Gil Moon, Sang-Hoon Lee
CONVERSION FROM NOTE-BASED AUDIO FORMAT TO PCM-BASED AUDIO FORMAT

Publication number: 20110053655

Abstract: A system and method for converting a note-based audio object to a Pulse Code Modulated (PCM) audio format is disclosed. An electronic computer device includes a memory containing a note-based audio object and a lookup table, the note-based audio object containing note frequency information. A processor is configured for converting the note-based audio object to a Pulse Code Modulated (PCM) stream having a plurality of sample points, the converting including: generating a PCM value for each sample point based upon the note frequency and the trigonometric function evaluations, selecting an entry from the look-up table based upon the note frequency information, a sampling frequency, and a sample point number, and determining a step size within the look-up table based upon a ratio between the note frequency information and the sampling frequency.
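
As a rough sketch of the look-up approach described above, the following generates PCM samples for one note from a precomputed sine table. The table size of 256, the 16-bit amplitude, and the function name are assumptions for illustration; the abstract does not give these details.

```python
import math

TABLE_SIZE = 256  # assumed table length
SINE_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def note_to_pcm(note_hz, sample_rate_hz, num_samples, amplitude=32767):
    """Generate PCM sample values for a single note using table look-ups."""
    # Step size within the table: ratio of note frequency to sampling
    # frequency, scaled to the table length.
    step = note_hz / sample_rate_hz * TABLE_SIZE
    samples = []
    for sample_number in range(num_samples):
        # The entry is selected from the note frequency, the sampling
        # frequency, and the sample point number.
        index = int(sample_number * step) % TABLE_SIZE
        samples.append(round(amplitude * SINE_TABLE[index]))
    return samples

pcm = note_to_pcm(1000.0, 8000.0, 8)  # one full cycle of a 1 kHz note at 8 kHz
```

The table is filled once, so producing each sample costs a multiplication and an index rather than a trigonometric evaluation.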

Type: Application

Filed: November 10, 2010

Publication date: March 3, 2011

Applicant: RESEARCH IN MOTION LIMITED

Inventor: Rodney Bylsma
Apparatus and Method for Control Using a Humming Frequency

Publication number: 20110051557

Abstract: A device controller configured to control physical devices using an audible humming frequency that includes: a humming frequency module, a humming command module, and a control command module. The humming frequency module may be configured to determine humming frequenc(ies) using a detected humming signal(s). The humming command module may be configured to compute humming command(s) based on the humming frequenc(ies). The control command module may be configured to generate control command(s) using received key command(s) and humming command(s).

Type: Application

Filed: August 26, 2009

Publication date: March 3, 2011

Inventors: Nathalia Peixoto, Gregory Gutt, Hossein Ghaffari Nik
COMPUTING CIRCUITS AND METHOD FOR RUNNING AN MPEG-2 AAC OR MPEG-4 AAC AUDIO DECODING ALGORITHM ON PROGRAMMABLE PROCESSORS

Publication number: 20110054915

Abstract: The present invention relates to computing circuits and a method for efficiently running, on programmable processors, an MPEG-2 AAC or MPEG-4 AAC algorithm, which is used as an audio compression algorithm in multi-channel high-quality audio systems. In accordance with the present invention, the IMDCT process, which accounts for a large part of the operations in an implementation of an MPEG-2/4 AAC algorithm, can be performed efficiently. In addition, while the architecture of the existing digital signal processor is still used, performance can be improved by adding an address generator architecture, a Huffman decoder, and a bit processing architecture. As a result, designing and modifying the programmable processor is made easier.

Type: Application

Filed: September 13, 2010

Publication date: March 3, 2011

Applicant: PULSUS TECHNOLOGIES

Inventors: Jong Hoon OH, Myung Hoon SUNWOO, Jong Ha MOON
IMAGE PROCESSING SYSTEM, IMAGE PROCESSING APPARATUS AND INFORMATION PROCESSING APPARATUS

Publication number: 20110054908

Abstract: An image processing system includes an information processing apparatus and an image processing apparatus connected to each other via a network. The information processing apparatus has an application installed thereon to give a new function to the image processing apparatus. The image processing apparatus transmits to the information processing apparatus, voice data obtained by a microphone of the image processing apparatus and data set via an operation screen customized according to the application. The information processing apparatus determines answer information indicating an action to be taken by the image processing apparatus, based on the received voice data, a dictionary owned by the application and the data set via the operation screen, and then transmits the determined answer information to the image processing apparatus. The image processing apparatus takes an action according to the answer information received therefrom.

Type: Application

Filed: August 20, 2010

Publication date: March 3, 2011

Applicant: KONICA MINOLTA BUSINESS TECHNOLOGIES, INC

Inventors: Hideyuki MATSUDA, Kazumi SAWAYANAGI, Toshihiko OTAKE, Masao HOSONO
AUTOMATICALLY PROVIDING CONTENT ASSOCIATED WITH CAPTURED INFORMATION, SUCH AS INFORMATION CAPTURED IN REAL-TIME

Publication number: 20110043652

Abstract: A system and method for automatically providing content associated with captured information is described. In some examples, the system receives input by a user, and automatically provides content or links to content associated with the input. In some examples, the system receives input via text entry or by capturing text from a rendered document, such as a printed document, an object, an audio stream, and so on.

Type: Application

Filed: March 12, 2010

Publication date: February 24, 2011

Inventors: Martin T. King, Redwood Stephens, Claes-Fredrik Mannby, Jesse Peterson, Mark Sanvitale, Michael J. Smith, Christopher J. Daley-Watson
FREQUENCY BAND SCALE FACTOR DETERMINATION IN AUDIO ENCODING BASED UPON FREQUENCY BAND SIGNAL ENERGY

Publication number: 20110046966

Abstract: A method of encoding a time-domain audio signal is presented. In the method, an electronic device receives the time-domain audio signal. The time-domain audio signal is transformed into a frequency-domain signal including a coefficient for each of a plurality of frequencies, which are grouped into frequency bands. For each frequency band, the energy of the band is determined, a scale factor for the band is determined based on the energy of the band, and the coefficients of the band are quantized based on the associated scale factor. The encoded audio signal is generated based on the quantized coefficients and the scale factors.
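
The energy-driven choice of scale factor can be sketched as below. The mapping from band energy to scale factor (rounded base-2 logarithm) and the step-size rule `2 ** (sf / 4)` are assumptions chosen for illustration, not the rules claimed in the application.

```python
import math

def encode_block(coefficients, band_edges):
    """Return (scale_factors, quantized) for one block of frequency coefficients."""
    scale_factors = []
    quantized = []
    for start, end in band_edges:
        band = coefficients[start:end]
        energy = sum(c * c for c in band)  # energy of the band
        # Assumed rule: higher band energy gives a larger scale factor.
        scale_factor = max(0, round(math.log2(energy))) if energy > 0 else 0
        step = 2.0 ** (scale_factor / 4.0)
        scale_factors.append(scale_factor)
        quantized.extend(round(c / step) for c in band)
    return scale_factors, quantized

scale_factors, quantized = encode_block([1.0, 1.0, 8.0, 8.0], [(0, 2), (2, 4)])
```

The encoded signal would then carry both lists, since a decoder needs the scale factors to undo the quantization step for each band.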

Type: Application

Filed: August 24, 2009

Publication date: February 24, 2011

Applicant: SLING MEDIA PVT LTD

Inventor: Laxminarayana M. Dalimba
