Modification Of At Least One Characteristic Of Speech Waves (epo) Patents (Class 704/E21.001)
  • Publication number: 20110172996
    Abstract: A voice input device, a method for manufacturing the same, and an information processing system are provided. The voice input device has a function of removing a noise component and includes a first microphone 710-1 that includes a first vibrating membrane, a second microphone 710-2 that includes a second vibrating membrane, and a differential signal generation section 720 that generates a differential signal that represents a difference between a first voltage signal and a second voltage signal. The first and second vibrating membranes are disposed so that a noise intensity ratio is smaller than an input voice intensity ratio that represents the ratio to intensity of an input voice component.
    Type: Application
    Filed: May 20, 2009
    Publication date: July 14, 2011
    Applicants: FUNAI ELECTRIC CO., LTD., FUNAI ELECTRIC ADVANCED APPLIED TECHNOLOGY RESEARCH INSTITUTE INC.
    Inventors: Rikuo Takano, Kiyoshi Sugiyama, Toshimi Fukuoka, Masatoshi Ono, Ryusuke Horibe, Fuminori Tanaka, Takeshi Inoda
  • Publication number: 20110164105
    Abstract: A handheld communication device is used to capture video streams and generate a multiplexed video stream. The handheld communication device has at least two cameras facing in two opposite directions. The handheld communication device receives a first video stream and a second video stream simultaneously from the two cameras. The handheld communication device detects a speech activity of a person captured in the video streams. The speech activity may be detected from direction of sound or lip movement of the person. Based on the detection, the handheld communication device automatically switches between the first video stream and the second video stream to generate a multiplexed video stream. The multiplexed video stream interleaves segments of the first video stream and segments of the second video stream. Other embodiments are also described and claimed.
    Type: Application
    Filed: January 6, 2010
    Publication date: July 7, 2011
    Applicant: Apple Inc.
    Inventors: Jae Han Lee, E-Cheng Chang
  • Publication number: 20110161087
    Abstract: A method for processing an audio signal including classifying an input frame as either a speech frame or a generic audio frame, producing an encoded bitstream and a corresponding processed frame based on the input frame, producing an enhancement layer encoded bitstream based on a difference between the input frame and the processed frame, and multiplexing the enhancement layer encoded bitstream, a codeword, and either a speech encoded bitstream or a generic audio encoded bitstream into a combined bitstream based on whether the codeword indicates that the input frame is classified as a speech frame or as a generic audio frame, wherein the encoded bitstream is either a speech encoded bitstream or a generic audio encoded bitstream.
    Type: Application
    Filed: December 31, 2009
    Publication date: June 30, 2011
    Applicant: Motorola, Inc.
    Inventors: James P. ASHLEY, Jonathan A. Gibbs, Udar Mittal
  • Publication number: 20110153321
    Abstract: Systems and methods for detecting features in spoken speech and processing speech sounds based on the features are provided. One or more features may be identified in a speech sound. The speech sound may be modified to enhance or reduce the degree to which the feature affects the sound ultimately heard by a listener. Systems and methods according to embodiments of the invention may allow for automatic speech recognition devices that enhance detection and recognition of spoken sounds, such as by a user of a hearing aid or other device.
    Type: Application
    Filed: July 2, 2009
    Publication date: June 23, 2011
    Applicant: THE BOARD OF TRUSTEES OF THE UNIVERSITY OF ILLINOIS
    Inventors: Jont B. Allen, Feipeng LI
  • Publication number: 20110153332
    Abstract: A device for booting a handheld apparatus by voice control includes a base, a power-on device, a trigger switch, and an acoustic sensor. Upon the handheld apparatus being placed at the base to trigger the trigger switch, the trigger switch controls the power-on device to power on the handheld apparatus. After the handheld apparatus is powered on, the acoustic sensor detects a sound of the handheld apparatus and then controls a pressure head of the power-on device to move away. The device and its method for booting a handheld apparatus by voice control come with the advantages of a simple and easy operation and a high efficiency.
    Type: Application
    Filed: September 30, 2010
    Publication date: June 23, 2011
    Applicants: INVENTEC APPLIANCES (SHANGHAI) CO. LTD., INVENTEC APPLIANCES CORP.
    Inventor: Shang-Fei Hu
  • Publication number: 20110153315
    Abstract: Methods and apparatus for audio and speech processing including generating a plurality of frames, each of the frames comprising a plurality of transform coefficients, and allocating bits to the transform coefficients in each of the frames such that at least two of the transform coefficients in the same frame have different bit allocations and the total number of the bits allocated to the transform coefficients in at least two of the frames is equal.
    Type: Application
    Filed: February 2, 2010
    Publication date: June 23, 2011
    Applicant: QUALCOMM Incorporated
    Inventors: Somdeb Majumdar, Amin Fazeldehkordi, Harinath Garudadri
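    Illustrative sketch (not taken from the application): the abstract requires only that coefficients within a frame may receive different bit counts while the per-frame totals stay equal. The log-magnitude weighting rule, the function name `allocate_bits`, and the budget of 32 bits are assumptions chosen for illustration.

```python
import math

def allocate_bits(coeffs, budget):
    """Split a fixed bit budget across one frame's transform coefficients."""
    # Assumed rule: weight each coefficient by log2(1 + |c|).
    weights = [math.log2(1.0 + abs(c)) for c in coeffs]
    total = sum(weights)
    if total == 0:
        # All-zero frame: fall back to equal weights.
        weights = [1.0] * len(coeffs)
        total = float(len(coeffs))
    shares = [budget * w / total for w in weights]
    bits = [int(s) for s in shares]
    # Hand out the bits lost to truncation, largest fractional share first.
    leftover = budget - sum(bits)
    order = sorted(range(len(coeffs)), key=lambda i: shares[i] - bits[i], reverse=True)
    for i in order[:leftover]:
        bits[i] += 1
    return bits

# Two hypothetical frames of transform coefficients, same budget for each.
frames = [[8.0, 1.0, 0.5, 0.1], [0.2, 3.0, 3.0, 0.0]]
allocations = [allocate_bits(frame, budget=32) for frame in frames]
```

    Each frame's allocation sums to the same budget, while individual coefficients within a frame receive different numbers of bits.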
  • Publication number: 20110150227
    Abstract: Provided is a signal processing method which calculates a correlation coefficient indicating the degree of relation in a stereo signal and extracts a speech signal from the stereo signal by using the correlation coefficient and the stereo signal.
    Type: Application
    Filed: October 28, 2010
    Publication date: June 23, 2011
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Sun-min KIM
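    Illustrative sketch (not taken from the application): the abstract does not say how the correlation coefficient is computed or applied. The version below assumes a Pearson correlation between the left and right channels and uses it as a gain on the mid signal (L+R)/2, on the common assumption that speech is panned to the center and therefore highly correlated across channels. The function names are hypothetical.

```python
import math

def correlation(left, right):
    """Pearson correlation coefficient between two equal-length channels."""
    n = len(left)
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    cov = sum((l - mean_l) * (r - mean_r) for l, r in zip(left, right))
    var_l = sum((l - mean_l) ** 2 for l in left)
    var_r = sum((r - mean_r) ** 2 for r in right)
    if var_l == 0 or var_r == 0:
        return 0.0
    return cov / math.sqrt(var_l * var_r)

def extract_center(left, right):
    """Scale the mid signal (L+R)/2 by the correlation, clamped at zero."""
    gain = max(0.0, correlation(left, right))
    return [gain * 0.5 * (l + r) for l, r in zip(left, right)]

# Identical channels are fully correlated, so the signal passes unchanged;
# anti-phase channels give a gain of zero.
voice = [0.0, 1.0, 0.0, -1.0]
centered = extract_center(voice, voice)
cancelled = extract_center(voice, [-v for v in voice])
```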
  • Publication number: 20110153331
    Abstract: The present invention provides a method for generating a voice signal in electronic books (E-books). The method includes the steps of: receiving a voice signal in response to a triggering signal for placing a bookmark; and displaying a functional icon of the bookmark corresponding to the voice signal in a region of the E-book. The present invention also provides an E-book reader, including: a display unit, a receiver unit, and a processing unit, wherein the receiver unit receives a voice signal in response to a triggering signal for placing a bookmark, and the processing unit is used to display a functional icon of the bookmark corresponding to the voice signal in a region of the E-book.
    Type: Application
    Filed: May 13, 2010
    Publication date: June 23, 2011
    Applicant: INVENTEC APPLIANCES CORP.
    Inventors: Yuan-Hua DONG, Liang HUANG, Shih-Kuang TSAI
  • Publication number: 20110144998
    Abstract: An embedder for embedding a watermark to be embedded into an input information representation comprises an embedding parameter determiner that is implemented to apply a derivation function once or several times to an initial value to obtain an embedding parameter for embedding the watermark into the input information representation. Further, the embedder comprises a watermark adder that is implemented to provide the input information representation with the watermark using the embedding parameter. The embedder is implemented to select how many times the derivation function is to be applied to the initial value.
    Type: Application
    Filed: March 3, 2009
    Publication date: June 16, 2011
    Inventors: Bernhard Grill, Ernst Eberlein, Stefan Kraegeloh, Joerg Pickel, Juliane Borsum
  • Publication number: 20110137644
    Abstract: A method, terminal and program for processing a speech signal, in which the speech signal is received over a network from a transmitting device, wherein the frequency components in the received speech signal are limited to a predetermined frequency range and the received speech signal has been filtered using a transmitter frequency response over the predetermined frequency range. The received speech signal is decoded. The decoded speech signal is filtered using a receiver frequency response which is complementary to the transmitter frequency response over the predetermined frequency range to thereby reduce distortion in the speech signal introduced over the predetermined frequency range by using said transmitter frequency response.
    Type: Application
    Filed: October 6, 2010
    Publication date: June 9, 2011
    Applicant: Skype Limited
    Inventors: Mattias Nilsson, Stefan Strommer, Soren Vang Andersen
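    Illustrative sketch (not taken from the application): a pair of complementary frequency responses can be pictured as a filter and its inverse. The example below assumes a first-order pre-emphasis filter at the transmitter and the matching de-emphasis filter at the receiver; it omits the codec and the band limiting described in the abstract, and the coefficient `a = 0.9` is an arbitrary choice.

```python
def transmit_filter(x, a=0.9):
    """First-order pre-emphasis: y[n] = x[n] - a * x[n-1]."""
    out, prev = [], 0.0
    for sample in x:
        out.append(sample - a * prev)
        prev = sample
    return out

def receive_filter(y, a=0.9):
    """Complementary de-emphasis: z[n] = y[n] + a * z[n-1]."""
    out, prev = [], 0.0
    for sample in y:
        prev = sample + a * prev
        out.append(prev)
    return out

# Round trip: the receiver filter undoes the transmitter filter.
speech = [0.0, 0.5, 1.0, 0.25, -0.75, -0.5]
restored = receive_filter(transmit_filter(speech))
```

    With no coding loss in between, `restored` matches `speech` up to floating-point rounding, which is the sense in which the two responses are complementary here.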
  • Publication number: 20110134910
    Abstract: A computer-implemented method and system of enabling concurrent real-time multi-language communication between multiple participants using a selective broadcast protocol, the method including receiving at a first server a real-time communication from a first participant, the real-time communication being constructed in a first spoken language and addressed to a second participant. A preferred spoken language of receipt of real-time communication is identified by the second participant. A determination is made whether the preferred spoken language of receipt is different from the first spoken language of the real-time communication.
    Type: Application
    Filed: December 8, 2009
    Publication date: June 9, 2011
    Applicant: International Business Machines Corporation
    Inventors: Chi-Chuen Chao-Suren, Ezra D.B. Hall, Pascal A. Nsame, Ayidn Suren, Sebastien T. Ventrone
  • Publication number: 20110134204
    Abstract: A system and method for facilitating collaboration of a group. The system and method provide a ubiquitous anytime/everywhere environment realized through fixed and mobile technologies and scaffolded by group support software. The system includes a collaboration engine having an architecture that supports both generic collaborative processes along with task specific team processes instantiated through a sophisticated suite of advanced modular technologies. The collaboration engine drives dynamic and real time collaborative problem solving and decision making by integrating sensor and human data from the field with group support software that efficiently and effectively manages team interaction.
    Type: Application
    Filed: December 5, 2008
    Publication date: June 9, 2011
    Applicant: FLORIDA GULF COAST UNIVERSITY
    Inventors: Walter Rodriguez, Augusto Opdenbosch, Deborah S. Carstens, Brian Goldiez, Stephen M. Fiore, Veton Kepuska
  • Publication number: 20110131040
    Abstract: A method and an in-vehicle system having a speech recognition component are provided for improving speech recognition performance. The speech recognition component may have multiple vocabulary dictionaries, each of which may include phonetics associated with commands. When the in-vehicle system receives speech input, the speech recognition component may determine whether the received speech input includes a speech access command. If the received speech input is determined to include a speech access command, then a dictionary changing component may transition a currently-used dictionary of the speech recognition component to a vocabulary dictionary associated with the determined speech access command. Otherwise, the dictionary changing component may transition the currently-used dictionary to a first vocabulary dictionary. A command included in the received speech input may then be recognized by the speech recognition component using the transitioned currently-used dictionary.
    Type: Application
    Filed: December 1, 2009
    Publication date: June 2, 2011
    Applicant: HONDA MOTOR CO., LTD
    Inventors: Ritchie Huang, Stuart M. Yamamoto, David M. Kirsch
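    Illustrative sketch (not taken from the application): the dictionary-switching logic can be pictured as below. Text strings stand in for recognized speech, and the dictionary names, their contents, and the rule that the access command is the first word of the utterance are all hypothetical.

```python
# Hypothetical vocabulary dictionaries; "main" plays the role of the
# first vocabulary dictionary used when no access command is heard.
DICTIONARIES = {
    "main": {"call home", "redial"},
    "navigation": {"go home", "find gas"},
    "audio": {"next track", "volume up"},
}
# Speech access command -> dictionary it selects.
ACCESS_COMMANDS = {"navigation": "navigation", "audio": "audio"}

def handle(utterance):
    """Return (dictionary used, recognized command or None)."""
    text = utterance.lower().strip()
    first, _, rest = text.partition(" ")
    if first in ACCESS_COMMANDS:
        # Access command present: switch to its associated dictionary.
        current = ACCESS_COMMANDS[first]
        command = rest.strip()
    else:
        # Otherwise fall back to the first dictionary.
        current = "main"
        command = text
    recognized = command if command in DICTIONARIES[current] else None
    return current, recognized

results = [handle("Audio next track"), handle("call home"), handle("audio call home")]
```

    The third utterance switches to the audio dictionary, where "call home" is not a known command, so nothing is recognized.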
  • Publication number: 20110131041
    Abstract: Systems and methods consistent with the innovations herein relate to communication using a virtual humanoid animated during call processing. According to one exemplary implementation, the animation may be performed using a system of recognition of spoken vowels for animation of the lips, which may also be associated with the recognition of DTMF tones for animation of head movements and facial features. The innovations herein may be generally implemented in portable devices such as PDAs, cell phones and Smart Phones that have access to mobile telephony.
    Type: Application
    Filed: June 18, 2010
    Publication date: June 2, 2011
    Inventors: Paulo Cesar Cortez, Rodrigo Carvalho Souza Costa, Robson Da Silva Siqueira, Cincinato Furtado Leite Neto, Fabio Cisne Ribeiro, Francisco Jose Marques Anselmo, Raphael Torres Santos Carvalho, Antonio Carlos Da Silva Barros, Cesar Lincoln Cavalcante Mattos, Jose Marques Soares
  • Publication number: 20110131039
    Abstract: A method and apparatus are provided for determining an instantaneous frequency and an instantaneous bandwidth of a speech resonance of a speech signal. The method includes receiving a speech signal having a real component; filtering the speech signal so as to generate a plurality of filtered signals such that the real component and an imaginary component of the speech signal are reconstructed; and generating a first estimated frequency and a first estimated bandwidth of a speech resonance of the speech signal based on both a first filtered signal of the plurality of filtered signals and a single-lag delay of the first filtered signal.
    Type: Application
    Filed: December 1, 2009
    Publication date: June 2, 2011
    Inventors: John P. Kroeker, Janet Slifka, Richard S. McGowan
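    Illustrative sketch (not taken from the application): one standard way to estimate frequency and bandwidth from a complex filtered signal and its single-lag delay is to treat the ratio of consecutive samples as a single pole. The phase of the ratio gives the frequency and its magnitude gives the bandwidth. Whether the application uses exactly this estimator is an assumption, and the test signal is synthetic.

```python
import cmath
import math

def resonance_estimate(current, previous, sample_rate):
    """Estimate (frequency_hz, bandwidth_hz) from two consecutive complex samples."""
    ratio = current / previous
    # Phase advance per sample -> instantaneous frequency.
    frequency = cmath.phase(ratio) * sample_rate / (2.0 * math.pi)
    # Amplitude decay per sample -> instantaneous bandwidth.
    bandwidth = -math.log(abs(ratio)) * sample_rate / math.pi
    return frequency, bandwidth

# Synthetic resonance at 500 Hz with 80 Hz bandwidth, sampled at 8 kHz.
fs, f0, bw = 8000.0, 500.0, 80.0
pole = cmath.exp(complex(-math.pi * bw, 2.0 * math.pi * f0) / fs)
samples = [pole ** n for n in range(3)]
estimate = resonance_estimate(samples[2], samples[1], fs)
```

    For this noiseless single resonance the estimate recovers the frequency and bandwidth used to build the signal, up to floating-point rounding.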
  • Publication number: 20110125506
    Abstract: A method for optimization of rate-distortion for Advanced Audio Coding (AAC). The method provides for the identification of quantized spectral coefficient sequences for optimization of rate-distortion. The method also provides joint optimization of scale factors, Huffman codebooks and quantized spectral coefficient sequences for minimization of a rate-distortion cost. The method provides an iterative rate-distortion optimization algorithm for AAC encoding. In each iteration, the method first finds the optimal scale factors and quantized spectral coefficients when Huffman codebooks are fixed, then updates Huffman codebooks and quantized spectral coefficients given the optimized scale factors. The iterations may be applied until a predetermined threshold is attained.
    Type: Application
    Filed: November 26, 2009
    Publication date: May 26, 2011
    Applicant: RESEARCH IN MOTION LIMITED
    Inventors: Guixing WU, En-hui YANG, Longji WANG
  • Publication number: 20110123965
    Abstract: This invention relates to the field of tonal language speech signal processing. We describe a computer system for characterizing samples of a tonal language. These are analyzed to identify one or more vocal tract characterizing parameters of the user and synthesized speech data is generated by modifying a variation of fundamental frequency with time using a set of standard tones. The synthesized speech data represents the user speaking the tonal language with the modified fundamental frequency. Graphical feedback to guide the user can also be provided.
    Type: Application
    Filed: November 22, 2010
    Publication date: May 26, 2011
    Inventor: Kai Yu
  • Publication number: 20110125502
    Abstract: A method of putting identification codes in a document is disclosed. The method adds a speech-purpose print code in a document such that an OID pen can emit sound after the OID pen reads the speech-purpose print code. The software program first acquires the position of each word in the document and then automatically puts a speech-purpose print code corresponding to each word in the position of each word so that a user can rapidly generate a document with speech-purpose codes.
    Type: Application
    Filed: August 30, 2010
    Publication date: May 26, 2011
    Applicant: KUO-PING YANG
    Inventors: Mardianto Soebagio Hadiputro, Kun-Yi Hua, Hwa-Pey Wang, Chih-Kang Yang, Kuo-Ping Yang
  • Publication number: 20110119062
    Abstract: A control system is operable within a host vehicle to control the operation of signaling apparatus indicative of a driver intent to execute right, left or U-turn actions. The control system includes a voice recognition circuit for activating turn signal devices within the vehicle. In some embodiments, a wireless link facilitates aftermarket applications while in other embodiments original equipment manufacture is accommodated.
    Type: Application
    Filed: August 24, 2010
    Publication date: May 19, 2011
    Inventor: Jewel L. Dohan
  • Publication number: 20110119061
    Abstract: A method and system for enhancing dialog determined by an audio input signal. In some embodiments the input signal is a stereo signal, and the system includes an analysis subsystem configured to analyze the stereo signal to generate filter control values, and a filtering subsystem including upmixing circuitry configured to upmix the input signal to generate a speech channel and non-speech channels and a peaking filter configured to filter the speech channel to enhance dialog while being steered by at least one of the control values. The filtering subsystem also includes ducking circuitry for attenuating the non-speech channels while being steered by at least some of the control values, and downmixing circuitry configured to combine outputs of the peaking filter and ducking circuitry to generate a filtered stereo output.
    Type: Application
    Filed: November 15, 2010
    Publication date: May 19, 2011
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventor: Charles Phillip Brown
  • Publication number: 20110119046
    Abstract: An example sentence selection unit selects an example sentence from a template database based on an instruction received by an input unit. A translation output unit causes a display unit to display the example sentence selected by the example sentence selection unit and a translation of the example sentence. In addition, the translation output unit causes the display unit to display a designation sign designating a variable section in association with the variable section of the example sentence selected by the example sentence selection unit. Further, when the input unit receives input of a character corresponding to the designation sign, the translation output unit causes the display unit to display word candidates that can replace the variable section corresponding to the input character.
    Type: Application
    Filed: July 23, 2009
    Publication date: May 19, 2011
    Inventors: Naoko Shinozaki, Toshiyuki Okunishi, Koichi Sugiyama
  • Publication number: 20110115977
    Abstract: Provided is an enhanced television system and method including a television receiver in communication with a broadcast reception tuner. The television receiver is configured to receive and display video data from a video stream and enhancement data from the reception tuner. The video stream includes embedded base programming identification metadata, and the television receiver is further configured to extract a base identification tag from the embedded base programming identification metadata, and combine enhancement data received from the reception tuner associated with the base identification tag with the video stream. The video stream may then be displayed.
    Type: Application
    Filed: November 13, 2009
    Publication date: May 19, 2011
    Applicant: Triveni Digital
    Inventors: Mark Simpson, Richard Chernock
  • Publication number: 20110112831
    Abstract: A method and computing system for suppressing noise in an audio signal, comprising: receiving the audio signal at signal processing means; determining that another signal is input to the signal processing means, the input signal resulting from an activity which generates noise in the audio signal; and selectively suppressing noise in the audio signal in dependence on the determination that the input signal is input to the signal processing means to thereby suppress the generated noise in the audio signal.
    Type: Application
    Filed: June 23, 2010
    Publication date: May 12, 2011
    Applicant: Skype Limited
    Inventors: Karsten Vandborg Sorensen, Jon Bergenheim, Koen Vos
  • Publication number: 20110106534
    Abstract: A computer-implemented method includes receiving spoken input at a computing device from a user of the computing device, the spoken input including a carrier phrase and a subject to which the carrier phrase is directed, providing at least a portion of the spoken input to a server system in audio form for speech-to-text conversion by the server system, the portion including the subject to which the carrier phrase is directed, receiving from the server system instructions for automatically performing an operation on the computing device, the operation including an action defined by the carrier phrase using parameters defined by the subject, and automatically performing the operation on the computing device.
    Type: Application
    Filed: October 28, 2010
    Publication date: May 5, 2011
    Inventors: Michael J. LeBeau, John Nicholas Jitkoff
  • Publication number: 20110106547
    Abstract: An audio signal can be encoded efficiently while its high-register components are maintained, preventing deterioration of the sound quality of the decoded signal. A digital audio signal is divided into a plurality of frequency bands, and the signal in each band is approximated by a function. The parameters of each approximating function are then encoded. During decoding, the function parameters of each band are used to perform function interpolation, the interpolated signals of the bands are synthesized, and the signal is decoded. Thus, by suitably setting the function equation used to approximate each band, it is possible to perform an encoding process that maintains the high-register components and a compression-coding process that enables reproduction with very good sound quality.
    Type: Application
    Filed: June 3, 2009
    Publication date: May 5, 2011
    Applicant: Japan Science and Technology Agency
    Inventors: Kazuo Toraichi, Mitsuteru Nakamura, Yasuo Morooka
  • Publication number: 20110099009
    Abstract: A communications network is used to transfer user attribute information about participants in a communication session to their respective communication terminals for storage and use thereon to configure a speech codec to operate in a speaker-dependent manner, thereby improving speech coding efficiency. In a network-assisted model, the user attribute information is stored on the communications network and selectively transmitted to the communication terminals while in a peer-assisted model, the user attribute information is derived by and transferred between communication terminals.
    Type: Application
    Filed: October 11, 2010
    Publication date: April 28, 2011
    Applicant: BROADCOM CORPORATION
    Inventors: Robert W. Zopf, Kelly Hale
  • Publication number: 20110095873
    Abstract: A method and system for configuring a universal remote control (URC) to control a remote-controlled device includes establishing a communication link between the URC and the remote-controlled device in response to detecting a gesture motion of the URC. Device information may be received from the remote-controlled device and used by the URC to program the URC to control the remote-controlled device. The URC may be configured to control a plurality of remote-controlled devices. The communication link may be a near field wireless communication link.
    Type: Application
    Filed: October 26, 2009
    Publication date: April 28, 2011
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: James Pratt, Marc Sullivan
  • Publication number: 20110099015
    Abstract: Systems, methods and apparatuses are described for deriving and updating user attribute information about users of a communications system. A communications network is then used to transfer the user attribute information to communication terminals, which use the user attribute information to configure a speech codec to operate in a speaker-dependent manner during a communication session, thereby improving speech coding efficiency. In a network-assisted model, the user attribute information is stored on the communications network and selectively transmitted to the communication terminals while in a peer-assisted model, the user attribute information is derived by and transferred between communication terminals.
    Type: Application
    Filed: September 21, 2010
    Publication date: April 28, 2011
    Applicant: BROADCOM CORPORATION
    Inventor: Robert W. Zopf
  • Publication number: 20110099157
    Abstract: A computer-implemented method for information sharing between computers includes receiving at a computer system a search request from a first computer, generating with the computer system one or more search results that are responsive to the first computer, formatting the results for display on a second computer that is different than the first computer, and automatically providing the results for display on the second computer.
    Type: Application
    Filed: October 28, 2010
    Publication date: April 28, 2011
    Inventors: Michael J. LeBeau, John Nicholas Jitkoff
  • Publication number: 20110099004
    Abstract: A method for determining an upperband speech signal from a narrowband speech signal is disclosed. A list of narrowband line spectral frequencies (LSFs) is determined from the narrowband speech signal. A first pair of adjacent narrowband LSFs that have a lower difference between them than every other pair of adjacent narrowband LSFs in the list is determined. A first feature that is a mean of the first pair of adjacent narrowband LSFs is determined. Upperband LSFs are determined based on at least the first feature using codebook mapping.
    Type: Application
    Filed: October 22, 2010
    Publication date: April 28, 2011
    Applicant: QUALCOMM Incorporated
    Inventors: Venkatesh Krishnan, Daniel J. Sinder, Ananthapadmanabhan Arasanipalai Kandhadai
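    Illustrative sketch (not taken from the application): the feature extraction and codebook lookup described in the abstract can be pictured as follows. The LSF values, the codebook entries, and the nearest-neighbor lookup rule are hypothetical; the abstract says only that upperband LSFs are determined from the feature "using codebook mapping".

```python
def closest_pair_mean(lsfs):
    """Mean of the adjacent LSF pair with the smallest spacing."""
    gaps = [(lsfs[i + 1] - lsfs[i], i) for i in range(len(lsfs) - 1)]
    _, i = min(gaps)
    return 0.5 * (lsfs[i] + lsfs[i + 1])

def map_to_upperband(feature, codebook):
    """Return the upperband LSFs stored with the nearest codebook feature."""
    entry = min(codebook, key=lambda item: abs(item[0] - feature))
    return entry[1]

# Hypothetical sorted narrowband LSFs and a tiny hypothetical codebook of
# (feature, upperband LSFs) pairs.
narrowband_lsfs = [0.25, 0.50, 0.60, 1.10, 1.60, 2.20]
codebook = [(0.3, [2.6, 2.9]), (0.6, [2.5, 3.0]), (1.5, [2.4, 2.8])]

feature = closest_pair_mean(narrowband_lsfs)
upperband = map_to_upperband(feature, codebook)
```

    Here the closest adjacent pair is 0.50 and 0.60, so the feature is their mean, and the lookup selects the codebook entry whose feature is nearest to it.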
  • Publication number: 20110093270
    Abstract: A method includes identifying a first syllable in a first audio of a first word and a second syllable in a second audio of a second word, the first syllable having a first set of properties and the second syllable having a second set of properties; detecting the first syllable in a first instance of the first word in an audio file, the first syllable in the first instance having a third set of properties; determining one or more transformations for transforming the first set of properties to the third set of properties; applying the one or more transformations to the second set of properties of the second syllable to yield a transformed second syllable; and replacing the first syllable in the first instance of the first word with the transformed second syllable in the audio file.
    Type: Application
    Filed: October 16, 2009
    Publication date: April 21, 2011
    Applicant: Yahoo! Inc.
    Inventor: Narayan Lakshmi BHAMIDIPATI
  • Publication number: 20110082698
    Abstract: Devices, methods and systems for improving and adjusting voice volume and body movements during a performance are disclosed. Device embodiments may be configured with a processor, microphone, one or more movement sensors and at least a display or a speaker. The processor may include instructions configured to receive at least one of sound input from the microphone and movement data from the one or more accelerometers, generate one or more input levels corresponding to at least one of the sound input and movement data, compare the one or more generated input levels to one or more predefined input levels, associate the one or more predefined input levels with at least one of a color, text, graphic or audio file and present at least one of the color, text, graphic or audio file to a user of the device.
    Type: Application
    Filed: October 1, 2010
    Publication date: April 7, 2011
    Inventor: Zev Rosenthal
  • Publication number: 20110077941
    Abstract: Techniques for assigning a spoken tag in a telecom web platform are provided. The techniques include receiving a spoken tag, comparing the spoken tag to a set of one or more template tags, if the spoken tag is a match to a template tag, assigning the spoken tag and updating frequency of the tag in the set of one or more template tags, and if the spoken tag is not a match to a template tag, assigning the spoken tag and registering the spoken tag as a new tag in the set of one or more template tags.
    Type: Application
    Filed: September 30, 2009
    Publication date: March 31, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Kuntal Dey, Anupam Jain, Arun Kumar, Natwar Modani, Amit Anil Nanavati, Nitendra Rajput
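    Illustrative sketch (not taken from the application): the match-or-register flow can be pictured as below. Text strings stand in for spoken tags, and the similarity measure (`difflib.SequenceMatcher`) and the 0.8 threshold are assumptions; a real system would compare audio rather than spellings.

```python
from difflib import SequenceMatcher

def assign_tag(spoken, templates, threshold=0.8):
    """Assign a tag, bumping a matching template's count or registering a new one."""
    best, best_score = None, 0.0
    for template in templates:
        score = SequenceMatcher(None, spoken, template).ratio()
        if score > best_score:
            best, best_score = template, score
    if best is not None and best_score >= threshold:
        # Match: update the frequency of the existing template tag.
        templates[best] += 1
        return best
    # No match: register the spoken tag as a new template tag.
    templates[spoken] = 1
    return spoken

# Template tags mapped to how often each has been used (hypothetical data).
templates = {"weather": 3}
first = assign_tag("wether", templates)    # close enough to "weather"
second = assign_tag("cricket", templates)  # no close template
```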
  • Publication number: 20110077944
    Abstract: A speech recognition module includes an acoustic front-end module, a sound detection module, and a word detection module. The acoustic front-end module generates a plurality of representations of frames from a digital audio signal and generates speech characteristic probabilities for the plurality of frames. The sound detection module determines a plurality of estimated utterances from the plurality of representations and the speech characteristic probabilities. The word detection module determines one or more words based on the plurality of estimated utterances and the speech characteristic probabilities.
    Type: Application
    Filed: November 30, 2009
    Publication date: March 31, 2011
    Applicant: BROADCOM CORPORATION
    Inventor: Nambirajan Seshadri
  • Publication number: 20110077946
    Abstract: A method including: obtaining, via a plurality of communication devices, a plurality of speech signals respectively associated with human speakers, the speech signals including verbal components and non-verbal components; identifying a plurality of geographical locations, each geographic location associated with a respective one of the plurality of the communication devices; extracting the non-verbal components from the obtained speech signals; deducing physiological or psychological conditions of the human speakers by analyzing, over a specified period, the extracted non-verbal components, using predefined relations between characteristics of the non-verbal components and physiological or psychological conditions of the human speakers; and providing a geographical distribution of the deduced physiological or psychological conditions of the human speakers by associating the deduced physiological or psychological conditions of the human speakers with geographical locations thereof.
    Type: Application
    Filed: September 30, 2009
    Publication date: March 31, 2011
    Applicant: International Business Machines Corporation
    Inventors: Slava Shectman, Raphael Steinberg
  • Publication number: 20110071837
    Abstract: According to one embodiment, an audio signal correction apparatus has a characteristic extraction module configured to determine whether an input audio signal is a monaural signal or a stereo signal, on the basis of channel information, and to extract a plurality of characteristic parameters for determining whether the input audio signal is a speech signal or a music signal; a signal type determination module configured to calculate a speech/music discrimination score which indicates whether the input audio signal is close to the speech signal or the music signal, on the basis of the plurality of characteristic parameters; and a level calculation module configured to calculate, with use of the speech/music discrimination score, output levels of a degree of speech and a degree of music.
    Type: Application
    Filed: May 3, 2010
    Publication date: March 24, 2011
    Inventors: Hiroshi Yonekubo, Hirokazu Takeuchi
  • Publication number: 20110071821
    Abstract: Embodiments of the invention provide a communication device and methods for enhancing audio signals. A first audio signal buffer and a second audio signal buffer are acquired. Thereafter, the second audio signal is processed based on the linear predictive coding coefficients and gains based on noise power of the first audio signal to generate an enhanced second audio signal.
    Type: Application
    Filed: November 15, 2010
    Publication date: March 24, 2011
    Inventors: Alon Konchitsky, Sandeep Kulakcherla, Alberto D. Berstein
  • Publication number: 20110066434
    Abstract: The invention can recognize all languages and input words. It needs m unknown voices to represent m categories of known words with similar pronunciations. Words can be pronounced in any language, dialect or accent. Each will be classified into one of m categories represented by its most similar unknown voice. When a user pronounces a word, the invention finds its F most similar unknown voices. All words in the F categories represented by the F unknown voices will be arranged according to their pronunciation similarity and alphabetic letters. The pronounced word should be among the top words. Since we only find the F most similar unknown voices from m (=500) unknown voices and since the same word can be classified into several categories, our recognition method is stable for all users and can quickly and accurately recognize all languages (English, Chinese, etc.) and input many more words without using samples.
    Type: Application
    Filed: September 29, 2009
    Publication date: March 17, 2011
    Inventors: Tze-Fen LI, Tai-Jan Lee Li, Shih-Tzung Li, Shih-Hon Li, Li-Chuan Liao
  • Publication number: 20110066440
    Abstract: A method of encoding a time-domain audio signal is presented. A device transforms the time-domain signal into a frequency-domain signal including a sequence of sample blocks, wherein each block includes a coefficient for each of multiple frequencies. The coefficients of each block are grouped into frequency bands. For each frequency band of each block, a scale factor is estimated for the band, and the energy of the band for the block is compared with the energy of the band of an adjacent sample block, wherein the blocks may be adjacent to each other in either or both of an interchannel and a temporal sense. If the ratio of the band energy for the first block to the band energy for the adjacent block is less than some value, the scale factor of the band for the first block is increased. The coefficients of the band for each block are quantized based on the resulting scale factor. The encoded audio signal is generated based on the quantized coefficients and the scale factors.
    Type: Application
    Filed: September 11, 2009
    Publication date: March 17, 2011
    Applicant: SLING MEDIA PVT LTD
    Inventor: Nandury V. Kishore
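    Illustrative sketch (not part of the patent record): the scale-factor adjustment described in the abstract above might look roughly like the following. The function name, the `ratio_threshold` of 0.5, and the multiplicative `boost` are hypothetical choices made for illustration; the abstract only says the scale factor is increased when the energy ratio is "less than some value".

```python
def adjust_scale_factors(energies, adjacent_energies, scale_factors,
                         ratio_threshold=0.5, boost=2.0):
    """Raise a band's scale factor when its energy is small relative to
    the same band in an adjacent block (temporal or interchannel).

    ratio_threshold and boost are assumed values, not taken from the patent.
    """
    adjusted = []
    for energy, adjacent, scale in zip(energies, adjacent_energies, scale_factors):
        # Compare this block's band energy with the adjacent block's band energy.
        if adjacent > 0 and energy / adjacent < ratio_threshold:
            scale *= boost  # coarser quantization for the relatively quiet band
        adjusted.append(scale)
    return adjusted


# Band 0 is much quieter than its neighbour, band 1 is not.
print(adjust_scale_factors([1.0, 8.0], [10.0, 8.0], [0.5, 0.5]))
```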
  • Publication number: 20110067059
    Abstract: Systems and methods to control media are disclosed. A particular method includes receiving a speech input at a mobile communications device. The speech input is processed to generate audio data. The audio data is sent, via a mobile data network, to a first server. The first server processes the audio data to generate text based on the audio data. Data related to the text is received from the first server. One or more commands are sent to a second server via the mobile data network. In response to the one or more commands, the second server sends control signals based on the one or more commands to a media controller. The control signals cause the media controller to control multimedia content displayed via a display device.
    Type: Application
    Filed: December 22, 2009
    Publication date: March 17, 2011
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Michael Johnston, Hisao M. Chang, Giuseppe Di Fabbrizio, Thomas Okken, Bernard S. Renger
  • Publication number: 20110060583
    Abstract: Provided are an automatic translation system based on structured translation memory and an automatic translation method using the same. In the automatic translation system, a translation memory establishment module changes a predetermined language pattern into a part translation pattern and registers the changed part translation pattern in a structured translation memory. A sentence unit translation module performs a translation of the sentence unit on an input sentence on the basis of the translation memory. A part combination translation module analyzes a structure of a language pattern less than the sentence unit which is included in the input sentence, searches the registered part translation pattern which is matched with the analyzed language pattern on the basis of the translation memory, and combines the searched part translation pattern to output a translation corresponding to the input sentence.
    Type: Application
    Filed: December 23, 2009
    Publication date: March 10, 2011
    Applicant: Electronics and Telecommunications Research Institute
    Inventors: Sung Kwon CHOI, Ki Young Lee, Yoon Hyung Roh, Oh Woog Kwon, Chang Hyun Kim, Young Ae Seo, Seong II Yang, Yun Jin, Jinxia Huang, Yingshun Wu, Changhao Yin, Eun Jin Park, Young Kil Kim, Sang Kyu Park
  • Publication number: 20110058189
    Abstract: According to an aspect of the invention, an information processing apparatus includes a process instruction receiving module, a storage module, and a process state output module. The process instruction receiving module receives a process instruction from a user. The storage module stores the process instruction and language attribute information in association with each other, the language attribute information designating a language for outputting a process state of the process instruction. The process state output module outputs information of the process state in the language designated by the language attribute information associated with the process instruction.
    Type: Application
    Filed: February 25, 2010
    Publication date: March 10, 2011
    Applicant: FUJI XEROX CO., LTD.
    Inventor: Michio KUWAMURA
  • Publication number: 20110060598
    Abstract: The present invention is based on the finding that parameters including: a first set of parameters of a representation of a first portion of an original signal and a second set of parameters of a representation of a second portion of the original signal can be efficiently encoded when the parameters are arranged in a first sequence of tuples and a second sequence of tuples. The first sequence of tuples includes tuples of parameters having two parameters from a single portion of the original signal and the second sequence of tuples includes tuples of parameters having one parameter from the first portion and one parameter from the second portion of the original signal. A bit estimator estimates the number of necessary bits to encode the first and the second sequence of tuples. Only the sequence of tuples, which results in the lower number of bits, is encoded.
    Type: Application
    Filed: November 17, 2010
    Publication date: March 10, 2011
    Applicant: FRAUNHOFER-GESELLSCHAFT ZUR FORDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
    Inventors: RALPH SPERSCHNEIDER, JÜRGEN HERRE, KARSTEN LINZMEIER, JOHANNES HILPERT
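    Illustrative sketch (not part of the patent record): the choice between the two tuple arrangements described in the abstract above could be outlined as follows. The bit estimator here is a toy stand-in (one bit plus the bit length of the difference within each tuple) and is purely an assumption; the abstract does not specify the entropy code. Both parameter sets are assumed to have the same, even, length.

```python
def estimate_bits(tuples):
    """Toy bit estimator (assumed): 1 bit plus the bit length of |x - y|."""
    return sum(1 + abs(x - y).bit_length() for x, y in tuples)


def choose_tuple_sequence(first, second):
    """Build both tuple sequences and keep the one that needs fewer bits.

    first, second: integer parameters of the first and second signal portion.
    Returns (label, tuples, estimated_bits).
    """
    # First sequence: each tuple holds two parameters from a single portion.
    within = [(p[i], p[i + 1]) for p in (first, second)
              for i in range(0, len(p), 2)]
    # Second sequence: each tuple holds one parameter from each portion.
    across = list(zip(first, second))
    bits_within, bits_across = estimate_bits(within), estimate_bits(across)
    if bits_within <= bits_across:
        return "within", within, bits_within
    return "across", across, bits_across


label, tuples, bits = choose_tuple_sequence([1, 7, 3, 9], [1, 7, 3, 9])
print(label, bits)
```

    With two identical portions, pairing across portions gives zero differences, so the second arrangement is selected under this toy estimator.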
  • Publication number: 20110060599
    Abstract: Methods and apparatuses for encoding and decoding an audio signal are provided, a method of encoding an audio signal including: receiving the audio signal including information about a moving sound source; receiving position information about the moving sound source; generating dynamic track information indicating motion of the moving sound source by using the position information; and encoding the audio signal and the dynamic track information.
    Type: Application
    Filed: April 16, 2009
    Publication date: March 10, 2011
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Hyun-Wook Kim, Chul-Woo Lee, Jong-Hoon Jeong, Nam-Suk Lee, Han-Gil Moon, Sang-Hoon Lee
  • Publication number: 20110053655
    Abstract: A system and method for converting a note-based audio object to a Pulse Code Modulated (PCM) audio format is disclosed. An electronic computer device includes a memory containing a note-based audio object and a lookup table, the note-based audio object containing note frequency information. A processor is configured for converting the note-based audio object to a Pulse Code Modulated (PCM) stream having a plurality of sample points, the converting including: generating a PCM value for each sample point based upon the note frequency and the trigonometric function evaluations, selecting an entry from the look-up table based upon the note frequency information, a sampling frequency, and a sample point number, and determining a step size within the look-up table based upon a ratio between the note frequency information and the sampling frequency.
    Type: Application
    Filed: November 10, 2010
    Publication date: March 3, 2011
    Applicant: RESEARCH IN MOTION LIMITED
    Inventor: Rodney Bylsma
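    Illustrative sketch (not part of the patent record): a common way to realize the look-up-table conversion described in the abstract above is to precompute one sine period and step through it at a rate set by the ratio of note frequency to sampling frequency. The table size of 256, the 16-bit amplitude, and all function names are assumptions made for this example.

```python
import math


def build_sine_table(size=256):
    """Precompute one full period of a sine wave (table size is assumed)."""
    return [math.sin(2 * math.pi * i / size) for i in range(size)]


def note_to_pcm(note_hz, sample_rate, num_samples, table, amplitude=32767):
    """Convert a single note frequency to signed 16-bit PCM sample values."""
    size = len(table)
    # Step size: table entries advanced per output sample, set by the
    # ratio between the note frequency and the sampling frequency.
    step = size * note_hz / sample_rate
    pcm = []
    for n in range(num_samples):
        # Entry chosen from note frequency, sampling frequency, and sample number.
        index = int(n * step) % size
        pcm.append(int(round(amplitude * table[index])))
    return pcm


table = build_sine_table()
print(note_to_pcm(2000, 8000, 4, table))
```

    Truncating the index, as done here, trades some accuracy for speed; interpolating between neighbouring table entries would be a possible refinement.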
  • Publication number: 20110051557
    Abstract: A device controller configured to control physical devices using an audible humming frequency that includes: a humming frequency module, a humming command module, and a control command module. The humming frequency module may be configured to determine humming frequenc(ies) using a detected humming signal(s). The humming command module may be configured to compute humming command(s) based on the humming frequenc(ies). The control command module may be configured to generate control command(s) using received key command(s) and humming command(s).
    Type: Application
    Filed: August 26, 2009
    Publication date: March 3, 2011
    Inventors: Nathalia Peixoto, Gregory Gutt, Hossein Ghaffari Nik
  • Publication number: 20110054915
    Abstract: The present invention relates to computing circuits and a method for efficiently running an MPEG-2 AAC or MPEG-4 AAC algorithm, which is used as an audio compression algorithm in multi-channel high-quality audio systems, on programmable processors. In accordance with the present invention, the IMDCT process, which accounts for a large part of the operations in an implementation of an MPEG-2/4 AAC algorithm, can be performed efficiently. In addition, while the architecture of the existing digital signal processor is still used, performance can be improved by adding an address generator, a Huffman decoder, and a bit processing architecture. As a result, designing and modifying the programmable processor is facilitated.
    Type: Application
    Filed: September 13, 2010
    Publication date: March 3, 2011
    Applicant: PULSUS TECHNOLOGIES
    Inventors: Jong Hoon OH, Myung Hoon SUNWOO, Jong Ha MOON
  • Publication number: 20110054908
    Abstract: An image processing system includes an information processing apparatus and an image processing apparatus connected to each other via a network. The information processing apparatus has an application installed thereon to give a new function to the image processing apparatus. The image processing apparatus transmits to the information processing apparatus, voice data obtained by a microphone of the image processing apparatus and data set via an operation screen customized according to the application. The information processing apparatus determines answer information indicating an action to be taken by the image processing apparatus, based on the received voice data, a dictionary owned by the application and the data set via the operation screen, and then transmits the determined answer information to the image processing apparatus. The image processing apparatus takes an action according to the answer information received therefrom.
    Type: Application
    Filed: August 20, 2010
    Publication date: March 3, 2011
    Applicant: KONICA MINOLTA BUSINESS TECHNOLOGIES, INC
    Inventors: Hideyuki MATSUDA, Kazumi SAWAYANAGI, Toshihiko OTAKE, Masao HOSONO
  • Publication number: 20110043652
    Abstract: A system and method for automatically providing content associated with captured information is described. In some examples, the system receives input by a user, and automatically provides content or links to content associated with the input. In some examples, the system receives input via text entry or by capturing text from a rendered document, such as a printed document, an object, an audio stream, and so on.
    Type: Application
    Filed: March 12, 2010
    Publication date: February 24, 2011
    Inventors: Martin T. King, Redwood Stephens, Claes-Fredrik Mannby, Jesse Peterson, Mark Sanvitale, Michael J. Smith, Christopher J. Daley-Watson
  • Publication number: 20110046966
    Abstract: A method of encoding a time-domain audio signal is presented. In the method, an electronic device receives the time-domain audio signal. The time-domain audio signal is transformed into a frequency-domain signal including a coefficient for each of a plurality of frequencies, which are grouped into frequency bands. For each frequency band, the energy of the band is determined, a scale factor for the band is determined based on the energy of the band, and the coefficients of the band are quantized based on the associated scale factor. The encoded audio signal is generated based on the quantized coefficients and the scale factors.
    Type: Application
    Filed: August 24, 2009
    Publication date: February 24, 2011
    Applicant: SLING MEDIA PVT LTD
    Inventor: Laxminarayana M. Dalimba
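    Illustrative sketch (not part of the patent record): the per-band steps in the abstract above (determine band energy, derive a scale factor from it, quantize the coefficients) could be outlined as below. The specific rule used here, scale factor equal to one quarter of the band's RMS value, is an assumption chosen only to make the example concrete; the time-to-frequency transform is omitted and the coefficients are taken as given.

```python
import math


def encode_bands(coeffs, band_edges):
    """Quantize frequency-domain coefficients band by band.

    band_edges lists the start index of each band plus the final end index.
    Returns (scale_factors, quantized) with one entry per band.
    """
    scale_factors, quantized = [], []
    for start, end in zip(band_edges[:-1], band_edges[1:]):
        band = coeffs[start:end]
        energy = sum(c * c for c in band)        # energy of the band
        rms = math.sqrt(energy / len(band)) if band else 0.0
        # Assumed rule: scale factor is a quarter of the band RMS.
        scale = rms / 4 if rms > 0 else 1.0
        scale_factors.append(scale)
        quantized.append([round(c / scale) for c in band])
    return scale_factors, quantized


coeffs = [4.0, 4.0, -4.0, 4.0, 0.5, -0.5, 0.5, -0.5]
print(encode_bands(coeffs, [0, 4, 8]))
```

    A decoder would multiply each quantized value by its band's scale factor to approximate the original coefficients.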