Modification Of At Least One Characteristic Of Speech Waves (epo) Patents (Class 704/E21.001)

E Subclasses

Speech enhancement, e.g., noise reduction, echo cancellation, etc. (epo) (Class 704/E21.002)

Time compression or expansion (epo) (Class 704/E21.017)

Suppression or repetition of time signal segments (EPO) (Class 704/E21.018)

Transformation of speech into a nonaudible representation, e.g., speech visualization, speech processing for tactile aids, etc. (epo) (Class 704/E21.019)

Synchronization of speech with image or synthesis of the lips movement from speech, e.g., for "talking heads," etc. (EPO) (Class 704/E21.02)

Electronic Devices with Voice Command and Contextual Data Processing Capabilities

Publication number: 20120330661

Abstract: An electronic device may capture a voice command from a user. The electronic device may store contextual information about the state of the electronic device when the voice command is received. The electronic device may transmit the voice command and the contextual information to computing equipment such as a desktop computer or a remote server. The computing equipment may perform a speech recognition operation on the voice command and may process the contextual information. The computing equipment may respond to the voice command. The computing equipment may also transmit information to the electronic device that allows the electronic device to respond to the voice command.

Type: Application

Filed: September 5, 2012

Publication date: December 27, 2012

Inventor: Aram M. Lindahl
VOICE DATA TRANSFERRING DEVICE, TERMINAL DEVICE, VOICE DATA TRANSFERRING METHOD, AND VOICE RECOGNITION SYSTEM

Publication number: 20120330651

Abstract: A voice data transferring device intermediates between an in-vehicle terminal and a voice recognition server. In order to check a change in voice recognition performance of the voice recognition server, the voice data transferring device performs a noise suppression processing on a voice data for evaluation in a noise suppression module; transmits the voice data for evaluation to the voice recognition server; and receives a recognition result thereof. The voice data transferring device sets a value of a noise suppression parameter used for a noise suppression processing or a value of a result integration parameter used for a processing of integrating a plurality of recognition results acquired from the voice recognition server, at an optimum value, based on the recognition result of the voice recognition server. This makes it possible to set a suitable parameter even if the voice recognition performance of the voice recognition server changes.

Type: Application

Filed: June 22, 2012

Publication date: December 27, 2012

Inventors: Yasunari Obuchi, Takeshi Homma
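
The parameter-setting step in the abstract of publication 20120330651 can be pictured as a small search loop: try each candidate noise suppression value on the evaluation voice data, send the result to the recognizer, and keep the value that scores best. The sketch below is only an illustration under assumptions. The function names, the toy noise gate, and the toy recognizer are hypothetical stand-ins, not the patented processing or any real server API.

```python
def pick_noise_suppression_level(candidates, evaluation_set, suppress, recognize):
    """Return the candidate parameter with the most correct recognitions.

    candidates:     parameter values to try (hypothetical).
    evaluation_set: list of (audio_samples, expected_transcript) pairs.
    suppress:       function(audio, level) -> processed audio.
    recognize:      function(audio) -> transcript (stands in for the server).
    """
    best_level, best_correct = candidates[0], -1
    for level in candidates:
        correct = sum(
            recognize(suppress(audio, level)) == transcript
            for audio, transcript in evaluation_set
        )
        if correct > best_correct:
            best_level, best_correct = level, correct
    return best_level


# Toy stand-ins so the sketch runs without a real recognition server.
def toy_suppress(audio, level):
    # Crude noise gate: zero out samples whose magnitude is below the level.
    return [s if abs(s) >= level else 0.0 for s in audio]


def toy_recognize(audio):
    # Pretend the server hears "yes" when exactly one sample survives the gate.
    return "yes" if sum(1 for s in audio if s != 0.0) == 1 else "no"


evaluation = [([0.1, 0.9, 0.2], "yes"), ([0.05, 0.8, 0.3], "yes")]
best = pick_noise_suppression_level(
    [0.0, 0.25, 0.5], evaluation, toy_suppress, toy_recognize
)
print(best)  # 0.5
```

Re-running the same loop whenever the server's behaviour is suspected to have changed is one way to read the abstract's closing sentence; how often to re-evaluate is left open here.
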
Hierarchical Audio Frequency Encoding and Decoding Method and System, Hierarchical Frequency Encoding and Decoding Method for Transient Signal

Publication number: 20120323582

Abstract: Hierarchical audio coding and decoding method and system and hierarchical audio coding and decoding method for transient signals are provided. In the present invention, by introducing a processing method for transient signal frames in the hierarchical audio coding and decoding methods, a segmented time-frequency transform is performed on the transient signal frames, and then the frequency-domain coefficients obtained by transformation are rearranged respectively within the core layer and within the extended layer, so as to perform the same subsequent coding processes, such as bit allocation, frequency-domain coefficient coding, etc., as those on the steady-state signal frames, thus enhancing the coding efficiency of the transient signal frames and improving the quality of the hierarchical audio coding and decoding.

Type: Application

Filed: January 12, 2011

Publication date: December 20, 2012

Inventors: Ke Peng, Guoming Chen, Hao Yuan, Dongping Jiang, Jiali Li
EDITING TELECOM WEB APPLICATIONS THROUGH A VOICE INTERFACE

Publication number: 20120323580

Abstract: Systems and associated methods for editing telecom web applications through a voice interface are described. Systems and methods provide for editing telecom web applications over a connection, as for example accessed via a standard phone, using speech and/or DTMF inputs. The voice based editing includes exposing an editing interface to a user for a telecom web application that is editable, dynamically generating a voice-based interface for a given user for accomplishing editing tasks, and modifying the telecom web application to reflect the editing commands entered by the user.

Type: Application

Filed: August 28, 2012

Publication date: December 20, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Sheetal K. Agarwal, Arun Kumar, Priyanka Manwani
Wheelchair System Having Voice Activated Menu Navigation And Auditory Feedback

Publication number: 20120316884

Abstract: A personal mobility vehicle, such as a wheelchair system, includes an input audio transducer having an output coupled to a speech recognition system and an output audio transducer having an input coupled to a speech synthesis system. The wheelchair system further includes a control unit having a data processor and a memory. The data processor is coupled to the speech recognition system and to the speech synthesis system and is operable in response to a recognized utterance made by a user to present the user with a menu containing wheelchair system functions. The data processor is further configured in response to at least one further recognized utterance made by the user to select from the menu at least one wheelchair system function, to activate the selected function and to provide audible feedback to the user via the speech synthesis system.

Type: Application

Filed: June 4, 2012

Publication date: December 13, 2012

Inventors: Michael Rozaieski, Matthias Holenweg
Adaptive Human Computer Interface (AAHCI)

Publication number: 20120310652

Abstract: An Adaptive Human-Computer Interface (AAHCI) allows an electronic system to automatically monitor and learn from normal in-use behavior exhibited by a human user via responses generated by the supported input devices and to adjust output to the supported output devices accordingly. This Auto-Learning process is different than computer-directed training sessions and takes place as the user begins to use the device for the first time and with repeated use over time. The purpose of AAHCI is to provide a user experience that is tailored to the skills, preferences, deficiencies and other personal attributes of the user automatically via machine-learned processes. This in turn provides an improved user experience that is more productive and cost efficient and that can automatically optimize itself over time with repeated use.

Type: Application

Filed: June 1, 2009

Publication date: December 6, 2012

Inventor: Daniel O'Sullivan
ELECTRONIC APPARATUS AND METHOD FOR CONTROLLING THE ELECTRONIC APPARATUS USING VOICE

Publication number: 20120303373

Abstract: An electronic apparatus includes a microphone, a processor, a motherboard, and a voice recognition microchip. The voice recognition microchip compares a voice command with a pre-stored voice command. If the voice command is identical with the pre-stored voice command, the processor outputs a control signal to the motherboard. The motherboard controls the electronic apparatus to perform an action corresponding to the control signal.

Type: Application

Filed: September 22, 2011

Publication date: November 29, 2012

Applicants: HON HAI PRECISION INDUSTRY CO., LTD., HONG FU JIN PRECISION INDUSTRY (ShenZhen) CO., LTD.

Inventors: CHUN-SHENG CHEN, HUA ZOU, FENG-LONG HE
AUDIO DECODING USING VARIABLE-LENGTH CODEBOOK APPLICATION RANGES

Publication number: 20120303375

Abstract: Provided are, among other things, systems, methods and techniques for decoding an audio signal from a frame-based bit stream. At least one frame includes processing information pertaining to the frame and entropy-encoded quantization indexes representing audio data within the frame. The processing information includes: (i) code book indexes, and (ii) code book application information specifying ranges of entropy-encoded quantization indexes to which the code books are to be applied. The entropy-encoded quantization indexes are decoded by applying the identified code books to the corresponding ranges of entropy-encoded quantization indexes.

Type: Application

Filed: August 7, 2012

Publication date: November 29, 2012

Applicant: Digital Rise Technology Co., Ltd.

Inventor: Yuli You
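
To make the idea of code book application ranges in publication 20120303375 concrete, the following sketch decodes a bit string in which each range of quantization indexes names the code book to use. It is a simplified illustration, assuming prefix-free code books stored as dictionaries and ranges given as (code book index, count) pairs. The code book contents and the function name `decode_frame` are hypothetical and do not reflect the actual bit stream format.

```python
def decode_frame(bits, codebooks, segments):
    """Decode quantization indexes from a string of '0'/'1' characters.

    codebooks: list of dicts mapping a prefix-free codeword to a quantization index.
    segments:  list of (codebook_index, count) pairs; each pair says which
               code book applies to the next `count` quantization indexes.
    """
    decoded, pos = [], 0
    for book_index, count in segments:
        book = codebooks[book_index]
        for _ in range(count):
            word = ""
            # Extend the candidate codeword bit by bit until it matches an entry.
            while word not in book:
                if pos >= len(bits):
                    raise ValueError("bit stream ended inside a codeword")
                word += bits[pos]
                pos += 1
            decoded.append(book[word])
    return decoded


codebooks = [
    {"0": 0, "10": 1, "11": -1},             # hypothetical book for small values
    {"00": 0, "01": 4, "10": 8, "11": -8},   # hypothetical book for large values
]
segments = [(0, 3), (1, 2)]  # first three indexes use book 0, next two use book 1
print(decode_frame("010110111", codebooks, segments))  # [0, 1, -1, 4, -8]
```
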
SYSTEM AND METHOD FOR VOICE CONTROL OF MEDICAL DEVICES

Publication number: 20120296181

Abstract: A medical device includes an insertable portion capable of being inserted into an orifice associated with a body of a patient. The insertable portion comprises an automated head unit capable of being manipulated in at least two axes of motion based at least in part on one or more control signals. The medical device further includes one or more controllers coupled to the automated head unit. In one particular embodiment, the one or more controllers generate the one or more control signals based at least in part on an input signal.

Type: Application

Filed: June 25, 2012

Publication date: November 22, 2012

Applicant: CHEETAH OMNI, LLC

Inventor: Mohammed N. Islam
ADAPTIVE CONTROLLER FOR A CONFIGURABLE AUDIO CODING SYSTEM

Publication number: 20120296656

Abstract: An adaptive controller for a configurable audio coding system comprising a fuzzy logic controller modified to use reinforcement learning to create an intelligent control system. With no knowledge of the external system into which it is placed, the audio coding system, under the control of the adaptive controller, is capable of adapting its coding configuration to achieve user set performance goals.

Type: Application

Filed: May 19, 2011

Publication date: November 22, 2012

Inventor: Neil Smyth
NEURAL TRANSLATOR

Publication number: 20120290294

Abstract: A method and apparatus are provided for processing a set of communicated signals associated with a set of muscles, such as the muscles near the larynx of the person, or any other muscles the person uses to achieve a desired response. The method includes the steps of attaching a single integrated sensor, for example, near the throat of the person proximate to the larynx and detecting an electrical signal through the sensor. The method further includes the steps of extracting features from the detected electrical signal and continuously transforming them into speech sounds without the need for further modulation. The method also includes comparing the extracted features to a set of prototype features and selecting a prototype feature of the set of prototype features providing a smallest relative difference.

Type: Application

Filed: July 27, 2012

Publication date: November 15, 2012

Inventors: Michael Callahan, Thomas Coleman
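
The final step in the abstract of publication 20120290294, selecting the prototype with the smallest relative difference, resembles a nearest-prototype lookup. The sketch below assumes that features are plain numeric vectors and that "relative difference" means Euclidean distance divided by the prototype's magnitude. Both choices, along with the labels and values, are illustrative assumptions rather than details from the source.

```python
import math


def closest_prototype(features, prototypes):
    """Return the label of the prototype with the smallest relative difference."""
    def relative_difference(proto):
        diff = math.dist(features, proto)        # Euclidean distance
        scale = math.hypot(*proto) or 1.0        # prototype magnitude, guard against 0
        return diff / scale

    return min(prototypes, key=lambda label: relative_difference(prototypes[label]))


# Hypothetical prototype feature vectors keyed by a speech-sound label.
prototypes = {
    "ah": [1.0, 0.2, 0.1],
    "ee": [0.1, 1.0, 0.3],
    "oo": [0.2, 0.1, 1.0],
}
print(closest_prototype([0.9, 0.3, 0.1], prototypes))  # ah
```
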
BIT ALLOCATING, AUDIO ENCODING AND DECODING

Publication number: 20120290307

Abstract: A bit allocating method is provided that includes determining the allocated number of bits in decimal point units based on each frequency band so that a Signal-to-Noise Ratio (SNR) of a spectrum existing in a predetermined frequency band is maximized within a range of the allowable number of bits for a given frame; and adjusting the allocated number of bits based on each frequency band.

Type: Application

Filed: May 14, 2012

Publication date: November 15, 2012

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Mi-young KIM, Anton POROV, Eun-mi OH
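
Fractional ("decimal point unit") bit allocation of the kind mentioned in publication 20120290307 can be illustrated with the textbook rule that gives each band the average budget plus half the base-2 logarithm of its energy relative to the geometric mean. This is a minimal sketch of that classic rule, not the method claimed in the application. Treating each band as a single coefficient, and dropping bands whose share would be negative, are simplifying assumptions.

```python
import math


def allocate_bits(band_energies, total_bits):
    """Fractional per-band bit allocation under a total-bit budget.

    Uses b_k = B/N + 0.5 * log2(energy_k / geometric_mean) over the active
    bands, removing bands whose share would be negative and re-solving.
    """
    active = [k for k, e in enumerate(band_energies) if e > 0]
    bits = [0.0] * len(band_energies)
    while active:
        mean_log = sum(math.log2(band_energies[k]) for k in active) / len(active)
        share = {
            k: total_bits / len(active)
            + 0.5 * (math.log2(band_energies[k]) - mean_log)
            for k in active
        }
        negative = [k for k in active if share[k] < 0]
        if not negative:
            for k in active:
                bits[k] = share[k]
            break
        # Bands that cannot be given a non-negative share receive zero bits.
        active = [k for k in active if k not in negative]
    return bits


bits = allocate_bits([16.0, 4.0, 1.0, 0.25], total_bits=8)
print([round(b, 2) for b in bits])  # [3.5, 2.5, 1.5, 0.5]
```

A later adjustment pass, for example rounding to whole bits while keeping the sum within the budget, would correspond loosely to the "adjusting" step in the abstract and is not shown.
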
Stateful, Double-Buffered Dynamic Navigation Voice Prompting

Publication number: 20120284030

Abstract: A navigation system written in J2ME MIDP for a client device includes a plurality of media players each respectively comprising a buffer. A navigation program manages the state of the plurality of media players. The plurality of media players are in either one of an acquiring resources state, and a playing and de-allocating state. The use of a plurality of media players each respectively comprising a buffer overcomes the prior art in which a navigation system can cut off a voice prompt because of the time-consuming tasks associated with playing a voice prompt.

Type: Application

Filed: July 16, 2012

Publication date: November 8, 2012

Inventor: Eric Wistrand
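
The two-state, multi-player arrangement in publication 20120284030 can be approximated by a pair of players that alternate: while one plays its buffered prompt, the other acquires the next one. The sketch below is a sequential simulation in Python rather than J2ME MIDP, and the class and function names are hypothetical. Real playback would run concurrently, which this illustration leaves out.

```python
from collections import deque

ACQUIRING = "acquiring resources"
PLAYING = "playing and de-allocating"


class PromptPlayer:
    """One media player with its own single-prompt buffer."""

    def __init__(self, name):
        self.name = name
        self.state = ACQUIRING
        self.buffer = None

    def acquire(self, prompt):
        self.state = ACQUIRING
        self.buffer = prompt

    def play(self):
        self.state = PLAYING
        spoken, self.buffer = self.buffer, None  # release the buffer after playback
        return f"{self.name}: {spoken}"


def run_prompts(prompts):
    """Alternate two players so the next prompt is buffered before it is needed."""
    queue = deque(prompts)
    players = [PromptPlayer("player-A"), PromptPlayer("player-B")]
    log = []
    if queue:
        players[0].acquire(queue.popleft())
    current = 0
    while players[current].buffer is not None:
        standby = 1 - current
        if queue:
            # The standby player loads the next prompt ahead of time.
            players[standby].acquire(queue.popleft())
        log.append(players[current].play())
        current = standby
    return log


for line in run_prompts(["Turn left", "In 200 m, turn right", "Arrive"]):
    print(line)
```
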
AUDIO ENCODER, AUDIO DECODER, METHOD FOR ENCODING AN AUDIO INFORMATION, METHOD FOR DECODING AN AUDIO INFORMATION AND COMPUTER PROGRAM USING A REGION-DEPENDENT ARITHMETIC CODING MAPPING RULE

Publication number: 20120278086

Abstract: An audio decoder for providing a decoded audio information includes an arithmetic decoder for providing a plurality of decoded spectral values on the basis of an arithmetically-encoded representation of the spectral values and a frequency-domain-to-time-domain converter for providing a time-domain audio representation using decoded spectral values. The arithmetic decoder is configured to select a mapping rule describing a mapping of a code value onto a symbol code in dependence on a context state. The arithmetic decoder is configured to determine a numeric current context value describing the current context state in dependence on a plurality of previously decoded spectral values and also in dependence on whether a spectral value to be decoded is in a first predetermined frequency region or in a second predetermined frequency region. An audio encoder provides an encoded audio information on the basis of an input audio information.

Type: Application

Filed: April 19, 2012

Publication date: November 1, 2012

Inventors: Guillaume Fuchs, Vignesh Subbaraman, Nikolaus Rittelbach, Markus Multrus, Marc Gayer, Patrick Warmbold, Christian Griebel, Oliver Weiss
METHOD FOR SELECTING ELEMENTS IN TEXTUAL ELECTRONIC LISTS AND FOR OPERATING COMPUTER-IMPLEMENTED PROGRAMS USING NATURAL LANGUAGE COMMANDS

Publication number: 20120278084

Abstract: A method for controlling a program by natural language allows a user to efficiently operate a computer-implemented target program through intuitive natural language commands. A list of natural language commands related to the target program is compiled. Each natural language command is stored as an element in an electronic list. Natural language commands generally consist of short sentences comprising at least a predicate (a verb) and an object (a noun). A user can filter the list of natural language commands by entering the initials of a natural language command. The user enters the first character of the first word to be filtered, followed by the first character of the second word to be filtered, and so forth. Filtering by initials very rapidly reduces the number of choices presented to a user and minimizes the number of keystrokes required to select a particular list element.

Type: Application

Filed: November 9, 2011

Publication date: November 1, 2012

Inventor: Michael Rabben
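
The filtering by initials described in publication 20120278084 is straightforward to sketch: the n-th typed character must match the first letter of the n-th word of a command. The command list and function names below are made up for illustration, and case-insensitive matching is an assumption.

```python
def matches_initials(command, initials):
    """True if each typed character is the first letter of the corresponding word."""
    words = command.lower().split()
    initials = initials.lower()
    if len(initials) > len(words):
        return False
    return all(word.startswith(ch) for word, ch in zip(words, initials))


def filter_commands(commands, initials):
    """Keep only the commands whose leading words start with the typed initials."""
    return [c for c in commands if matches_initials(c, initials)]


commands = ["Open file", "Open folder", "Print document", "Save file", "Save file as"]
print(filter_commands(commands, "sf"))   # ['Save file', 'Save file as']
print(filter_commands(commands, "sfa"))  # ['Save file as']
```

In this toy list, two keystrokes narrow five commands to two and a third keystroke leaves one, which is the effect the abstract attributes to the technique.
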
INTERACTIVE TOYS AND A METHOD OF SYNCHRONIZING OPERATION THEREOF

Publication number: 20120271638

Abstract: Individual audio tracks (20-24) for interactive reproduction at a remote toy (104) are each encoded with a sub-audible tone (12-18) or code that uniquely identifies the track with audible output and/or functional operation of the toy (104). Detection of the sub-audible tone at the interactive toy opens the audio path and permits related motor control in the toy, whereas absence of a relevant sub-audible tone disables at least the audio and, preferably, both the toy's speaker (122) and at least one controllable motor (130, 132). The sub-audible tone (12-18) or code is inserted for the duration of activity only and may come into and out of existence as a specific character track (amongst the plurality of individual audio tracks) moves between active and inactive phases.

Type: Application

Filed: October 5, 2010

Publication date: October 25, 2012

Applicant: RB CONCEPTS Ltd.

Inventor: Jason Regler
ESTABLISHING A MULTIMODAL ADVERTISING PERSONALITY FOR A SPONSOR OF A MULTIMODAL APPLICATION

Publication number: 20120271642

Abstract: Establishing a multimodal advertising personality for a sponsor of a multimodal application, including associating one or more vocal demeanors with a sponsor of a multimodal application and presenting a speech portion of the multimodal application for the sponsor using at least one of the vocal demeanors associated with the sponsor.

Type: Application

Filed: June 28, 2012

Publication date: October 25, 2012

Applicant: Nuance Communications, Inc.

Inventors: Charles W. Cross, JR., Hilary A. Pike
Implicit Association and Polymorphism Driven Human Machine Interaction

Publication number: 20120271640

Abstract: A voice based user-system interaction may take advantage of implicit association and/or polymorphism to achieve smooth and effective discoursing between the user and the voice enabled system. This user-system interaction may occur at a local control unit, at a remote server, or both.

Type: Application

Filed: October 17, 2011

Publication date: October 25, 2012

Inventor: Otman A. Basir
DERIVING GEOGRAPHIC DISTRIBUTION OF PHYSIOLOGICAL OR PSYCHOLOGICAL CONDITIONS OF HUMAN SPEAKERS WHILE PRESERVING PERSONAL PRIVACY

Publication number: 20120271637

Abstract: A method including: obtaining, via a plurality of communication devices, a plurality of speech signals respectively associated with human speakers, the speech signals including verbal components and non-verbal components; identifying a plurality of geographical locations, each geographic location associated with a respective one of the plurality of the communication devices; extracting the non-verbal components from the obtained speech signals; deducing physiological or psychological conditions of the human speakers by analyzing, over a specified period, the extracted non-verbal components, using predefined relations between characteristics of the non-verbal components and physiological or psychological conditions of the human speakers; and providing a geographical distribution of the deduced physiological or psychological conditions of the human speakers by associating the deduced physiological or psychological conditions of the human speakers with geographical locations thereof.

Type: Application

Filed: April 30, 2012

Publication date: October 25, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Slava Shechtman, Raphael Steinberg
INFERRING SWITCHING CONDITIONS FOR SWITCHING BETWEEN MODALITIES IN A SPEECH APPLICATION ENVIRONMENT EXTENDED FOR INTERACTIVE TEXT EXCHANGES

Publication number: 20120271643

Abstract: The disclosed solution includes a method for dynamically switching modalities based upon inferred conditions in a dialogue session involving a speech application. The method establishes a dialogue session between a user and the speech application. During the dialogue session, the user interacts using an original modality and a second modality. The speech application interacts using a speech modality only. A set of conditions indicative of interaction problems using the original modality can be inferred. Responsive to the inferring step, the original modality can be changed to the second modality. A modality transition to the second modality can be transparent to the speech application and can occur without interrupting the dialogue session. The original modality and the second modality can be different modalities; one including a text exchange modality and another including a speech modality.

Type: Application

Filed: July 6, 2012

Publication date: October 25, 2012

Applicant: Nuance Communications, Inc.

Inventors: William V. DA PALMA, Baiju D. MANDALIA, Victor S. MOORE, Wendi L. NUSBICKEL
SYSTEMS AND METHODS FOR RECONSTRUCTION OF A SMOOTH SPEECH SIGNAL FROM A STUTTERED SPEECH SIGNAL

Publication number: 20120265537

Abstract: Described herein are methods, systems, apparatuses and products for reconstruction of a smooth speech signal from a stuttered speech signal. One aspect provides for accessing a stored speech signal having stuttering; identifying at least one stuttered region in the stored speech signal; modifying the at least one stuttered region in the stored speech signal; and responsive to modifying the at least one stuttered region, reconstructing a smooth speech signal corresponding to the stored speech signal. Other embodiments are disclosed.

Type: Application

Filed: April 18, 2011

Publication date: October 18, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Om Dadaji Deshmukh, Suraj Satishkumar Sheth, Ashish Verma
Method And Apparatus Of Visual Feedback For Latency In Communication Media

Publication number: 20120265524

Abstract: A method and apparatus are provided for visualizing the latency in a conversation between a local speaker and at least one remote speaker separated from the local speaker by a communication medium. A latency estimate is obtained. A timing indication of at least the end of a conversational turn by the local speaker is obtained, and an outbound graphic is displayed, indicating the progress of at least the end-of-turn across the communication medium toward the remote speaker. The outbound graphical indication is displayed with a transit time across the medium that is derived from the latency estimate. An inbound graphic is displayed, indicating the progress across the communication medium toward the local speaker, of a start of a conversational turn by the remote speaker, which is imputed to begin when the remote speaker receives the local speaker's end-of-turn. The inbound graphical indication is displayed with a transit time across the medium that is derived from the latency estimate.

Type: Application

Filed: April 12, 2011

Publication date: October 18, 2012

Applicant: Alcatel-Lucent USA Inc.

Inventor: James W. McGowan
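
The timing of the outbound and inbound graphics in publication 20120265524 can be expressed as two progress values derived from the latency estimate. The sketch below assumes a one-way latency in seconds, progress normalized to the range 0 to 1, and a remote turn that starts at the moment the end-of-turn arrives, as the abstract imputes. The function name and the normalization are illustrative choices.

```python
def graphic_positions(now, turn_end, latency):
    """Return (outbound, inbound) progress across the medium, each in 0.0 to 1.0.

    now:      current time in seconds.
    turn_end: time at which the local speaker's turn ended.
    latency:  estimated one-way transit time in seconds.
    """
    if latency <= 0:
        raise ValueError("latency estimate must be positive")

    def clamp(x):
        return max(0.0, min(1.0, x))

    # The end-of-turn marker leaves at turn_end and arrives at turn_end + latency.
    outbound = clamp((now - turn_end) / latency)
    # The remote reply is imputed to start on arrival and takes another latency to return.
    inbound = clamp((now - turn_end - latency) / latency)
    return outbound, inbound


print(graphic_positions(now=10.5, turn_end=10.0, latency=0.4))
```
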
FRAME ERASURE CONCEALMENT FOR A MULTI RATE SPEECH AND AUDIO CODEC

Publication number: 20120265523

Abstract: An audio coding terminal and method is provided. The terminal includes a coding mode setting unit to set an operation mode, from plural operation modes, for input audio coding by a codec, configured to code the input audio based on the set operation mode such that when the set operation mode is a high frame erasure rate (FER) mode the codec codes a current frame of the input audio according to a select frame erasure concealment (FEC) mode of one or more FEC modes. Upon the setting of the operation mode to be the High FER mode, the one FEC mode is selected, from the one or more FEC modes predetermined for the High FER mode, to control the codec by incorporating of redundancy within a coding of the input audio or as separate redundancy information separate from the coded input audio according to the selected one FEC mode.

Type: Application

Filed: April 10, 2012

Publication date: October 18, 2012

Applicant: Samsung Electronics Co., LTD.

Inventors: Steven Craig GREER, Hosang Sung
GESTURE-ACTIVATED INPUT USING AUDIO RECOGNITION

Publication number: 20120260177

Abstract: In one example, a method includes displaying, at a presence-sensitive screen of a computing device, an input field in a region of a graphical user interface (GUI). The method further includes receiving, at the presence-sensitive screen, user input including one or more gestures to select the input field, wherein the one or more gestures to select the input field include motion at a location of the presence-sensitive screen that corresponds to the region of the GUI displaying the input field. The method also includes, while the input field is selected, detecting, by the computing device, an audio signal and identifying, by the computing device, at least one input value based on the detected audio signal. The method also includes assigning, by the computing device, the at least one input value to the input field in the GUI.

Type: Application

Filed: September 30, 2011

Publication date: October 11, 2012

Applicant: Google Inc.

Inventor: Trevor Sehrer
METHODS AND SYSTEM OF VOICE CONTROL

Publication number: 20120253824

Abstract: This invention relates to a system with different modes of operation or performance that integrates all the key components for the control of most domestic services, such as telephone, lighting and audio/video system, through audio inputs such as words or phrases by a user. The system includes a master unit that coordinates the total operation and communication with other technologies and/or with peripheral units. The system integrates a general output unit for controlling turning on and off of lights, motors, etc., an infrared unit for controlling audio and video system, a DAA unit for interaction with the Public Switched Telephone Network, a speaker phone unit, a serial communication port, a microphone, a speaker, among other accessories required for interaction with the user. The present invention also provides two methods which describe the operation of the system disclosed in this document, to increase functionality and versatility of this system compared to the prior art.

Type: Application

Filed: September 29, 2010

Publication date: October 4, 2012

Inventor: Magno Alcantara Talavera
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM

Publication number: 20120253818

Abstract: There is provided an information processing apparatus including an operation information transmitting unit transmitting operation information for operating respective appliances out of a plurality of appliances connected via a network, a character processing unit carrying out processing relating to characters, which correspond to the respective appliances and have individual personalities, and changes a content represented by the characters in accordance with the operation information for operating the appliances, and a display processing unit carrying out processing that displays the characters on a display unit.

Type: Application

Filed: February 23, 2012

Publication date: October 4, 2012

Applicant: Sony Corporation

Inventor: Shigeru OWADA
ELECTRONIC DEVICE AND EXTERNAL EQUIPMENT WITH DIGITAL NOISE CANCELLATION AND DIGITAL AUDIO PATH

Publication number: 20120250913

Abstract: Electronic devices and accessories are provided that may communicate over wired communications paths. The electronic devices may be portable electronic devices such as cellular telephones or media players and may have audio connectors such as 3.5 mm audio jacks. The accessories may be headsets or other equipment having mating 3.5 mm audio plugs and speakers for playing audio. Microphones may be included in an accessory to gather voice signals and noise cancellation signals. Analog-to-digital converter circuitry in the accessory may digitize the microphone signals. Digital voice signals and voice noise cancellation signals can be transmitted over the communications path and processed by audio digital signal processor circuitry in an electronic device. Digital-to-analog converter circuitry in the accessory may convert digital audio signals to analog speaker signals.

Type: Application

Filed: June 12, 2012

Publication date: October 4, 2012

Inventors: Wendell B. Sander, Jeffrey J. Terlizzi, Brian Sander, David Tupman, Barry Corlett
ADVANCED ENCODING OF MUSIC FILES

Publication number: 20120253827

Abstract: Example embodiments allow for the creation, distribution, and use of flexible media formats. Example embodiments may allow individual content files to be rendered in multiple formats and versions. In addition, example embodiments may provide for granular rights management, which may allow users to access content files on a feature-by-feature basis.

Type: Application

Filed: June 13, 2012

Publication date: October 4, 2012

Applicant: Universal Music Group, Inc.

Inventors: Howard Soroka, Christopher Horton
Mixed-mode interaction

Publication number: 20120253820

Abstract: A user of a wireless device, such as a mobile phone, can make purchases or obtain information via a network, such as the Internet, using both voice and non-verbal methods. Users can submit voice queries and receive non-verbal replies, submit non-verbal queries and receive voice replies, or perform similar operations that marry the voice and data capabilities of modern mobile communication devices. The user may provide notification criteria indicating under what conditions a notification should be sent to the user's wireless device. When purchasing opportunities matching the selected notification criteria become available, the user is notified. The user can respond to the notification, and immediately take advantage of the purchasing opportunity if he so desires. Mixed-mode interactions can also be used by sellers to more advantageously control the marketing of distressed, time sensitive, or other merchandise/services.

Type: Application

Filed: November 6, 2011

Publication date: October 4, 2012

Applicant: AERITAS, LLC (F/K/A PROPEL TECHNOLOGY TEAM, LLC)

Inventors: Malik Mamdani, Patrick Johnson, Kevin Bomar
RELEVANCY RECOGNITION FOR CONTEXTUAL QUESTION ANSWERING

Publication number: 20120253825

Abstract: Disclosed are systems, methods and computer-readable media for controlling a computing device to provide contextual responses to user inputs. The method comprises receiving a user input, generating a set of features characterizing an association between the user input and a conversation context based on at least a semantic and syntactic analysis of user inputs and system responses, determining with a data-driven machine learning approach whether the user input begins a new topic or is associated with a previous conversation context and if the received question is associated with the existing topic, then generating a response to the user input using information associated with the user input and any previous user input associated with the existing topic.

Type: Application

Filed: June 15, 2012

Publication date: October 4, 2012

Applicant: AT&T Intellectual Property II, L.P.

Inventors: Giuseppe Di Fabbrizio, Junlan Feng
MULTIMODAL REMOTE CONTROL

Publication number: 20120239396

Abstract: A method and system for operating a remotely controlled device may use multimodal remote control commands that include a gesture command and a speech command. The gesture command may be interpreted from a gesture performed by a user, while the speech command may be interpreted from speech utterances made by the user. The gesture and speech utterances may be simultaneously received by the remotely controlled device in response to displaying a user interface configured to receive multimodal commands.

Type: Application

Filed: March 15, 2011

Publication date: September 20, 2012

Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.

Inventors: Michael James Johnston, Marcelo Worsley
METHODS AND SYSTEMS FOR BIT ALLOCATION AND PARTITIONING IN GAIN-SHAPE VECTOR QUANTIZATION FOR AUDIO CODING

Publication number: 20120232913

Abstract: Embodiments are generally directed to systems and methods for bit allocation and band partitioning for gain-shape vector quantization in an audio codec. An audio codec implements a method that uses an implicit, dynamic scheme to allow an encoder and decoder to recreate a series of bit allocation decisions for gain and shape without transmitting additional side information for each decision, based on the number of bits that are left remaining and available in a given packet. For implementation in practical codecs, the band comprising the allocation of bits for the shape is recursively split into equal partitions until the number of bits allocated to each partition is less than the maximum codebook size.

Type: Application

Filed: March 7, 2012

Publication date: September 13, 2012

Inventors: Timothy B. Terriberry, Jean-Marc Valin
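
The recursive band splitting mentioned in publication 20120232913 can be sketched as a function that halves a band until each partition's bit count falls below the maximum codebook size. The even division of bits between the two halves is an assumption made here for simplicity, since the abstract does not say how bits are shared. The name `split_band` and the numbers are illustrative.

```python
def split_band(start, width, bits, max_codebook_bits):
    """Return a list of (start, width, bits) partitions.

    A band is split into two equal halves until the bits assigned to a
    partition are fewer than max_codebook_bits (or it cannot be split further).
    """
    if bits < max_codebook_bits or width < 2:
        return [(start, width, bits)]
    half = width // 2
    left_bits = bits // 2  # assumption: bits are shared evenly between halves
    return (
        split_band(start, half, left_bits, max_codebook_bits)
        + split_band(start + half, width - half, bits - left_bits, max_codebook_bits)
    )


print(split_band(0, 16, 100, 32))
# [(0, 4, 25), (4, 4, 25), (8, 4, 25), (12, 4, 25)]
```

Because the encoder and decoder can run the same deterministic rule on the same remaining bit count, no side information about the split needs to be sent, which matches the implicit scheme the abstract describes.
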
KITCHEN AND/OR DOMESTIC APPLIANCE

Publication number: 20120232903

Abstract: The invention relates to a kitchen and/or domestic appliance comprising input means, which are connected to a voice-recognition system, for acoustic operator commands. The invention is characterised in that means for executing command-dependent actions are provided and that the voice-recognition system is used to identify and check the authorisation of a user.

Type: Application

Filed: March 15, 2012

Publication date: September 13, 2012

Applicant: ELECTROLUX PROFESSIONAL SPA

Inventors: Claudio Cenedese, Dragan Raus, Omero Tuzzi, Maurizio Ugel, Ennio Pippia
METHOD AND APPARATUS FOR PERFORMING PACKET LOSS OR FRAME ERASURE CONCEALMENT

Publication number: 20120232889

Abstract: A method for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder receives encoded frames of compressed speech information transmitted from an encoder. The method determines whether an encoded frame has been lost, corrupted in transmission, or erased, synthesizes properly received frames, and decides on an overlap-add window to use in combining a portion of the synthesized speech signal with a subsequent speech signal resulting from a received and decoded packet, where the size of the overlap-add window is based on the unavailability of packets. If it is determined that an encoded frame has been lost, corrupted in transmission, or erased, the method performs an overlap-add operation on the portion of the synthesized speech signal and the subsequent speech signal, using the decided-on overlap-add window.

Type: Application

Filed: May 21, 2012

Publication date: September 13, 2012

Applicant: AT&T Corp.

Inventor: David A. Kapilow
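
The overlap-add step in the abstract of publication 20120232889 can be sketched as a crossfade whose length depends on how many packets were unavailable. The specific rule below (a window that grows by a fixed step per additional lost frame, up to a cap), the linear fade, and all names and default values are assumptions for illustration; the abstract states only that the window size is based on the unavailability of packets.

```python
def ola_window_length(lost_frames: int, base_len: int = 40,
                      step: int = 20, max_len: int = 80) -> int:
    """Pick a window length in samples (hypothetical rule: longer after longer losses)."""
    return min(base_len + step * max(lost_frames - 1, 0), max_len)


def overlap_add(concealed_tail: list[float], received_head: list[float]) -> list[float]:
    """Crossfade from the synthesized (concealment) signal into the decoded signal.

    Both inputs must have the window length. A linear ramp fades the
    concealment signal out while the newly decoded signal fades in.
    """
    n = len(concealed_tail)
    if n != len(received_head):
        raise ValueError("both segments must have the window length")
    out = []
    for i in range(n):
        w_in = (i + 1) / (n + 1)  # weight of the newly received signal
        out.append((1.0 - w_in) * concealed_tail[i] + w_in * received_head[i])
    return out


if __name__ == "__main__":
    n = ola_window_length(lost_frames=2)
    blended = overlap_add([1.0] * n, [0.0] * n)
    print(n, blended[0], blended[-1])
```
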
TELEVISION APPARATUS AND A REMOTE OPERATION APPARATUS

Publication number: 20120226502

Abstract: According to one embodiment, a television apparatus includes a speech input unit, an indication input unit, a speech recognition unit, and a control unit. The speech input unit is configured to input a speech. The indication input unit is configured to input an indication to start speech recognition from a user. The speech recognition unit is configured to recognize the user's speech inputted after the indication is inputted. The control unit is configured to execute an operation command corresponding to a recognition result of the user's speech. The control unit, if a volume of the television apparatus at a timing when the indication is inputted is larger than or equal to a threshold, temporarily sets the volume to a value smaller than the threshold while the speech recognition unit is recognizing.

Type: Application

Filed: September 19, 2011

Publication date: September 6, 2012

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Kazushige Ouchi, Akinori Kawamura, Masaru Sakai, Kaoru Suzuki, Yusuke Kida
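
The volume handling in the abstract of publication 20120226502 can be sketched as a context manager that lowers the volume only when it is at or above a threshold and restores it once recognition ends. The `Television` class, the `ducked_volume` helper, and the default of one step below the threshold are hypothetical; the abstract specifies only that the temporary value is smaller than the threshold.

```python
from contextlib import contextmanager


class Television:
    """Hypothetical stand-in for the television apparatus; holds only a volume."""

    def __init__(self, volume: int):
        self.volume = volume


@contextmanager
def ducked_volume(tv: Television, threshold: int, ducked_level=None):
    """Temporarily lower tv.volume while speech recognition is running."""
    if ducked_level is None:
        ducked_level = threshold - 1  # assumed: one step below the threshold
    if ducked_level >= threshold:
        raise ValueError("ducked_level must be smaller than threshold")
    original = tv.volume
    ducked = original >= threshold  # only duck when the TV is loud enough
    if ducked:
        tv.volume = ducked_level
    try:
        yield tv
    finally:
        if ducked:
            tv.volume = original  # restore once recognition has finished


if __name__ == "__main__":
    tv = Television(volume=30)
    with ducked_volume(tv, threshold=20):
        print("during recognition:", tv.volume)
    print("after recognition:", tv.volume)
```
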
INFORMATION PROCESSING APPARATUS AND METHOD

Publication number: 20120226503

Abstract: An information processing apparatus comprises an information output unit configured to switch among a plurality of languages at each given time interval while outputting guidance information set in the plurality of languages; a response detection unit configured to detect a response to the guidance information when the guidance information is output while the languages are switched; and a processing language determination unit configured to take the language in which the response to the guidance information is detected as a processing language.

Type: Application

Filed: February 16, 2012

Publication date: September 6, 2012

Applicant: TOSHIBA TEC KABUSHIKI KAISHA

Inventors: Masahito Sano, Kiyomitu Yamaguchi, Koji Kurosawa
SIGNAL PROCESSING METHOD, DEVICE, AND SYSTEM

Publication number: 20120215541

Abstract: A signal identifying method includes obtaining signal characteristics of a current frame of input signals; deciding, according to the signal characteristics of the current frame and updated signal characteristics of a background signal frame before the current frame, whether the current frame is a background signal frame; detecting whether the current frame serving as a background signal frame is in a first type signal state; and adjusting a signal classification decision threshold according to whether the current frame serving as a background signal frame is in the first type signal state to enhance the speech signal identification capability.

Type: Application

Filed: April 27, 2012

Publication date: August 23, 2012

Applicant: Huawei Technologies Co., Ltd.

Inventors: Yuanyuan Liu, Zhe Wang, Eyal Shlomot
COMPUTERIZED INFORMATION PRESENTATION APPARATUS

Publication number: 20120215544

Abstract: A computerized information apparatus useful for providing directions and other information to a user. In one embodiment, the apparatus comprises a processor, a network interface, and a computer readable medium having at least one computer program disposed thereon, the at least one program being configured to receive a speech input from the user regarding an organization or entity, and provide a graphic or visual representation of the organization or entity to aid the user in finding the organization or entity. At least a portion of the information is obtained via the network interface from a remote server.

Type: Application

Filed: February 24, 2012

Publication date: August 23, 2012

Inventor: Robert F. Gazdzinski
METHOD AND SYSTEM FOR SPEECH ENHANCEMENT IN A ROOM

Publication number: 20120215530

Abstract: A method of speech enhancement in a room (10), having the steps of: determining acoustic parameters of the room and a loudspeaker arrangement (24) located in the room, capturing audio signals from a speaker's voice with a microphone (12), and processing the captured audio signals with an audio signal processing unit (20). The audio signals are filtered by applying a selected frequency response curve to the audio signals, generating sound according to the processed audio signals by the loudspeaker arrangement, determining a value indicative of the overall gain applied to the captured audio signals, and selecting a frequency response curve to be applied to the captured audio signals according to the overall gain value and the acoustic parameters.

Type: Application

Filed: October 27, 2009

Publication date: August 23, 2012

Applicant: PHONAK AG

Inventor: Samuel Harsch
MOBILE COMMUNICATION TERMINAL APPARATUS AND METHOD FOR EXECUTING APPLICATION THROUGH VOICE RECOGNITION

Publication number: 20120209608

Abstract: A mobile communication terminal apparatus and method are capable of recognizing an input voice of a user and executing an application related to the recognized voice. The apparatus includes a voice input unit to receive a first input voice; a voice recognition unit to acquire first voice instruction information based on the first input voice; a voice control table acquiring unit to acquire a first voice control table comprising the first voice instruction information and first icon position information; and an application execution unit to execute a first application based on the first icon position information included in the first voice control table. The method for registering voice instruction information includes acquiring voice instruction information for a selected application; acquiring execution information of the selected application; generating a voice control table comprising the execution information, and the voice instruction information; and storing the voice control table.

Type: Application

Filed: September 29, 2011

Publication date: August 16, 2012

Applicant: PANTECH CO., LTD.

Inventor: Chang-Dae LEE
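
The voice control table in the abstract of publication 20120209608 can be pictured as a lookup from recognized voice instruction information to icon position information. The sketch below is a minimal illustration under stated assumptions: the class names, the case-insensitive matching, and the use of a (row, column) pair for icon position are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableEntry:
    instruction: str                # normalized voice instruction text
    icon_position: tuple[int, int]  # assumed (row, column) of the application icon


class VoiceControlTable:
    """Maps voice instruction information to icon position information."""

    def __init__(self):
        self._entries = {}

    def register(self, instruction: str, icon_position: tuple[int, int]) -> None:
        """Store an instruction together with the icon position it should trigger."""
        key = instruction.strip().lower()
        self._entries[key] = TableEntry(key, icon_position)

    def lookup(self, recognized_text: str):
        """Return the entry for the recognized text, or None if not registered."""
        return self._entries.get(recognized_text.strip().lower())


if __name__ == "__main__":
    table = VoiceControlTable()
    table.register("Camera", (2, 3))
    print(table.lookup("camera"))
```
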
COMPUTERIZED INFORMATION PRESENTATION METHODS

Publication number: 20120209700

Abstract: Methods for providing information useful to a user of a remote computerized apparatus. In one embodiment, the method includes receiving via a network link a digitized speech input relating to an organization or entity which a user wishes to locate; based at least in part on the input, identifying a location associated with the organization or entity; and selecting and causing provision of a graphical or visual representation of the location via the network link, the graphical or visual representation of the location also comprising a graphical or visual representation of the surroundings of the organization or entity.

Type: Application

Filed: February 24, 2012

Publication date: August 16, 2012

Inventor: Robert F. Gazdzinski
ACTIVATING FUNCTIONS IN PROCESSING DEVICES USING START CODES EMBEDDED IN AUDIO

Publication number: 20120203559

Abstract: Apparatus, system and method for performing an action such as accessing supplementary data and/or executing software on a device capable of receiving multimedia are disclosed. After multimedia is received, a monitoring code is detected and a signature is extracted in response thereto from an audio portion of the multimedia. The ancillary code includes a plurality of code symbols arranged in a plurality of layers in a predetermined time period, and the signature is extracted from features of the audio of the multimedia. Supplementary data is accessed and/or software is executed using the detected code and/or signature.

Type: Application

Filed: December 30, 2011

Publication date: August 9, 2012

Applicant: ARBITRON, INC.

Inventors: William John McKenna, John Stavropoulos, Alan Neuhauser, Jason Bolles, John Kelly, Wendell Lynch
COMPREHENSIVE MULTIPLE FEATURE TELEMATICS SYSTEM

Publication number: 20120203557

Abstract: A comprehensive system and method for telematics including the following features individually or in sub-combinations: vehicle user interfaces, telecommunications, speech recognition, digital commerce and vehicle parking, digital signal processing, wireless transmission of digitized voice input, navigational assistance for motorists, data communication to vehicles, mobile client-server communication, extending coverage and bandwidth of wireless communication services, and noise reduction.

Type: Application

Filed: April 10, 2012

Publication date: August 9, 2012

Inventor: Gilad Odinak
Alarm method and system based on voice events, and building method on behavior trajectory thereof

Patent number: 8237571

Abstract: Disclosed are an alarm method and system based on voice events, and a building method on behavior trajectory thereof. The system comprises a signal sensor, a voice-event detector, and a notice and alarm element. In the method, voice signals are captured from a remote unit in an environment. The captured voice signals are classified into at least one voice event. An emergent-event notice is then automatically transmitted if one of the predefined emergent events is detected. In the building method on behavior trajectory, messages on voice events are continuously recorded. When the number of the recorded voice events reaches a threshold, a behavior trajectory is constructed, in which a behavior consists of two or more voice events or a single voice event.

Type: Grant

Filed: February 6, 2009

Date of Patent: August 7, 2012

Assignee: Industrial Technology Research Institute

Inventors: Yuh-Ching Wang, Yu-Hsien Chiu, Gwo Lang Yan
Method and Apparatus for Voice Controlled Operation of a Media Player

Publication number: 20120191461

Abstract: A system and methods for voice controlled operation of a media player are provided. In one embodiment, a method includes detecting user positioning of a microphone power switch to an off position, detecting user positioning of the microphone power switch to an on position within a predetermined period of time and entering a voice recognition mode, by the media player, based on the user positioning of the microphone power switch to the on position within the predetermined period of time. The method may further include detecting one or more output signals of the microphone, detecting a voice command based on the one or more output signals of the microphone, and controlling operation of the media player based on the voice command, wherein the media player outputs a graphical display associated with the voice command.

Type: Application

Filed: January 6, 2010

Publication date: July 26, 2012

Applicant: Zoran Corporation

Inventors: Gaile Lin, Yulong Chen, Hong Guan, Jing Wei Wang
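
The switch-based trigger in the abstract of publication 20120191461 amounts to a small state machine: an "off" followed by an "on" within a predetermined period enters voice recognition mode. The sketch below assumes timestamps in seconds supplied by the caller; the class name, method name, and the two-second default are hypothetical.

```python
class VoiceModeDetector:
    """Enters voice recognition mode on an off-then-on toggle within window_s seconds."""

    def __init__(self, window_s: float = 2.0):
        self.window_s = window_s
        self._off_at = None      # time of the most recent "off" position, if any
        self.voice_mode = False

    def on_switch(self, position: str, timestamp: float) -> bool:
        """Feed one switch event; return whether voice recognition mode is active."""
        if position == "off":
            self._off_at = timestamp
        elif position == "on":
            if self._off_at is not None and timestamp - self._off_at <= self.window_s:
                self.voice_mode = True
            self._off_at = None  # each "off" can pair with at most one "on"
        else:
            raise ValueError("position must be 'on' or 'off'")
        return self.voice_mode


if __name__ == "__main__":
    detector = VoiceModeDetector(window_s=2.0)
    detector.on_switch("off", 10.0)
    print(detector.on_switch("on", 11.0))
```
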
IMPROVED CODING/DECODING OF DIGITAL AUDIO SIGNALS

Publication number: 20120185255

Abstract: A method of hierarchical coding of a digital audio frequency input signal into several frequency sub-bands, including a core coding of the input signal according to a first throughput and at least one enhancement coding of higher throughput, of a residual signal. The core coding uses a binary allocation according to an energy criterion. The method includes for the enhancement coding: calculating a frequency-based masking threshold for at least part of the frequency bands processed by the enhancement coding; determining a perceptual importance per frequency sub-band as a function of the masking threshold and as a function of the number of bits allocated for the core coding; binary allocation of bits in the frequency sub-bands processed by the enhancement coding, as a function of the perceptual importance determined; and coding the residual signal according to the bit allocation. Also provided are a decoding method, a coder and a decoder.

Type: Application

Filed: June 25, 2010

Publication date: July 19, 2012

Applicant: FRANCE TELECOM

Inventors: David Virette, Stéphane Ragot, Balazs Kovesi, Pierre Berthet
SPEECH FEATURE EXTRACTION APPARATUS, SPEECH FEATURE EXTRACTION METHOD, AND SPEECH FEATURE EXTRACTION PROGRAM

Publication number: 20120185243

Abstract: A speech feature extraction apparatus, speech feature extraction method, and speech feature extraction program. A speech feature extraction apparatus includes: a first difference calculation module to: (i) receive, as an input, a spectrum of a speech signal segmented into frames for each frequency bin; and (ii) calculate a delta spectrum for each of the frames, where the delta spectrum is a difference of the spectrum within continuous frames for the frequency bin; and a first normalization module to normalize the delta spectrum of the frame for the frequency bin by dividing the delta spectrum by a function of an average spectrum; where the average spectrum is an average of spectra through all frames that are overall speech for the frequency bin; and where an output of the first normalization module is defined as a first delta feature.

Type: Application

Filed: July 10, 2010

Publication date: July 19, 2012

Applicant: International Business Machines Corp.

Inventors: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
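
The delta feature in the abstract of publication 20120185243 can be sketched per frequency bin as a frame-to-frame difference divided by a function of the average spectrum. The sketch below makes several assumptions that the abstract leaves open: the delta is taken between adjacent frames, the normalizing function is the average itself, every input frame counts as speech, and a small `eps` guards against division by zero.

```python
def normalized_delta_spectrum(frames: list[list[float]], eps: float = 1e-12) -> list[list[float]]:
    """Return one normalized delta vector per pair of adjacent frames.

    frames[t][k] is the spectrum of frame t at frequency bin k.
    """
    if len(frames) < 2:
        raise ValueError("at least two frames are needed to form a delta")
    n_frames = len(frames)
    n_bins = len(frames[0])
    # Average spectrum per bin over all frames (all frames assumed to be speech).
    avg = [sum(frame[k] for frame in frames) / n_frames for k in range(n_bins)]
    deltas = []
    for t in range(1, n_frames):
        deltas.append([
            (frames[t][k] - frames[t - 1][k]) / (avg[k] + eps)
            for k in range(n_bins)
        ])
    return deltas


if __name__ == "__main__":
    print(normalized_delta_spectrum([[1.0, 2.0], [3.0, 2.0]]))
```

Dividing by the average makes the feature insensitive to a constant gain applied to a bin, since the gain scales the difference and the average equally.
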
INTERACTIVE FIGURINE IN A COMMUNICATIONS SYSTEM INCORPORATING SELECTIVE CONTENT DELIVERY

Publication number: 20120185254

Abstract: In a system, an interactive figurine delivers messages to a user in one of a number of forms. A server operation system includes processing capability which may individually couple content or may customize messages to a particular user of the interactive figurines. The interactive figurine contains an embedded circuit consisting of a receiver comprising a detector circuit tuned to at least one preselected frequency, a decoder to provide information indicative of intelligence and signals sent to the receiver, and a decoder circuit to provide actionable output signals indicative of information transmitted to the receiver. The server operation system may include a subscriber database and administration routines for customizing of messages and for directing messages. A user station intermediate the interactive figurine and the server module may be used to provide parental control or other control.

Type: Application

Filed: January 18, 2012

Publication date: July 19, 2012

Inventors: William A. Biehler, Gary W. Smith
SYNTHETIC SIMULATION OF A MEDIA RECORDING

Publication number: 20120174737

Abstract: A method and system for generating a synthetic simulation of a media recording is disclosed. One embodiment accesses a sound reference archive and heuristically creates a new sound that is matched against at least one sound in the sound reference archive. The media recording is analyzed and a synthetic sound based on the analyzing of the media recording is generated.

Type: Application

Filed: January 6, 2012

Publication date: July 12, 2012

Inventor: Hank Risan
SPEECH INTERACTIVE APPARATUS AND COMPUTER PROGRAM PRODUCT

Publication number: 20120179473

Abstract: According to an embodiment, a speech interactive apparatus includes an output unit to output a first response; a receiving unit to receive a start instruction of a speech input as a reply to the first response; a response control unit to stop the output of the first response when the start instruction is received while the first response is being output; and a deciding unit to decide on a first determination period, which is used in determining whether a silent state has occurred, based on whether the start instruction is received while the first response is being output or based on the timing of receiving the start instruction. When the input speech is not input during a period starting from the reception of the start instruction till an elapse of the first determination period, the response control unit instructs the output unit to output the first response again.

Type: Application

Filed: March 19, 2012

Publication date: July 12, 2012

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventor: Takehide Yano
