Modification Of At Least One Characteristic Of Speech Waves (epo) Patents (Class 704/E21.001)
  • Publication number: 20120330661
    Abstract: An electronic device may capture a voice command from a user. The electronic device may store contextual information about the state of the electronic device when the voice command is received. The electronic device may transmit the voice command and the contextual information to computing equipment such as a desktop computer or a remote server. The computing equipment may perform a speech recognition operation on the voice command and may process the contextual information. The computing equipment may respond to the voice command. The computing equipment may also transmit information to the electronic device that allows the electronic device to respond to the voice command.
    Type: Application
    Filed: September 5, 2012
    Publication date: December 27, 2012
    Inventor: Aram M. Lindahl
  • Publication number: 20120330651
    Abstract: A voice data transferring device intermediates between an in-vehicle terminal and a voice recognition server. In order to check a change in voice recognition performance of the voice recognition server, the voice data transferring device performs noise suppression processing on voice data for evaluation in a noise suppression module; transmits the voice data for evaluation to the voice recognition server; and receives a recognition result thereof. The voice data transferring device sets a value of a noise suppression parameter used for the noise suppression processing or a value of a result integration parameter used for integrating a plurality of recognition results acquired from the voice recognition server, at an optimum value, based on the recognition result of the voice recognition server. This makes it possible to set a suitable parameter even if the voice recognition performance of the voice recognition server changes.
    Type: Application
    Filed: June 22, 2012
    Publication date: December 27, 2012
    Inventors: Yasunari Obuchi, Takeshi Homma
  • Publication number: 20120323582
    Abstract: Hierarchical audio coding and decoding method and system and hierarchical audio coding and decoding method for transient signals are provided. In the present invention, by introducing a processing method for transient signal frames in the hierarchical audio coding and decoding methods, a segmented time-frequency transform is performed on the transient signal frames, and then the frequency-domain coefficients obtained by transformation are rearranged respectively within the core layer and within the extended layer, so as to perform the same subsequent coding processes, such as bit allocation, frequency-domain coefficient coding, etc., as those on the steady-state signal frames, thus enhancing the coding efficiency of the transient signal frames and improving the quality of the hierarchical audio coding and decoding.
    Type: Application
    Filed: January 12, 2011
    Publication date: December 20, 2012
    Inventors: Ke Peng, Guoming Chen, Hao Yuan, Dongping Jiang, Jiali Li
  • Publication number: 20120323580
    Abstract: Systems and associated methods for editing telecom web applications through a voice interface are described. Systems and methods provide for editing telecom web applications over a connection, as for example accessed via a standard phone, using speech and/or DTMF inputs. The voice based editing includes exposing an editing interface to a user for a telecom web application that is editable, dynamically generating a voice-based interface for a given user for accomplishing editing tasks, and modifying the telecom web application to reflect the editing commands entered by the user.
    Type: Application
    Filed: August 28, 2012
    Publication date: December 20, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sheetal K. Agarwal, Arun Kumar, Priyanka Manwani
  • Publication number: 20120316884
    Abstract: A personal mobility vehicle, such as a wheelchair system, includes an input audio transducer having an output coupled to a speech recognition system and an output audio transducer having an input coupled to a speech synthesis system. The wheelchair system further includes a control unit having a data processor and a memory. The data processor is coupled to the speech recognition system and to the speech synthesis system and is operable in response to a recognized utterance made by a user to present the user with a menu containing wheelchair system functions. The data processor is further configured in response to at least one further recognized utterance made by the user to select from the menu at least one wheelchair system function, to activate the selected function and to provide audible feedback to the user via the speech synthesis system.
    Type: Application
    Filed: June 4, 2012
    Publication date: December 13, 2012
    Inventors: Michael Rozaieski, Matthias Holenweg
  • Publication number: 20120310652
    Abstract: An Adaptive Human-Computer Interface (AAHCI) allows an electronic system to automatically monitor and learn from normal in-use behavior exhibited by a human user via responses generated by the supported input devices and to adjust output to the supported output devices accordingly. This Auto-Learning process is different than computer-directed training sessions and takes place as the user begins to use the device for the first time and with repeated use over time. The purpose of AAHCI is to provide a user experience that is tailored to the skills, preferences, deficiencies and other personal attributes of the user automatically via machine-learned processes. This in turn provides an improved user experience that is more productive and cost efficient and that can automatically optimize itself over time with repeated use.
    Type: Application
    Filed: June 1, 2009
    Publication date: December 6, 2012
    Inventor: Daniel O'Sullivan
  • Publication number: 20120303373
    Abstract: An electronic apparatus includes a microphone, a processor, a motherboard, and a voice recognition microchip. The voice recognition microchip compares a voice command with a pre-stored voice command. If the voice command is identical with the pre-stored voice command, the processor outputs a control signal to the motherboard. The motherboard controls the electronic apparatus to perform an action corresponding to the control signal.
    Type: Application
    Filed: September 22, 2011
    Publication date: November 29, 2012
    Applicants: HON HAI PRECISION INDUSTRY CO., LTD., HONG FU JIN PRECISION INDUSTRY (ShenZhen) CO., LTD.
    Inventors: CHUN-SHENG CHEN, HUA ZOU, FENG-LONG HE
  • Publication number: 20120303375
    Abstract: Provided are, among other things, systems, methods and techniques for decoding an audio signal from a frame-based bit stream. At least one frame includes processing information pertaining to the frame and entropy-encoded quantization indexes representing audio data within the frame. The processing information includes: (i) code book indexes, and (ii) code book application information specifying ranges of entropy-encoded quantization indexes to which the code books are to be applied. The entropy-encoded quantization indexes are decoded by applying the identified code books to the corresponding ranges of entropy-encoded quantization indexes.
    Type: Application
    Filed: August 7, 2012
    Publication date: November 29, 2012
    Applicant: Digital Rise Technology Co., Ltd.
    Inventor: Yuli You
  • Publication number: 20120296181
    Abstract: A medical device includes an insertable portion capable of being inserted into an orifice associated with a body of a patient. The insertable portion comprises an automated head unit capable of being manipulated in at least two axes of motion based at least in part on one or more control signals. The medical device further includes one or more controllers coupled to the automated head unit. In one particular embodiment, the one or more controllers generate the one or more control signals based at least in part on an input signal.
    Type: Application
    Filed: June 25, 2012
    Publication date: November 22, 2012
    Applicant: CHEETAH OMNI, LLC
    Inventor: Mohammed N. Islam
  • Publication number: 20120296656
    Abstract: An adaptive controller for a configurable audio coding system comprising a fuzzy logic controller modified to use reinforcement learning to create an intelligent control system. With no knowledge of the external system into which it is placed the audio coding system, under the control of the adaptive controller, is capable of adapting its coding configuration to achieve user set performance goals.
    Type: Application
    Filed: May 19, 2011
    Publication date: November 22, 2012
    Inventor: Neil Smyth
  • Publication number: 20120290294
    Abstract: A method and apparatus are provided for processing a set of communicated signals associated with a set of muscles, such as the muscles near the larynx of the person, or any other muscles the person uses to achieve a desired response. The method includes the steps of attaching a single integrated sensor, for example, near the throat of the person proximate to the larynx and detecting an electrical signal through the sensor. The method further includes the steps of extracting features from the detected electrical signal and continuously transforming them into speech sounds without the need for further modulation. The method also includes comparing the extracted features to a set of prototype features and selecting a prototype feature of the set of prototype features providing a smallest relative difference.
    Type: Application
    Filed: July 27, 2012
    Publication date: November 15, 2012
    Inventors: Michael Callahan, Thomas Coleman
  • Publication number: 20120290307
    Abstract: A bit allocating method is provided that includes determining the allocated number of bits in decimal point units based on each frequency band so that a Signal-to-Noise Ratio (SNR) of a spectrum existing in a predetermined frequency band is maximized within a range of the allowable number of bits for a given frame; and adjusting the allocated number of bits based on each frequency band.
    Type: Application
    Filed: May 14, 2012
    Publication date: November 15, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Mi-young KIM, Anton POROV, Eun-mi OH
  • Publication number: 20120284030
    Abstract: A navigation system written in J2ME MIDP for a client device includes a plurality of media players each respectively comprising a buffer. A navigation program manages the state of the plurality of media players. The plurality of media players are in either one of an acquiring resources state, and a playing and de-allocating state. The use of a plurality of media players each respectively comprising a buffer overcomes the prior art in which a navigation system can cut off a voice prompt because of the time-consuming tasks associated with playing a voice prompt.
    Type: Application
    Filed: July 16, 2012
    Publication date: November 8, 2012
    Inventor: Eric Wistrand
  • Publication number: 20120278086
    Abstract: An audio decoder for providing a decoded audio information includes an arithmetic decoder for providing a plurality of decoded spectral values on the basis of an arithmetically-encoded representation of the spectral values and a frequency-domain-to-time-domain converter for providing a time-domain audio representation using decoded spectral values. The arithmetic decoder is configured to select a mapping rule describing a mapping of a code value onto a symbol code in dependence on a context state. The arithmetic decoder is configured to determine a numeric current context value describing the current context state in dependence on a plurality of previously decoded spectral values and also in dependence on whether a spectral value to be decoded is in a first predetermined frequency region or in a second predetermined frequency region. An audio encoder provides an encoded audio information on the basis of an input audio information.
    Type: Application
    Filed: April 19, 2012
    Publication date: November 1, 2012
    Inventors: Guillaume Fuchs, Vignesh Subbaraman, Nikolaus Rittelbach, Markus Multrus, Marc Gayer, Patrick Warmbold, Christian Griebel, Oliver Weiss
  • Publication number: 20120278084
    Abstract: A method for controlling a program by natural language allows a user to efficiently operate a computer-implemented target program through intuitive natural language commands. A list of natural language commands related to the target program is compiled. Each natural language command is stored as an element in an electronic list. Natural language commands generally consist of short sentences comprising at least a predicate (a verb) and an object (a noun). A user can filter the list of natural language commands by entering the initials of a natural language command. The user enters the first character of the first word to be filtered, followed by the first character of the second word to be filtered, and so forth. Filtering by initials very rapidly reduces the number of choices presented to a user and minimizes the number of keystrokes required to select a particular list element.
    Type: Application
    Filed: November 9, 2011
    Publication date: November 1, 2012
    Inventor: Michael Rabben
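The initials-based filtering described in the abstract above can be illustrated with a minimal sketch. It assumes commands are stored as plain strings and that each typed character is compared, case-insensitively, against the first letter of the word in the same position. The names `matches_initials`, `filter_commands`, and the sample `COMMANDS` list are hypothetical and do not come from the application.

```python
def matches_initials(command: str, initials: str) -> bool:
    """Return True if each typed character matches the first letter of the
    command word in the same position (hypothetical helper)."""
    words = command.lower().split()
    typed = initials.lower()
    # More typed characters than words can never match.
    if len(typed) > len(words):
        return False
    return all(word.startswith(ch) for word, ch in zip(words, typed))


def filter_commands(commands: list[str], initials: str) -> list[str]:
    """Keep only the commands whose word initials begin with the typed characters."""
    return [c for c in commands if matches_initials(c, initials)]


# Illustrative command list: each entry is a predicate followed by an object.
COMMANDS = ["open file", "open folder", "save file", "print document", "paste text"]

if __name__ == "__main__":
    print(filter_commands(COMMANDS, "p"))
    print(filter_commands(COMMANDS, "pd"))
```

In this sketch, typing `p` keeps both "print document" and "paste text", and adding `d` narrows the list to "print document", which is the keystroke saving the abstract describes.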
  • Publication number: 20120271638
    Abstract: Individual audio tracks (20-24) for interactive reproduction at a remote toy (104) are each encoded with a sub-audible tone (12-18) or code that uniquely identifies the track with audible output and/or functional operation of the toy (104). Detection of the sub-audible tone at the interactive toy opens the audio path and permits related motor control in the toy, whereas absence of a relevant sub-audible tone disables at least the audio and, preferably, both the toy's speaker (122) and at least one controllable motor (130, 132). The sub-audible tone (12-18) or code is inserted for the duration of activity only and may come into and out of existence as a specific character track (amongst the plurality of individual audio tracks) moves between active and inactive phases.
    Type: Application
    Filed: October 5, 2010
    Publication date: October 25, 2012
    Applicant: RB CONCEPTS Ltd.
    Inventor: Jason Regler
  • Publication number: 20120271642
    Abstract: Establishing a multimodal advertising personality for a sponsor of a multimodal application, including associating one or more vocal demeanors with a sponsor of a multimodal application and presenting a speech portion of the multimodal application for the sponsor using at least one of the vocal demeanors associated with the sponsor.
    Type: Application
    Filed: June 28, 2012
    Publication date: October 25, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: Charles W. Cross, JR., Hilary A. Pike
  • Publication number: 20120271640
    Abstract: A voice based user-system interaction may take advantage of implicit association and/or polymorphism to achieve smooth and effective discoursing between the user and the voice enabled system. This user-system interaction may occur at a local control unit, at a remote server, or both.
    Type: Application
    Filed: October 17, 2011
    Publication date: October 25, 2012
    Inventor: Otman A. Basir
  • Publication number: 20120271637
    Abstract: A method including: obtaining, via a plurality of communication devices, a plurality of speech signals respectively associated with human speakers, the speech signals including verbal components and non-verbal components; identifying a plurality of geographical locations, each geographic location associated with a respective one of the plurality of the communication devices; extracting the non-verbal components from the obtained speech signals; deducing physiological or psychological conditions of the human speakers by analyzing, over a specified period, the extracted non-verbal components, using predefined relations between characteristics of the non-verbal components and physiological or psychological conditions of the human speakers; and providing a geographical distribution of the deduced physiological or psychological conditions of the human speakers by associating the deduced physiological or psychological conditions of the human speakers with geographical locations thereof.
    Type: Application
    Filed: April 30, 2012
    Publication date: October 25, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Slava Shechtman, Raphael Steinberg
  • Publication number: 20120271643
    Abstract: The disclosed solution includes a method for dynamically switching modalities based upon inferred conditions in a dialogue session involving a speech application. The method establishes a dialogue session between a user and the speech application. During the dialogue session, the user interacts using an original modality and a second modality. The speech application interacts using a speech modality only. A set of conditions indicative of interaction problems using the original modality can be inferred. Responsive to the inferring step, the original modality can be changed to the second modality. A modality transition to the second modality can be transparent to the speech application and can occur without interrupting the dialogue session. The original modality and the second modality can be different modalities; one including a text exchange modality and another including a speech modality.
    Type: Application
    Filed: July 6, 2012
    Publication date: October 25, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: William V. DA PALMA, Baiju D. MANDALIA, Victor S. MOORE, Wendi L. NUSBICKEL
  • Publication number: 20120265537
    Abstract: Described herein are methods, systems, apparatuses and products for reconstruction of a smooth speech signal from a stuttered speech signal. One aspect provides for accessing a stored speech signal having stuttering; identifying at least one stuttered region in the stored speech signal; modifying the at least one stuttered region in the stored speech signal; and responsive to modifying the at least one stuttered region, reconstructing a smooth speech signal corresponding to the stored speech signal. Other embodiments are disclosed.
    Type: Application
    Filed: April 18, 2011
    Publication date: October 18, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Om Dadaji Deshmukh, Suraj Satishkumar Sheth, Ashish Verma
  • Publication number: 20120265524
    Abstract: A method and apparatus are provided for visualizing the latency in a conversation between a local speaker and at least one remote speaker separated from the local speaker by a communication medium. A latency estimate is obtained. A timing indication of at least the end of a conversational turn by the local speaker is obtained, and an outbound graphic is displayed, indicating the progress of at least the end-of-turn across the communication medium toward the remote speaker. The outbound graphical indication is displayed with a transit time across the medium that is derived from the latency estimate. An inbound graphic is displayed, indicating the progress across the communication medium toward the local speaker, of a start of a conversational turn by the remote speaker, which is imputed to begin when the remote speaker receives the local speaker's end-of-turn. The inbound graphical indication is displayed with a transit time across the medium that is derived from the latency estimate.
    Type: Application
    Filed: April 12, 2011
    Publication date: October 18, 2012
    Applicant: Alcatel-Lucent USA Inc.
    Inventor: James W. McGowan
  • Publication number: 20120265523
    Abstract: An audio coding terminal and method is provided. The terminal includes a coding mode setting unit to set an operation mode, from plural operation modes, for input audio coding by a codec, configured to code the input audio based on the set operation mode such that when the set operation mode is a high frame erasure rate (FER) mode the codec codes a current frame of the input audio according to a select frame erasure concealment (FEC) mode of one or more FEC modes. Upon the setting of the operation mode to be the High FER mode, the one FEC mode is selected, from the one or more FEC modes predetermined for the High FER mode, to control the codec by incorporating redundancy within a coding of the input audio or as separate redundancy information separate from the coded input audio according to the selected one FEC mode.
    Type: Application
    Filed: April 10, 2012
    Publication date: October 18, 2012
    Applicant: Samsung Electronics Co., LTD.
    Inventors: Steven Craig GREER, Hosang Sung
  • Publication number: 20120260177
    Abstract: In one example, a method includes displaying, at a presence-sensitive screen of a computing device, an input field in a region of a graphical user interface (GUI). The method further includes receiving, at the presence-sensitive screen, user input including one or more gestures to select the input field, wherein the one or more gestures to select the input field include motion at a location of the presence-sensitive screen that corresponds to the region of the GUI displaying the input field. The method also includes, while the input field is selected, detecting, by the computing device, an audio signal and identifying, by the computing device, at least one input value based on the detected audio signal. The method also includes assigning, by the computing device, the at least one input value to the input field in the GUI.
    Type: Application
    Filed: September 30, 2011
    Publication date: October 11, 2012
    Applicant: Google Inc.
    Inventor: Trevor Sehrer
  • Publication number: 20120253824
    Abstract: This invention relates to a system with different modes of operation or performance that integrates all the key components for the control of most domestic services, such as telephone, lighting and audio/video system, through audio inputs such as words or phrases by a user. The system includes a master unit that coordinates the total operation and communication with other technologies and/or with peripheral units. The system integrates a general output unit for controlling turning on and off of lights, motors, etc., an infrared unit for controlling audio and video system, a DAA unit for interaction with the Public Switched Telephone Network, a speaker phone unit, a serial communication port, a microphone, a speaker, among other accessories required for interaction with the user. The present invention also provides two methods which describe the operation of the system disclosed in this document, to increase functionality and versatility of this system compared to the prior art.
    Type: Application
    Filed: September 29, 2010
    Publication date: October 4, 2012
    Inventor: Magno Alcantara Talavera
  • Publication number: 20120253818
    Abstract: There is provided an information processing apparatus including an operation information transmitting unit transmitting operation information for operating respective appliances out of a plurality of appliances connected via a network, a character processing unit carrying out processing relating to characters, which correspond to the respective appliances and have individual personalities, and changing content represented by the characters in accordance with the operation information for operating the appliances, and a display processing unit carrying out processing that displays the characters on a display unit.
    Type: Application
    Filed: February 23, 2012
    Publication date: October 4, 2012
    Applicant: Sony Corporation
    Inventor: Shigeru OWADA
  • Publication number: 20120250913
    Abstract: Electronic devices and accessories are provided that may communicate over wired communications paths. The electronic devices may be portable electronic devices such as cellular telephones or media players and may have audio connectors such as 3.5 mm audio jacks. The accessories may be headsets or other equipment having mating 3.5 mm audio plugs and speakers for playing audio. Microphones may be included in an accessory to gather voice signals and noise cancellation signals. Analog-to-digital converter circuitry in the accessory may digitize the microphone signals. Digital voice signals and voice noise cancellation signals can be transmitted over the communications path and processed by audio digital signal processor circuitry in an electronic device. Digital-to-analog converter circuitry in the accessory may convert digital audio signals to analog speaker signals.
    Type: Application
    Filed: June 12, 2012
    Publication date: October 4, 2012
    Inventors: Wendell B. Sander, Jeffrey J. Terlizzi, Brian Sander, David Tupman, Barry Corlett
  • Publication number: 20120253827
    Abstract: Example embodiments allow for the creation, distribution, and use of flexible media formats. Example embodiments may allow individual content files to be rendered in multiple formats and versions. In addition, example embodiments may provide for granular rights management, which may allow users to access content files on a feature-by-feature basis.
    Type: Application
    Filed: June 13, 2012
    Publication date: October 4, 2012
    Applicant: Universal Music Group, Inc.
    Inventors: Howard Soroka, Christopher Horton
  • Publication number: 20120253820
    Abstract: A user of a wireless device, such as a mobile phone, can make purchases or obtain information via a network, such as the Internet, using both voice and non-verbal methods. Users can submit voice queries and receive non-verbal replies, submit non-verbal queries and receive voice replies, or perform similar operations that marry the voice and data capabilities of modern mobile communication devices. The user may provide notification criteria indicating under what conditions a notification should be sent to the user's wireless device. When purchasing opportunities matching the selected notification criteria become available, the user is notified. The user can respond to the notification, and immediately take advantage of the purchasing opportunity if he so desires. Mixed-mode interactions can also be used by sellers to more advantageously control the marketing of distressed, time sensitive, or other merchandise/services.
    Type: Application
    Filed: November 6, 2011
    Publication date: October 4, 2012
    Applicant: AERITAS, LLC (F/K/A PROPEL TECHNOLOGY TEAM, LLC)
    Inventors: Malik Mamdani, Patrick Johnson, Kevin Bomar
  • Publication number: 20120253825
    Abstract: Disclosed are systems, methods and computer-readable media for controlling a computing device to provide contextual responses to user inputs. The method comprises receiving a user input, generating a set of features characterizing an association between the user input and a conversation context based on at least a semantic and syntactic analysis of user inputs and system responses, determining with a data-driven machine learning approach whether the user input begins a new topic or is associated with a previous conversation context and if the received question is associated with the existing topic, then generating a response to the user input using information associated with the user input and any previous user input associated with the existing topic.
    Type: Application
    Filed: June 15, 2012
    Publication date: October 4, 2012
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Giuseppe Di Fabbrizio, Junlan Feng
  • Publication number: 20120239396
    Abstract: A method and system for operating a remotely controlled device may use multimodal remote control commands that include a gesture command and a speech command. The gesture command may be interpreted from a gesture performed by a user, while the speech command may be interpreted from speech utterances made by the user. The gesture and speech utterances may be simultaneously received by the remotely controlled device in response to displaying a user interface configured to receive multimodal commands.
    Type: Application
    Filed: March 15, 2011
    Publication date: September 20, 2012
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Michael James Johnston, Marcelo Worsley
  • Publication number: 20120232913
    Abstract: Embodiments are generally directed to systems and methods for bit allocation and band partitioning for gain-shape vector quantization in an audio codec. An audio codec implements a method that uses an implicit, dynamic scheme to allow an encoder and decoder to recreate a series of bit allocation decisions for gain and shape without transmitting additional side information for each decision, based on the number of bits that are left remaining and available in a given packet. For implementation in practical codecs, the band comprising the allocation of bits for the shape is recursively split into equal partitions until the number of bits allocated to each partition is less than the maximum codebook size.
    Type: Application
    Filed: March 7, 2012
    Publication date: September 13, 2012
    Inventors: Timothy B. Terriberry, Jean-Marc Valin
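The recursive partitioning step mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the bit budget is divided evenly between the two halves at each split (an actual codec may divide it adaptively), and the function name `split_band` and its parameters are hypothetical.

```python
def split_band(start: int, width: int, bits: int, max_codebook_bits: int):
    """Recursively halve a band until each partition's bit budget is below
    max_codebook_bits. Returns a list of (start, width, bits) tuples.

    Assumption for illustration: bits are shared evenly between the halves.
    """
    # Stop when the budget fits the codebook limit or the band cannot be split.
    if bits < max_codebook_bits or width < 2:
        return [(start, width, bits)]
    half = width // 2
    left_bits = bits // 2
    right_bits = bits - left_bits
    return (split_band(start, half, left_bits, max_codebook_bits)
            + split_band(start + half, width - half, right_bits, max_codebook_bits))


if __name__ == "__main__":
    # A 16-coefficient band with 100 bits and a 32-bit codebook limit.
    for partition in split_band(0, 16, 100, 32):
        print(partition)
```

Because both encoder and decoder can run the same deterministic split from the same remaining bit count, no side information about the partitioning needs to be transmitted, which is the property the abstract emphasizes.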
  • Publication number: 20120232903
    Abstract: The invention relates to a kitchen and/or domestic appliance comprising input means, which are connected to a voice-recognition system, for acoustic operator commands. The invention is characterised in that means for executing command-dependent actions are provided and that the voice-recognition system is used to identify and check the authorisation of a user.
    Type: Application
    Filed: March 15, 2012
    Publication date: September 13, 2012
    Applicant: ELECTROLUX PROFESSIONAL SPA
    Inventors: Claudio Cenedese, Dragan Raus, Omero Tuzzi, Maurizio Ugel, Ennio Pippia
  • Publication number: 20120232889
    Abstract: A method for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder receives encoded frames of compressed speech information transmitted from an encoder. The method determines whether an encoded frame has been lost, corrupted in transmission, or erased, synthesizes properly received frames, and decides on an overlap-add window to use in combining a portion of the synthesized speech signal with a subsequent speech signal resulting from a received and decoded packet, where the size of the overlap-add window is based on the unavailability of packets. If it is determined that an encoded frame has been lost, corrupted in transmission, or erased, the method performs an overlap-add operation on the portion of the synthesized speech signal and the subsequent speech signal, using the decided-on overlap-add window.
    Type: Application
    Filed: May 21, 2012
    Publication date: September 13, 2012
    Applicant: AT&T Corp.
    Inventor: David A. Kapilow
  • Publication number: 20120226502
    Abstract: According to one embodiment, a television apparatus includes a speech input unit, an indication input unit, a speech recognition unit, and a control unit. The speech input unit is configured to input a speech. The indication input unit is configured to input an indication to start speech recognition from a user. The speech recognition unit is configured to recognize the user's speech inputted after the indication is inputted. The control unit is configured to execute an operation command corresponding to a recognition result of the user's speech. The control unit, if a volume of the television apparatus at a timing when the indication is inputted is larger than or equal to a threshold, temporarily sets the volume to a value smaller than the threshold while the speech recognition unit is recognizing.
    Type: Application
    Filed: September 19, 2011
    Publication date: September 6, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kazushige Ouchi, Akinori Kawamura, Masaru Sakai, Kaoru Suzuki, Yusuke Kida
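    The volume rule in the abstract above can be sketched in a few lines of Python. The class and method names are hypothetical, and restoring the previous volume once recognition ends is an assumption drawn from the word "temporarily"; the abstract does not spell out that step.

```python
class Television:
    """Toy model of the volume control described in the abstract (names are hypothetical)."""

    def __init__(self, volume, threshold, ducked_volume):
        if ducked_volume >= threshold:
            raise ValueError("ducked volume must be smaller than the threshold")
        self.volume = volume
        self.threshold = threshold
        self.ducked_volume = ducked_volume
        self._saved_volume = None

    def start_recognition(self):
        """Called when the user gives the indication to start speech recognition."""
        if self.volume >= self.threshold:
            self._saved_volume = self.volume
            self.volume = self.ducked_volume  # lower the volume while recognizing

    def finish_recognition(self):
        """Assumption: the earlier volume is restored after recognition."""
        if self._saved_volume is not None:
            self.volume = self._saved_volume
            self._saved_volume = None
```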
  • Publication number: 20120226503
    Abstract: An information processing apparatus comprises an information output unit configured to switch among a plurality of languages at each given time interval while outputting guidance information provided in the plurality of languages, a response detection unit configured to detect a response to the guidance information while the guidance information is output with the languages being switched, and a processing language determination unit configured to take the language being output when the response to the guidance information is detected as a processing language.
    Type: Application
    Filed: February 16, 2012
    Publication date: September 6, 2012
    Applicant: TOSHIBA TEC KABUSHIKI KAISHA
    Inventors: Masahito Sano, Kiyomitu Yamaguchi, Koji Kurosawa
  • Publication number: 20120215541
    Abstract: A signal identifying method includes obtaining signal characteristics of a current frame of input signals; deciding, according to the signal characteristics of the current frame and updated signal characteristics of a background signal frame before the current frame, whether the current frame is a background signal frame; detecting whether the current frame serving as a background signal frame is in a first type signal state; and adjusting a signal classification decision threshold according to whether the current frame serving as a background signal frame is in the first type signal state to enhance the speech signal identification capability.
    Type: Application
    Filed: April 27, 2012
    Publication date: August 23, 2012
    Applicant: Huawei Technologies Co., Ltd.
    Inventors: Yuanyuan Liu, Zhe Wang, Eyal Shlomot
  • Publication number: 20120215544
    Abstract: A computerized information apparatus useful for providing directions and other information to a user. In one embodiment, the apparatus comprises a processor, a network interface, and a computer readable medium having at least one computer program disposed thereon, the at least one program being configured to receive a speech input from the user regarding an organization or entity, and provide a graphic or visual representation of the organization or entity to aid the user in finding the organization or entity. At least a portion of the information is obtained via the network interface from a remote server.
    Type: Application
    Filed: February 24, 2012
    Publication date: August 23, 2012
    Inventor: Robert F. Gazdzinski
  • Publication number: 20120215530
    Abstract: A method of speech enhancement in a room (10), having the steps of: determining acoustic parameters of the room and a loudspeaker arrangement (24) located in the room, capturing audio signals from a speaker's voice with a microphone (12), and processing the captured audio signals with an audio signal processing unit (20). The audio signals are filtered by applying a selected frequency response curve to the audio signals, generating sound according to the processed audio signals by the loudspeaker arrangement, determining a value indicative of the overall gain applied to the captured audio signals, and selecting a frequency response curve to be applied to the captured audio signals according to the overall gain value and the acoustic parameters.
    Type: Application
    Filed: October 27, 2009
    Publication date: August 23, 2012
    Applicant: PHONAK AG
    Inventor: Samuel Harsch
  • Publication number: 20120209608
    Abstract: A mobile communication terminal apparatus and method are capable of recognizing an input voice of a user and executing an application related to the recognized voice. The apparatus includes a voice input unit to receive a first input voice; a voice recognition unit to acquire first voice instruction information based on the first input voice; a voice control table acquiring unit to acquire a first voice control table comprising the first voice instruction information and first icon position information; and an application execution unit to execute a first application based on the first icon position information included in the first voice control table. The method for registering voice instruction information includes acquiring voice instruction information for a selected application; acquiring execution information of the selected application; generating a voice control table comprising the execution information, and the voice instruction information; and storing the voice control table.
    Type: Application
    Filed: September 29, 2011
    Publication date: August 16, 2012
    Applicant: PANTECH CO., LTD.
    Inventor: Chang-Dae Lee
  • Publication number: 20120209700
    Abstract: Methods for providing information useful to a user of a remote computerized apparatus. In one embodiment, the method includes receiving via a network link a digitized speech input relating to an organization or entity which a user wishes to locate; based at least in part on the input, identifying a location associated with the organization or entity; and selecting and causing provision of a graphical or visual representation of the location via the network link, the graphical or visual representation of the location also comprising a graphical or visual representation of the surroundings of the organization or entity.
    Type: Application
    Filed: February 24, 2012
    Publication date: August 16, 2012
    Inventor: Robert F. Gazdzinski
  • Publication number: 20120203559
    Abstract: Apparatus, system and method for performing an action such as accessing supplementary data and/or executing software on a device capable of receiving multimedia are disclosed. After multimedia is received, a monitoring code is detected and a signature is extracted in response thereto from an audio portion of the multimedia. The ancillary code includes a plurality of code symbols arranged in a plurality of layers in a predetermined time period, and the signature is extracted from features of the audio of the multimedia. Supplementary data is accessed and/or software is executed using the detected code and/or signature.
    Type: Application
    Filed: December 30, 2011
    Publication date: August 9, 2012
    Applicant: ARBITRON, INC.
    Inventors: William John McKenna, John Stavropoulos, Alan Neuhauser, Jason Bolles, John Kelly, Wendell Lynch
  • Publication number: 20120203557
    Abstract: A comprehensive system and method for telematics including the following features individually or in sub-combinations: vehicle user interfaces, telecommunications, speech recognition, digital commerce and vehicle parking, digital signal processing, wireless transmission of digitized voice input, navigational assistance for motorists, data communication to vehicles, mobile client-server communication, extending coverage and bandwidth of wireless communication services, and noise reduction.
    Type: Application
    Filed: April 10, 2012
    Publication date: August 9, 2012
    Inventor: Gilad Odinak
  • Patent number: 8237571
    Abstract: Disclosed are an alarm method and system based on voice events, and a method for building a behavior trajectory thereof. The system comprises a signal sensor, a voice-event detector, and a notice and alarm element. In the method, voice signals are captured from a remote unit in an environment. The captured voice signals are classified into at least one voice event. An emergent-event notice is automatically transmitted if one of the predefined emergent events is detected. In the method for building a behavior trajectory, messages on voice events are continuously recorded. When the number of recorded voice events reaches a threshold, a behavior trajectory is constructed, in which a behavior consists of a single voice event or two or more voice events.
    Type: Grant
    Filed: February 6, 2009
    Date of Patent: August 7, 2012
    Assignee: Industrial Technology Research Institute
    Inventors: Yuh-Ching Wang, Yu-Hsien Chiu, Gwo Lang Yan
  • Publication number: 20120191461
    Abstract: A system and methods for voice controlled operation of a media player are provided. In one embodiment, a method includes detecting user positioning of a microphone power switch to an off position, detecting user positioning of the microphone power switch to an on position within a predetermined period of time and entering a voice recognition mode, by the media player, based on the user positioning of the microphone power switch to the on position within the predetermined period of time. The method may further include detecting one or more output signals of the microphone, detecting a voice command based on the one or more output signals of the microphone, and controlling operation of the media player based on the voice command, wherein the media player outputs a graphical display associated with the voice command.
    Type: Application
    Filed: January 6, 2010
    Publication date: July 26, 2012
    Applicant: Zoran Corporation
    Inventors: Gaile Lin, Yulong Chen, Hong Guan, Jing Wei Wang
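    The switch gesture in the abstract above (off, then on again within a predetermined period) can be sketched as a small state tracker. The class name, the string positions "off" and "on", and the use of caller-supplied timestamps in seconds are assumptions made for illustration only.

```python
class SwitchMonitor:
    """Detects an off-then-on toggle of the microphone power switch (hypothetical names)."""

    def __init__(self, window_s):
        self.window_s = window_s  # the predetermined period, in seconds
        self._off_at = None       # time of the most recent "off" event, if any

    def on_switch(self, position, now):
        """Return True when the media player should enter voice recognition mode."""
        if position == "off":
            self._off_at = now
            return False
        if position == "on":
            toggled_in_time = (
                self._off_at is not None and (now - self._off_at) <= self.window_s
            )
            self._off_at = None  # each "off" event can trigger at most once
            return toggled_in_time
        raise ValueError("position must be 'off' or 'on'")
```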
  • Publication number: 20120185255
    Abstract: A method of hierarchical coding of a digital audio frequency input signal into several frequency sub-bands, including a core coding of the input signal according to a first throughput and at least one enhancement coding of higher throughput, of a residual signal. The core coding uses a binary allocation according to an energy criterion. The method includes for the enhancement coding: calculating a frequency-based masking threshold for at least part of the frequency bands processed by the enhancement coding; determining a perceptual importance per frequency sub-band as a function of the masking threshold and as a function of the number of bits allocated for the core coding; binary allocation of bits in the frequency sub-bands processed by the enhancement coding, as a function of the perceptual importance determined; and coding the residual signal according to the bit allocation. Also provided are a decoding method, a coder and a decoder.
    Type: Application
    Filed: June 25, 2010
    Publication date: July 19, 2012
    Applicant: FRANCE TELECOM
    Inventors: David Virette, Stéphane Ragot, Balazs Kovesi, Pierre Berthet
  • Publication number: 20120185243
    Abstract: A speech feature extraction apparatus, speech feature extraction method, and speech feature extraction program. A speech feature extraction apparatus includes: a first difference calculation module to: (i) receive, as an input, a spectrum of a speech signal segmented into frames for each frequency bin; and (ii) calculate a delta spectrum for each of the frames, where the delta spectrum is a difference of the spectrum within continuous frames for the frequency bin; and a first normalization module to normalize the delta spectrum of the frame for the frequency bin by dividing the delta spectrum by a function of an average spectrum; where the average spectrum is an average of spectra through all frames that are overall speech for the frequency bin; and where an output of the first normalization module is defined as a first delta feature.
    Type: Application
    Filed: July 10, 2010
    Publication date: July 19, 2012
    Applicant: International Business Machines Corp.
    Inventors: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
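    The normalized delta feature in the abstract above can be illustrated with a short Python sketch. Two simplifications are assumptions rather than details from the abstract: the delta spectrum is taken as the plain difference between adjacent frames, and the normalizing function of the average spectrum defaults to the identity. The function name is hypothetical.

```python
def normalized_delta(spectra, func=lambda avg: avg):
    """Compute per-bin delta spectra divided by a function of the average spectrum.

    spectra: list of frames, each a list of non-negative spectrum values per bin.
    func:    function applied to the per-bin average (identity is an assumption).
    Returns one delta frame for each pair of adjacent input frames.
    """
    n_frames = len(spectra)
    if n_frames < 2:
        raise ValueError("need at least two frames")
    n_bins = len(spectra[0])
    # Average spectrum per frequency bin over all frames of the utterance.
    avg = [sum(frame[k] for frame in spectra) / n_frames for k in range(n_bins)]
    deltas = []
    for t in range(1, n_frames):
        deltas.append(
            [(spectra[t][k] - spectra[t - 1][k]) / func(avg[k]) for k in range(n_bins)]
        )
    return deltas
```

    Dividing by the utterance-level average keeps the feature on a comparable scale across bins and recording levels, which is the apparent purpose of the normalization module.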
  • Publication number: 20120185254
    Abstract: In a system, an interactive figurine delivers messages to a user in one of a number of forms. A server operation system includes processing capability which may individually couple content or may customize messages to a particular user of the interactive figurines. The interactive figurine contains an embedded circuit consisting of a receiver comprising a detector circuit tuned to at least one preselected frequency, a decoder to provide information indicative of intelligence and signals sent to the receiver, and a decoder circuit to provide actionable output signals indicative of information transmitted to the receiver. The server operation system may include a subscriber database and administration routines for customizing of messages and for directing messages. A user station intermediate the interactive figurine and the server module may be used to provide parental control or other control.
    Type: Application
    Filed: January 18, 2012
    Publication date: July 19, 2012
    Inventors: William A. Biehler, Gary W. Smith
  • Publication number: 20120174737
    Abstract: A method and system for generating a synthetic simulation of a media recording is disclosed. One embodiment accesses a sound reference archive and heuristically creates a new sound that is matched against at least one sound in the sound reference archive. The media recording is analyzed and a synthetic sound based on the analyzing of the media recording is generated.
    Type: Application
    Filed: January 6, 2012
    Publication date: July 12, 2012
    Inventor: Hank Risan
  • Publication number: 20120179473
    Abstract: According to an embodiment, a speech interactive apparatus includes an output unit to output a first response; a receiving unit to receive a start instruction of a speech input as a reply to the first response; a response control unit to stop the output of the first response when the start instruction is received while the first response is being output; and a deciding unit to decide on a first determination period, which is used in determining whether a silent state has occurred, based on whether the start instruction is received while the first response is being output or based on the timing of receiving the start instruction. When the input speech is not input during a period starting from the reception of the start instruction till an elapse of the first determination period, the response control unit instructs the output unit to output the first response again.
    Type: Application
    Filed: March 19, 2012
    Publication date: July 12, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventor: Takehide Yano
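    The silence handling in the abstract above can be sketched as two small functions. The abstract does not say whether an interrupted response leads to a longer or shorter determination period, so the direction and the values used here (8 seconds versus 5 seconds) are assumptions, as are the function names.

```python
def decide_determination_period(interrupted, default_s=5.0, interrupted_s=8.0):
    """Pick the silence determination period.

    interrupted: True if the start instruction arrived while the first
    response was still being output. The specific durations are assumptions.
    """
    return interrupted_s if interrupted else default_s


def should_repeat_response(start_time, first_speech_time, period_s):
    """Return True if the first response should be output again.

    first_speech_time is None when no speech has been input at all.
    Times are in seconds on a common clock.
    """
    if first_speech_time is None:
        return True
    return (first_speech_time - start_time) > period_s
```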