Modification Of At Least One Characteristic Of Speech Waves (epo) Patents (Class 704/E21.001)

E Subclasses

Speech enhancement, e.g., noise reduction, echo cancellation, etc. (epo) (Class 704/E21.002)

Time compression or expansion (epo) (Class 704/E21.017)

Suppression or repetition of time signal segments (EPO) (Class 704/E21.018)

Transformation of speech into a nonaudible representation, e.g., speech visualization, speech processing for tactile aids, etc. (epo) (Class 704/E21.019)

Synchronization of speech with image or synthesis of the lips movement from speech, e.g., for "talking heads," etc. (EPO) (Class 704/E21.02)

NAVIGATION SYSTEM

Publication number: 20120173245

Abstract: A navigation system is provided which facilitates discrimination between an icon of a facility associated with a route, along which the user is expected to move from now on, and an ordinary icon. To achieve this, it includes a destination estimating unit for acquiring information about a driving history and for estimating a destination from the information about the driving history acquired; a drawing decision changing unit for drawing a destination candidate estimated by the destination estimating unit in a form different from an icon of a non-destination candidate; and an information display unit for causing the icon drawn by the drawing decision changing unit to be displayed.

Type: Application

Filed: December 24, 2009

Publication date: July 5, 2012

Applicant: Mitsubishi Electric Corporation

Inventors: Tadashi Miyahara, Toyoaki Kitano, Hideto Miyazaki, Tsutomu Matsubara, Kuniyo Ieda, Minoru Ozaki, Syoji Tanaka, Takashi Nakagawa, Tomohiro Shiino, Wataru Yamazaki
Systems and Methods for Transmitting Media Content via Digital Radio Broadcast Transmission for Synchronized Rendering by a Receiver

Publication number: 20120162512

Abstract: Systems, methods, and processor readable media are disclosed for encoding and transmitting first media content and second media content using a digital radio broadcast system, such that the second media content can be rendered in synchronization with the first media content by a digital radio broadcast receiver. The disclosed systems, methods, and processor-readable media determine when a receiver will render audio and data content that is transmitted at a given time by the digital radio broadcast transmitter, and adjust the media content accordingly to provide synchronized rendering. In exemplary embodiments, these adjustments can be provided by: 1) inserting timing instructions specifying playback time in the secondary content based on calculated delays; or 2) controlling the timing of sending the primary or secondary content to the transmitter so that it will be rendered in synchronization by the receiver.

Type: Application

Filed: February 21, 2012

Publication date: June 28, 2012

Applicant: iBiquity Digital Corporation

Inventors: Steven Andrew Johnson, Muthu Gopal Balasubramanian, Harvey Chalmers, Jeffrey Ranken Detweiler, Albert John Gambardella, Russell Iannuzzelli, Stephen Douglas Mattson
SYSTEM AND METHOD FOR FUNNELING USER RESPONSES IN AN INTERNET VOICE PORTAL SYSTEM TO DETERMINE A DESIRED ITEM OR SERVICE

Publication number: 20120166202

Abstract: A method of funneling user responses in a voice portal system to determine a desired item or service includes (a) querying a user for an attribute value associated with a first particular attribute of the desired item or service; and (b) determining if the attribute value given by the user satisfies an end state. If the end state is not satisfied, steps (a) and (b) are performed with a new particular attribute.

Type: Application

Filed: March 5, 2012

Publication date: June 28, 2012

Inventors: Steven Jeromy CARRIERE, Kelly James Slough, Steven Gregory Woods
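The funneling loop in the abstract above (query one attribute, test for an end state, repeat with a new attribute) might be sketched as follows. This is only an illustration: the catalog layout, the `ask` callback, and the choice of "at most `max_results` candidates remain" as the end state are assumptions, not details taken from the application.

```python
def funnel(items, attributes, ask, max_results=1):
    """Narrow a catalog by asking for one attribute value at a time.

    items: list of dicts describing items or services (hypothetical layout).
    attributes: attribute names to ask about, in order.
    ask: callable that takes an attribute name and returns the user's value
         (stands in for the voice prompt and recognition step).
    """
    candidates = list(items)
    for attribute in attributes:
        value = ask(attribute)  # step (a): query the user for an attribute value
        candidates = [item for item in candidates if item.get(attribute) == value]
        # step (b): assumed end state, the candidate set is small enough
        if len(candidates) <= max_results:
            break
    return candidates
```

In a real voice portal the `ask` callback would wrap a prompt and a recognizer; here it can be any function, which keeps the loop easy to test.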
Navigation System and Radio Receiving System

Publication number: 20120166204

Abstract: An object of the invention is also a navigation system having an input device for the input of an input scale value, having a display device for displaying road map information according to a selected display scale value and having a processor device, wherein the number of enterable input scale values is larger than the number of the selectable display scale values.

Type: Application

Filed: March 9, 2012

Publication date: June 28, 2012

Applicant: Bayerische Motoren Werke Aktiengesellschaft

Inventors: Karsten Knebel, Liza Hassel, Frank Wolf
SYSTEM AND METHOD FOR INTEGRATING GESTURE AND SOUND FOR CONTROLLING DEVICE

Publication number: 20120166200

Abstract: Disclosed is a system for integrating gestures and sounds including: a gesture recognition unit that extracts gesture feature information corresponding to user commands from image information and acquires gesture recognition information from the gesture feature information; a background recognition unit acquiring background sound information using the predetermined background sound model from the sound information; a sound recognition unit that extracts the sound feature information corresponding to user commands from the sound information and extracts the sound feature information based on the background sound information and acquires the sound recognition information from the sound feature information; and an integration unit that generates integration information by integrating the gesture recognition information and the sound recognition information.

Type: Application

Filed: December 21, 2011

Publication date: June 28, 2012

Applicant: Electronics and Telecommunications Research Institute

Inventors: Mun Sung HAN, Young Giu JUNG, Hyun KIM, Jae Hong KIM, Joo Chan SOHN
Dual-Band Speech Encoding

Publication number: 20120166186

Abstract: This document describes various techniques for dual-band speech encoding. In some embodiments, a first type of speech feature is received from a remote entity, an estimate of a second type of speech feature is determined based on the first type of speech feature, the estimate of the second type of speech feature is provided to a speech recognizer, speech-recognition results based on the estimate of the second type of speech feature are received from the speech recognizer, and the speech-recognition results are transmitted to the remote entity.

Type: Application

Filed: December 23, 2010

Publication date: June 28, 2012

Applicant: Microsoft Corporation

Inventors: Alejandro Acero, James G. Droppo, III, Michael L. Seltzer
LOCATION BASED AUTOMOBILE INSPECTION

Publication number: 20120158238

Abstract: A dynamic inspection system determines a location of a technician with respect to an inspection vehicle and determines the active voice commands to which it responds based on that location. The technician can perform a vehicle inspection by providing voice commands to the dynamic inspection system, which can increase the technician's efficiency. Further, as the dynamic inspection system's active commands are customized for the location of the technician, the dynamic inspection system can filter out sound that does not include active voice commands, potentially increasing the accuracy of its voice recognition capability.

Type: Application

Filed: January 10, 2012

Publication date: June 21, 2012

Inventors: Marcus Isaac Daley, Elias Leonel More Basso
SIGNAL ENCODING APPARATUS AND METHOD, SIGNAL DECODING APPARATUS AND METHOD, PROGRAMS AND RECORDING MEDIUMS

Publication number: 20120158411

Abstract: An encoding apparatus that divides an input time series signal into a plurality of sub-bands and encodes a low frequency sub-band signal to generate encoded data of the low frequency sub-band signal. Concurrently, it compares the frequency amplitude peak of the new high frequency sub-band signal generated from the low frequency sub-band signal and the original high frequency sub-band signal and generates frequency amplitude peak information of the high frequency sub-band signal. It compares the gain of the new high frequency sub-band signal generated by using the low frequency sub-band signal and the original high frequency sub-band signal and generates gain information of the high frequency sub-band signal. Subsequently, the signal encoding apparatus multiplexes the encoded data of the low frequency sub-band signal, the frequency amplitude peak information of the high frequency sub-band signal and the gain information of the high frequency sub-band signal and outputs compressed data.

Type: Application

Filed: February 17, 2012

Publication date: June 21, 2012

Applicant: SONY CORPORATION

Inventors: Toru Chinen, Hiroyuki Honma
VOICE ASSISTANT SYSTEM

Publication number: 20120136667

Abstract: Methods and apparatuses to assist a user in the performance of a plurality of tasks are provided. The invention includes storing at least one care plan for a resident, the care plan defining a plurality of tasks to be performed for providing care to the resident. The invention also includes capturing speech inputs from the user and providing speech outputs to the user, so as to provide a speech dialog with the user reflective of the care plan. Information is captured with a contactless communication interface and is used for engaging the care plan.

Type: Application

Filed: February 7, 2012

Publication date: May 31, 2012

Inventors: Charles Thomas Emerick, James R. Logan, Richard Anthony Bates, James Wahl
SYSTEM AND METHOD FOR PROVIDING ENHANCED AUDIO IN A VIDEO ENVIRONMENT

Publication number: 20120120270

Abstract: A method is provided in one example and includes receiving audio data at a microphone array that includes a plurality of microphones. The microphone array is provisioned at a first endpoint, which includes a camera element configured to capture video data associated with a video session involving the first endpoint and a second endpoint. The method also includes formatting the audio data into a time division multiplex (TDM) stream, and communicating the stream to a port for a subsequent communication over a network and to the second endpoint.

Type: Application

Filed: November 15, 2010

Publication date: May 17, 2012

Inventors: Wei Li, J. William Mauchly, David J. Mackie, Olin D. Williford, II, Jinshi Huang, Pawel Paszkowski, Indrajit Rajeev Gajendran, Richard T. Wales, Joseph T. Friel
SPEECH DATA RETRIEVING AND PRESENTING DEVICE

Publication number: 20120116770

Abstract: A speech data retrieving and presenting device applied with an electronic device through a network includes a data receiving unit, a processing unit and a speech presenting unit. The data receiving unit connected to the network receives data of the electronic device through the network. The processing unit coupled to the data receiving unit receives speech data and retrieves a speech presenting signal from the speech data. The speech presenting unit coupled to the processing unit receives the speech presenting signal and outputs a speech according to the speech data. This device can assist a user to obtain network information, and provide the user a more flexible application according to the property that the device can be operated independently by a simple motion.

Type: Application

Filed: November 8, 2010

Publication date: May 10, 2012

Inventors: Ming-Fu CHEN, Cheng-Hsiung Chen, Daow-Ming Jiang, Chan-Fa Chiu, Cheng-Jen Lin, Po-Yiu Liu
Narrative Voice Files for GPS Devices

Publication number: 20120109657

Abstract: The invention consists of a compilation of unified audio tour files in a compressed format (e.g., MP3 or MP4) that provides pre-recorded spoken commentary to Global Positioning System (GPS) enabled devices. Using satellite technology, audio is triggered based on a user's location, providing relevant facts, geography, points of interest, history, and trivia of every city/town/area as it is being traveled throughout the world. These audio tour files will be provided in multiple languages. Upgrades shall be available via the Internet. The invention will narrate the entire world, beginning with the large metropolitan areas of the United States of America through to the smallest towns in Malta.

Type: Application

Filed: October 30, 2010

Publication date: May 3, 2012

Inventor: Cristian Partan
REAL-TIME NETWORK ATTACK DETECTION AND MITIGATION INFRASTRUCTURE

Publication number: 20120090028

Abstract: The invention features systems and methods for detecting and mitigating network attacks in a Voice-Over-IP (VoIP) network. A server is configured to receive information related to a mitigation action for a call. The information can include a complexity level for administering an audio challenge-response test to the call and an identification of the call. The server also generates i) a routing label based on the identification of the call, and ii) a script defining a plurality of variables that store identifications of a plurality of altered sound files for the audio challenge-response test. Each altered sound file is randomly selected by the server subject to one or more constraints associated with the complexity level. The server is further configured to transmit the script to a guardian module and the routing label to a gateway.

Type: Application

Filed: October 12, 2011

Publication date: April 12, 2012

Inventors: David Lapsley, Wassim Matragi, Miri Mansur, Jonathan Klotzbach, Ti-yuan Dean Shu, Sri Chary, Joby Joseph, Mark Topham, Kenneth Dumble
SPEECH RECOGNITION USER INTERFACE

Publication number: 20120089392

Abstract: Speech recognition techniques are disclosed herein. In one embodiment, a novice mode is available such that when the user is unfamiliar with the speech recognition system, a voice user interface (VUI) may be provided to guide them. The VUI may display one or more speech commands that are presently available. The VUI may also provide feedback to train the user. After the user becomes more familiar with speech recognition, the user may enter speech commands without the aid of the novice mode. In this “experienced mode,” the VUI need not be displayed. Therefore, the user interface is not cluttered.

Type: Application

Filed: October 7, 2010

Publication date: April 12, 2012

Applicant: MICROSOFT CORPORATION

Inventors: Vanessa Larco, Ali M. Vassigh, Alan T. Shen, Christian Klein, Thomas M. Soemo
SYSTEM AND METHOD FOR PERFORMING SPEECH ANALYTICS

Publication number: 20120084081

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for performing trend analysis of speech. A system practicing the method receives a speech trend analysis request having candidate feature constraints, an objective function with respect to a speech trend to be analyzed, and a set of speech record constraints. The system selects a subset of speech records from the group of speech records based on the set of speech record constraints to yield selected speech records, identifies features in the selected speech records based on the set of candidate feature constraints to yield identified features, and assigns a weight to each of the identified features based on the objective function. Then the system ranks the identified features by their respective weights to yield ranked identified features, and outputs at least one of the ranked identified features associated with a speech-based trend in response to the speech trend analysis request.

Type: Application

Filed: September 30, 2010

Publication date: April 5, 2012

Applicant: AT&T Intellectual Property I, L.P.

Inventors: ILYA Dan MELAMED, Mazin Gilbert
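The select, identify, weight, and rank pipeline described in the abstract above can be illustrated with a small sketch. The record layout (a dict with a `words` list), the three callbacks, and the tie-breaking rule are all hypothetical choices made for the example; the application does not specify them.

```python
def rank_features(records, record_filter, feature_filter, objective, top_n=3):
    """Rank candidate features found in the selected speech records.

    record_filter: callable(record) -> bool, stands in for the speech record constraints.
    feature_filter: callable(feature) -> bool, stands in for the candidate feature constraints.
    objective: callable(feature, selected_records) -> float, the assumed objective function.
    """
    # Select the subset of speech records that satisfies the record constraints.
    selected = [record for record in records if record_filter(record)]
    # Identify features (here: words) that satisfy the feature constraints.
    features = {word for record in selected for word in record["words"]
                if feature_filter(word)}
    # Assign a weight to each identified feature using the objective function.
    weights = {feature: objective(feature, selected) for feature in features}
    # Rank by descending weight; ties are broken alphabetically for determinism.
    ranked = sorted(weights, key=lambda feature: (-weights[feature], feature))
    return ranked[:top_n]
```

A simple objective for trying it out is the number of selected records that mention the feature, which surfaces the terms that recur across calls.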
Reality alternate

Publication number: 20120069131

Abstract: Among other things, we describe a reality alternative to our physical reality, named the Expandaverse, that includes multiple digital realities that may be continuously created, broadcast, accessed, and used interactively.

Type: Application

Filed: May 24, 2011

Publication date: March 22, 2012

Inventor: Daniel H. Abelow
Voice Activity Detection Method and Apparatus, and Electronic Device

Publication number: 20120065966

Abstract: A voice activity detection method and apparatus, and an electronic device are provided. The method includes: obtaining a time domain parameter and a frequency domain parameter from an audio frame; obtaining a first distance between the time domain parameter and a long-term slip mean of the time domain parameter in a history background noise frame, and obtaining a second distance between the frequency domain parameter and a long-term slip mean of the frequency domain parameter in the history background noise frame; and judging whether the audio frame is a foreground voice frame or a background noise frame according to the first distance, the second distance and a set of decision inequalities based on the first distance and the second distance. The above technical solutions enable the judgment criterion to have an adaptive adjustment capability, thus improving the performance of the voice activity detection.

Type: Application

Filed: November 30, 2011

Publication date: March 15, 2012

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventor: Zhe Wang
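A minimal sketch of the decision described in the abstract above: compare each frame's time-domain and frequency-domain parameters with their long-term means over background noise frames, then apply inequalities to the two distances. The absolute-difference distance, the fixed thresholds, and the exponential moving average used in place of the "long-term slip mean" are assumptions; the abstract does not give the actual parameters or inequalities.

```python
class SimpleVad:
    """Toy voice activity detector over pre-extracted frame parameters."""

    def __init__(self, time_mean, freq_mean, time_thresh, freq_thresh, alpha=0.95):
        self.time_mean = time_mean      # running mean of the time-domain parameter in noise
        self.freq_mean = freq_mean      # running mean of the frequency-domain parameter in noise
        self.time_thresh = time_thresh  # assumed threshold on the first distance
        self.freq_thresh = freq_thresh  # assumed threshold on the second distance
        self.alpha = alpha              # smoothing factor for the running means

    def is_voice(self, time_param, freq_param):
        first_distance = abs(time_param - self.time_mean)
        second_distance = abs(freq_param - self.freq_mean)
        # Assumed decision inequalities: either distance exceeding its threshold means voice.
        voice = first_distance > self.time_thresh or second_distance > self.freq_thresh
        if not voice:
            # Only background noise frames update the long-term means,
            # so the reference point adapts to the noise floor over time.
            self.time_mean = self.alpha * self.time_mean + (1 - self.alpha) * time_param
            self.freq_mean = self.alpha * self.freq_mean + (1 - self.alpha) * freq_param
        return voice
```

Because the means are updated only on noise frames, the distances are always measured against a slowly drifting noise reference, which is one simple way to get the adaptive behavior the abstract mentions.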
DECODING DEVICE AND DECODING METHOD

Publication number: 20120065984

Abstract: Provided is a decoding device that can reduce abrupt changes in the number of channels in a decoded signal when transmission errors occur as a result of lost frames in an encoding/decoding system for multichannel signals. Said decoding device is also capable of per-sample smoothing and can reduce degradation of audio quality. In the provided device, a demultiplexer (301) receives an encoded monaural signal and an encoded differential signal and detects change over time in the received encoded differential signal. An M signal decoder (302) decodes the encoded monaural signal and obtains a decoded monaural signal. An S signal decoder (303) decodes the encoded differential signal and obtains a decoded differential signal. A smoothing unit (304) performs smoothing on the decoded differential signal by means of a computation involving the decoded differential signal and coefficients corresponding to the change over time detected by the demultiplexer (301).

Type: Application

Filed: May 25, 2010

Publication date: March 15, 2012

Applicant: PANASONIC CORPORATION

Inventors: Tomofumi Yamanashi, Masahiro Oshikiri, Hiroyuki Ehara
DYNAMICALLY GENERATING A VOCAL HELP PROMPT IN A MULTIMODAL APPLICATION

Publication number: 20120065982

Abstract: Dynamically generating a vocal help prompt in a multimodal application includes detecting a help-triggering event for an input element of a VoiceXML dialog, where the detecting is implemented with a multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application is operatively coupled to a VoiceXML interpreter, and the multimodal application has no static help text. Dynamically generating a vocal help prompt in a multimodal application according to embodiments of the present invention typically also includes retrieving, by the VoiceXML interpreter from a source of help text, help text for an element of a speech recognition grammar, forming by the VoiceXML interpreter the help text into a vocal help prompt, and presenting by the multimodal application the vocal help prompt through a computer user interface to a user.

Type: Application

Filed: November 23, 2011

Publication date: March 15, 2012

Applicant: Nuance Communications, Inc.

Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, JR., David Jaramillo, Yan Li
SYSTEM AND METHOD FOR EXTRACTING A DESTINATION FROM VOICE DATA ORIGINATING OVER A COMMUNICATION NETWORK

Publication number: 20120059579

Abstract: A navigation system for an automotive vehicle capable of communicating with a remote communication device includes a host communication device connected to the remote communication device over a communication network to transmit voice data. A hands free communication unit having a speaker and a microphone is configured to connect with the host communication device so as to communicate with the remote communication device through the speaker and microphone. The vehicle navigation system also includes a voice recognition engine in communication with the hands free communication unit through a voice data link. A route generation unit is connected to the hands free communication unit and the voice recognition engine. The hands free communication unit is capable of transmitting the voice data originating from the remote communication device to the voice recognition engine over the voice data link.

Type: Application

Filed: September 7, 2010

Publication date: March 8, 2012

Applicant: Toyota Motor Engineering & Manufacturing North America, Inc.

Inventors: Jeffrey Edward Pierfelice, Eric Randell Schmidt
Hands-Free, Eyes-Free Mobile Device for In-Car Use

Publication number: 20120052907

Abstract: In one embodiment, a method determines an event at a mobile device and a movement value for a speed of movement of the mobile device based on the event. The movement value is compared to a threshold. If the movement value has passed the threshold, the method enables a mode such that the mobile device is configured to announce information to a user of the mobile device and configured to receive an audible command from the user of the mobile device.

Type: Application

Filed: August 30, 2010

Publication date: March 1, 2012

Applicant: SENSORY, INCORPORATED

Inventors: James C. Gilbreath, Todd F. Mozer
WEB BROWSER IMPLEMENTATION OF INTERACTIVE VOICE RESPONSE INSTRUCTIONS

Publication number: 20120053947

Abstract: Web browser implementable instructions are generated from interactive voice instructions that are not natively interpreted by web browsers. Generating web browser implementable instructions in this manner allows for faster and cheaper deployment of voice, video, and/or data services by allowing legacy services based on interactive voice instructions to function seamlessly within an all data network.

Type: Application

Filed: August 25, 2010

Publication date: March 1, 2012

Applicant: OPENWAVE SYSTEMS INC.

Inventor: Kim Quo-That Liu
DIGITAL VOICE COMMUNICATION CONTROL DEVICE AND METHOD

Publication number: 20120046941

Abstract: A digital audio communication control apparatus includes a first mixing unit that mixes a voice input from a voice input unit and uttered by a specific speaker with a voice input from a digital audio packet receiving unit and uttered by at least one speaker except for the specific speaker, and a second mixing unit that mixes the voices mixed by the first mixing unit with the voice of the specific speaker. The voices mixed by the second mixing unit are fed back to the specific speaker.

Type: Application

Filed: April 27, 2010

Publication date: February 23, 2012

Applicant: Panasonic Corporation

Inventor: Akihiro Tanaka
REMOTE CONTROL SYSTEM AND METHOD

Publication number: 20120046952

Abstract: A remote control system includes a receiving and recognition module, a converting module, and a control interface module. The receiving and recognition module is used for receiving a signal from a user and recognizing the signal as a user command associated with an electronic device. The converting module is used for converting the user command into a control command identifiable by the electronic device. The control interface module is used for sending the control command to the electronic device to control the electronic device.

Type: Application

Filed: May 23, 2011

Publication date: February 23, 2012

Applicant: HON HAI PRECISION INDUSTRY CO., LTD.

Inventors: EN-WEI HSU, CHIA-HUNG CHIEN
APPARATUS AND METHOD FOR RECOGNIZING VOICE COMMAND

Publication number: 20120035935

Abstract: An apparatus and method for recognizing a voice command for use in an interactive voice user interface are provided. The apparatus includes a command intention belief generation unit that is configured to recognize a first voice command and that may generate one or more command intention beliefs for the first voice command. The apparatus also includes a command intention belief update unit that is configured to update each of the command intention beliefs based on a system response to the first voice command and a second voice command. The apparatus also includes a command intention belief selection unit that is configured to select one of the updated command intention beliefs for the first voice command. The apparatus also includes an operation signal output unit that is configured to select a final command intention from the selected updated command intention belief and to output an operation signal based on the selected final command intention.

Type: Application

Filed: April 26, 2011

Publication date: February 9, 2012

Applicant: Samsung Electronics Co., Ltd.

Inventors: Chi-Youn Park, Byung-Kwan Kwak, Jeong-Su Kim, Jeong-Mi Cho
DISAMBIGUATING INPUT BASED ON CONTEXT

Publication number: 20120035924

Abstract: In one implementation, a computer-implemented method includes receiving, at a mobile computing device, ambiguous user input that indicates more than one of a plurality of commands; and determining a current context associated with the mobile computing device that indicates where the mobile computing device is currently located. The method can further include disambiguating the ambiguous user input by selecting a command from the plurality of commands based on the current context associated with the mobile computing device; and causing output associated with performance of the selected command to be provided by the mobile computing device.

Type: Application

Filed: July 20, 2011

Publication date: February 9, 2012

Applicant: GOOGLE INC.

Inventors: John Nicholas JITKOFF, Michael J. LEBEAU
METHOD AND SYSTEM FOR DISTRIBUTED AUDIO TRANSCODING IN PEER-TO-PEER SYSTEMS

Publication number: 20120029911

Abstract: A method for streaming audio data in a network, the audio data having a sequence of samples, includes encoding the sequence of samples into a plurality of coded base bitstreams, generating a plurality of enhancement bitstreams, and transmitting the coded base bitstreams and the enhancement bitstreams to a receiver for decoding. Each of the enhancement bitstreams is generated from one of a plurality of non-overlapping portions of the sequence of samples.

Type: Application

Filed: July 30, 2010

Publication date: February 2, 2012

Applicants: STANFORD UNIVERSITY, DEUTSCHE TELEKOM AG

Inventors: Jeonghun NOH, Bernd GIROD, Peter POGRZEBA, Sachin Kumar AGARWAL, Jatinder Pal SINGH, Kyu-Han KIM
SPEECH RECOGNITION SYSTEM AND METHOD

Publication number: 20120029921

Abstract: According to the present invention, a method for integrating processes with a multi-faceted human centered interface is provided. The interface is facilitated to implement a hands free, voice driven environment to control processes and applications. A natural language model is used to parse voice initiated commands and data, and to route those voice initiated inputs to the required applications or processes. The use of an intelligent context based parser allows the system to intelligently determine what processes are required to complete a task which is initiated using natural language. A single window environment provides an interface which is comfortable to the user by preventing the occurrence of distracting windows from appearing. The single window has a plurality of facets which allow distinct viewing areas. Each facet has an independent process routing its outputs thereto. As other processes are activated, each facet can reshape itself to bring a new process into one of the viewing areas.

Type: Application

Filed: July 8, 2011

Publication date: February 2, 2012

Applicant: Nuance Communications, Inc.

Inventors: Richard Grant, Pedro E. McGregor
METHOD OF ACCESSING A DIAL-UP SERVICE

Publication number: 20120029922

Abstract: A method of accessing a dial-up service is disclosed. An example method of providing access to a service includes receiving a first speech signal from a user to form a first utterance; recognizing the first utterance using speaker independent speaker recognition; requesting the user to enter a personal identification number; and when the personal identification number is valid, receiving a second speech signal to form a second utterance and providing access to the service.

Type: Application

Filed: October 3, 2011

Publication date: February 2, 2012

Applicant: AT&T Intellectual Property II, L.P.

Inventor: Robert Wesley Bossemeyer, JR.
DISAMBIGUATION OF CONTACT INFORMATION USING HISTORICAL DATA

Publication number: 20120022874

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar.

Type: Application

Filed: September 30, 2011

Publication date: January 26, 2012

Applicant: Google Inc.

Inventors: Matthew I. Lloyd, Willard Van Tuyl Rusch, II
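One way to picture the affinity score in the abstract above is a sum of decayed contributions, one per past interaction: more interactions raise the score (frequency) and older ones count less (recency). The half-life decay, the day-based timestamps, and the normalization into probabilities are assumptions made for this sketch, not the formula used in the application.

```python
def affinity(interaction_days, today, half_life_days=30.0):
    """Score one item of contact information from its past interactions.

    interaction_days: day numbers on which the user communicated via this item.
    Each interaction contributes 1.0 when new and halves every half_life_days.
    """
    return sum(0.5 ** ((today - day) / half_life_days) for day in interaction_days)


def intent_probabilities(scores):
    """Turn affinity scores into probabilities that the user means each item."""
    total = sum(scores.values())
    if total == 0:
        # No history at all: fall back to a uniform guess.
        return {item: 1.0 / len(scores) for item in scores}
    return {item: score / total for item, score in scores.items()}
```

The resulting probabilities could then be used to weight entries in a recognition grammar, so that frequently and recently used contact items are favored when names sound alike.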
System And Method For Identifying Audio Command Prompts For Use In A Voice Response Environment

Publication number: 20120020466

Abstract: A system and method for identifying audio command prompts for use in a voice response environment is provided. A signature is generated for audio samples each having preceding audio, reference phrase audio, and trailing audio segments. The trailing segment is removed and each of the preceding and reference phrase segments are divided into buffers. The buffers are transformed into discrete Fourier transform buffers. One of the discrete Fourier transform buffers from the reference phrase segment that is dissimilar to each of the discrete Fourier transform buffers from the preceding segment is selected as the signature. Audio command prompts are processed to generate a discrete Fourier transform. Each discrete Fourier transform for the audio command prompts is compared with each of the signatures and a correlation value is determined. One such audio command prompt matches one such signature when the correlation value for that audio command prompt satisfies a threshold.

Type: Application

Filed: October 3, 2011

Publication date: January 26, 2012

Inventor: Martin R.M. Dunsmuir
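The signature selection and matching steps in the abstract above might look roughly like the sketch below. It uses a direct (slow) DFT from the standard library, magnitude spectra, and a normalized correlation; the buffer size, the "lowest worst-case correlation" selection rule, and the 0.9 threshold are assumptions for illustration only.

```python
import cmath
import math


def dft_magnitudes(buffer):
    """Magnitude spectrum of one buffer via a direct discrete Fourier transform."""
    n = len(buffer)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(buffer)))
            for k in range(n // 2 + 1)]


def correlation(a, b):
    """Normalized correlation of two magnitude spectra (0.0 if either is silent)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split(samples, size):
    """Divide a segment into consecutive, non-overlapping buffers of equal size."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def pick_signature(preceding, reference, size):
    """Pick the reference-phrase buffer least similar to every preceding buffer."""
    pre = [dft_magnitudes(b) for b in split(preceding, size)]
    ref = [dft_magnitudes(b) for b in split(reference, size)]
    return min(ref, key=lambda r: max((correlation(r, p) for p in pre), default=0.0))


def matches(prompt_buffer, signature, threshold=0.9):
    """A prompt matches when its spectrum correlates with the signature above threshold."""
    return correlation(dft_magnitudes(prompt_buffer), signature) >= threshold
```

Choosing a buffer that differs from the preceding audio keeps the signature from firing on whatever is commonly heard just before the phrase of interest.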
SYNCHRONIZING VISUAL AND SPEECH EVENTS IN A MULTIMODAL APPLICATION

Publication number: 20120022875

Abstract: Exemplary methods, systems, and products are disclosed for synchronizing visual and speech events in a multimodal application, including receiving from a user speech; determining a semantic interpretation of the speech; calling a global application update handler; identifying, by the global application update handler, an additional processing function in dependence upon the semantic interpretation; and executing the additional function. Typical embodiments may include updating a visual element after executing the additional function. Typical embodiments may include updating a voice form after executing the additional function. Typical embodiments also may include updating a state table after updating the voice form. Typical embodiments also may include restarting the voice form after executing the additional function.

Type: Application

Filed: September 30, 2011

Publication date: January 26, 2012

Applicant: Nuance Communications, Inc.

Inventors: Charles W. Cross, JR., Michael C. Hollinger, Igor R. Jablokov, Benjamin D. Lewis, Hilary A. Pike, Daniel M. Smith, David W. Wintermute, Michael A. Zaitzeff
Voice Actions on Computing Devices

Publication number: 20120022876

Abstract: A computer-implemented method includes receiving spoken input at a computing device from a user of the computing device, the spoken input including a carrier phrase and a subject to which the carrier phrase is directed, providing at least a portion of the spoken input to a server system in audio form for speech-to-text conversion by the server system, the portion including the subject to which the carrier phrase is directed, receiving from the server system instructions for automatically performing an operation on the computing device, the operation including an action defined by the carrier phrase using parameters defined by the subject, and automatically performing the operation on the computing device.

Type: Application

Filed: September 30, 2011

Publication date: January 26, 2012

Applicant: GOOGLE INC.

Inventors: Michael J. LeBeau, John Nicholas Jitkoff
Spectrum Flatness Control for Bandwidth Extension

Publication number: 20120016667

Abstract: In accordance with an embodiment, a method of decoding an encoded audio bitstream at a decoder includes receiving the audio bitstream, decoding a low band bitstream of the audio bitstream to get low band coefficients in a frequency domain, and copying a plurality of the low band coefficients to a high frequency band location to generate high band coefficients. The method further includes processing the high band coefficients to form processed high band coefficients. Processing includes modifying an energy envelope of the high band coefficients by multiplying modification gains to flatten or smooth the high band coefficients, and applying a received spectral envelope decoded from the received audio bitstream to the high band coefficients. The low band coefficients and the processed high band coefficients are then inverse-transformed to the time domain to obtain a time domain output signal.

Type: Application

Filed: July 18, 2011

Publication date: January 19, 2012

Applicant: FutureWei Technologies, Inc.

Inventor: Yang Gao
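
The decoder-side steps in the spectrum flatness abstract above (copy low band coefficients up, flatten their energy envelope with modification gains, then apply the received spectral envelope) can be sketched as follows. This is an illustration under stated assumptions, not the method claimed in the application: the per-sub-band RMS normalization used as the flattening gain, the `band_size` parameter, and the representation of the envelope as one target RMS per sub-band are all hypothetical. The final inverse transform to the time domain is omitted.

```python
import math


def extend_bandwidth(low_band, envelope, band_size):
    """Generate high band coefficients from decoded low band coefficients.

    low_band:  decoded low band frequency-domain coefficients
    envelope:  received spectral envelope, one target RMS per high band sub-band
    band_size: coefficients per sub-band (hypothetical parameter)
    """
    needed = band_size * len(envelope)
    if band_size < 1 or len(low_band) < needed:
        raise ValueError("not enough low band coefficients to copy")

    # Step 1: copy low band coefficients to the high band location.
    high = list(low_band[:needed])

    processed = []
    for i, target_rms in enumerate(envelope):
        band = high[i * band_size:(i + 1) * band_size]
        rms = math.sqrt(sum(c * c for c in band) / band_size)
        # Step 2: flattening gain (assumed here to normalize each band to unit RMS).
        gain = 1.0 / rms if rms > 0 else 0.0
        # Step 3: apply the received spectral envelope to the flattened band.
        processed.extend(c * gain * target_rms for c in band)

    # Low band followed by processed high band, still in the frequency domain.
    return list(low_band) + processed
```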
Energy Envelope Perceptual Correction for High Band Coding

Publication number: 20120016668

Abstract: In accordance with an embodiment, a method of encoding an audio bitstream at an encoder includes encoding an original low band signal at the encoder by using a closed loop analysis-by-synthesis approach to obtain a coded low band signal, encoding an original high band signal at the encoder by using an open loop energy matching approach to obtain coded high band energy envelopes, comparing an energy of the coded low band signal with an energy of a corresponding original low band signal for a subframe, and generating an indication flag that indicates whether an energy envelope perceptual correction is needed for the subframe based on comparing the energy.

Type: Application

Filed: July 19, 2011

Publication date: January 19, 2012

Applicant: FutureWei Technologies, Inc.

Inventor: Yang Gao
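
The per-subframe energy comparison in the energy envelope abstract above can be sketched as below. The abstract does not state the decision rule, so the rule used here (flag a subframe when the coded low band energy falls below a fixed fraction of the original energy) and the `ratio_threshold` value are assumptions for illustration only.

```python
def energy(samples):
    """Sum of squared sample values."""
    return sum(s * s for s in samples)


def correction_flags(original_low, coded_low, subframe_size, ratio_threshold=0.5):
    """Return one boolean flag per subframe.

    A flag is True when the coded low band energy is below ratio_threshold
    times the original low band energy (hypothetical criterion).
    """
    flags = []
    for start in range(0, len(original_low), subframe_size):
        stop = start + subframe_size
        e_orig = energy(original_low[start:stop])
        e_coded = energy(coded_low[start:stop])
        flags.append(e_coded < ratio_threshold * e_orig)
    return flags
```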
Intelligent Automated Assistant

Publication number: 20120016678

Abstract: An intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions. The system can be implemented using any of a number of different platforms, such as the web, email, smartphone, and the like, or any combination thereof. In one embodiment, the system is based on sets of interrelated domains and tasks, and employs additional functionality powered by external services with which the system can interact.

Type: Application

Filed: January 10, 2011

Publication date: January 19, 2012

Applicant: APPLE INC.

Inventors: Thomas Robert Gruber, Adam John Cheyer, Dag Kittlaus, Didier Rene Guzzoni, Christopher Dean Brigham, Richard Donald Giuli, Marcello Bastea-Forte, Harry Joseph Saddler
POWER-OPTIMIZED WIRELESS COMMUNICATIONS DEVICE

Publication number: 20120010890

Abstract: The present invention is an Always On, Hands-free, Speech Activated, Power-optimized Wireless Communications Device with associated base. The unique value of the device is that a person can use the device at any time, 24×7, with hands-free operation. People can wear it 24×7 on their body either around their neck or on their wrist or wherever it best meets their needs. Speech activation provides greater convenience for the person in using the wireless communications device, and at the same time, it allows the microcontroller greater control of power consuming resources. The wireless communications device may host simple, low power applications. In addition, applications will reside in the base, and in an application (either voice or data) server that is accessed by the wireless communications base.

Type: Application

Filed: December 30, 2009

Publication date: January 12, 2012

Inventor: Raymond Clement Koverzin
Home star security system

Publication number: 20120008751

Abstract: The Home Star is an interactive security and communication system for the home which is wired to doors, windows and other points of entrance, with infrared motion sensors included to detect home invaders. Once tripped, this alarm sends a silent signal to the authorities and the Home Star command center, ensuring that police backup and support is immediately directed to the location. More than a mere security system, the Home Star provides a complete communication system for the home enabling occupants to place and receive telephone calls, print documents and road maps and access other family members, all with a simple press of the activation switch. This in turn activates an interactive system of voice prompts which enables the user to execute any number of function-specific commands.

Type: Application

Filed: July 5, 2011

Publication date: January 12, 2012

Inventor: James Forbes
VOICE INTEGRATION PLATFORM

Publication number: 20120010876

Abstract: A voice integration platform and method provide for integration of a voice interface with a data system that includes stored data. The voice integration platform comprises one or more generic software components, the generic software components being configured to enable development of a specific voice user interface that is designed to interact with the data system in order to present the stored data to a user.

Type: Application

Filed: September 22, 2011

Publication date: January 12, 2012

Applicant: Ben Franklin Patent Holding LLC

Inventors: Andrew G. Smolenski, Steven Markman, Pericles Haleftiras, Jon Thomas Layton, Lizanne Kaiser, Gregory S. Kluthe, Michael W. Achenbach
VOICE INTERACTION METHOD OF MOBILE TERMINAL BASED ON VOICEXML AND MOBILE TERMINAL

Publication number: 20120010889

Abstract: The present invention discloses a voice interaction method of a mobile terminal based on VoiceXML and a mobile terminal, which comprises: converting received voice information into a VoiceXML document, parsing the VoiceXML document according to a preset VoiceXML document framework, searching for the information of the function which needs to be realized by the voice information corresponding to the VoiceXML document; mapping found function information to the function corresponding to the particular function of the man-machine interface, and informing the man-machine interface of the mapped function; performing VoiceXML response document conversion on the response information from the man-machine interface, and playing the conversion result via corresponding voice information. According to the technical solution of the present invention, advanced intelligence and complex voice interaction can be realized, and the transportability of voice interaction is improved.

Type: Application

Filed: September 24, 2009

Publication date: January 12, 2012

Inventors: Dongzhou Lian, Xuesen Yang, Halyong Peng, Guang Chen
Audible post-it system

Publication number: 20120004917

Abstract: An audible post-it system includes a post-it note printed with an index and an optical reading and recording device having an optical module, a switch, a storage device, an audio recording device, an audio playing device and a processor. The optical reading and recording device reads an image of the index. When the optical reading and recording device is at a recording state, the processor receives the image of the index and obtains the index, then receives a digital audio outputted by the audio recording device to match the index with the digital audio, and stores the digital audio based on the index. When the optical reading and recording device is at a playing state, the processor receives the image of an index and retrieves the index, then reads a digital audio based on the index, and sends the digital audio to the audio playing device for playing.

Type: Application

Filed: October 21, 2010

Publication date: January 5, 2012

Applicant: Generalplus Technology Inc.

Inventor: Ching-Fu HUNG
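
The two device states in the audible post-it abstract above amount to an index-keyed store that is written in the recording state and read in the playing state. A minimal sketch follows; the class name, the `scan` method, and the use of an in-memory dictionary as the storage device are hypothetical, and decoding the printed index from the optical image is assumed to have happened already.

```python
class AudibleNoteDevice:
    """Toy model of the optical reading and recording device."""

    def __init__(self):
        self.recording = False  # switch position: True = recording state
        self.storage = {}       # index -> digital audio (bytes)

    def scan(self, index, audio_in=None):
        """Handle one scan of a note's printed index.

        Recording state: store audio_in under the index and return None.
        Playing state: return the stored audio for the index, or None.
        """
        if self.recording:
            self.storage[index] = audio_in
            return None
        return self.storage.get(index)
```

For example, setting `recording = True`, calling `scan(7, b"buy milk")`, then switching `recording = False` and calling `scan(7)` returns the stored bytes for playback.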
Method and Apparatus for Providing Metadata-Based User Interfaces

Publication number: 20120005592

Abstract: Methods and apparatus provide for the production of metadata-based user interfaces (UIs) such as graphical user interfaces (GUIs). In one example, keypad descriptor metadata is obtained. The keypad descriptor metadata is data identifying a plurality of available keypad GUIs for a particular data field to control the change from a first keypad GUI to a different keypad GUI. The first keypad GUI is provided for the data field based on the obtained keypad descriptor metadata. A second and different keypad GUI is also provided for the same data field based on the keypad descriptor metadata during the same field population session. In another example, a user interface is provided for a device. The user interface is changed based on a current machine state of an input/output function of the device and based on user interface descriptor metadata associated with an element of the user interface.

Type: Application

Filed: June 30, 2010

Publication date: January 5, 2012

Inventor: Shrinivas B. Joshi
Page identification method for audio book

Publication number: 20110320208

Abstract: A page identification method for audio book with a main housing, a plurality of pages, a plurality of light blocking panels, an audio record and playback electronic circuit including microphone, speaker, power source, record switch and playback switch, a microprocessor and a plurality of light sensing devices. The top surface of the main body has a plurality of apertures. Each light sensing device is located directly under each main body aperture. Each page has one or more apertures that are aligned with at least one of the main body apertures. Each light blocking panel is interleaved between each page so that when the page is turned by the user the light blocking panel will slide over to cover or uncover the page aperture, causing the light sensing devices to send a signal to the microprocessor that tells the audio circuit which message to play for each page.

Type: Application

Filed: June 24, 2010

Publication date: December 29, 2011

Inventor: Ki Kin Wong
METHOD AND APPARATUS FOR ADAPTIVELY ENCODING AND DECODING HIGH FREQUENCY BAND

Publication number: 20110313778

Abstract: Provided are a method and apparatus for encoding and decoding an audio signal. According to the present application, a signal of a high frequency band above a preset frequency band is adaptively encoded or decoded in the time domain or in the frequency domain by using a signal of a low frequency band below the preset frequency band. As such, the sound quality of a high frequency signal does not deteriorate even when an audio signal is encoded or decoded by using a small number of bits and thus coding efficiency may be maximized.

Type: Application

Filed: August 29, 2011

Publication date: December 22, 2011

Applicant: Samsung Electronics Co., Ltd

Inventors: Chang-yong Son, Eun-mi Oh, Ki-hyun Choo, Jung-hoe Kim
Method for changing the caller voice during conversation in voice communication device

Publication number: 20110313759

Abstract: The invention relates to a cellular phone terminal system and in particular to a method for changing the caller's voice in a speech signal during conversation. The cellular phone terminal system has a filter for filtering the signal. The method comprises the steps of: waiting for a caller voice selector key input for a desired caller voice when a caller voice converter key is pressed during conversation; and setting even or odd harmonic deletion bins in the frequency domain of the uncompressed speech signal corresponding to the caller voice selector key input to change the caller voice.

Type: Application

Filed: June 16, 2011

Publication date: December 22, 2011

Inventor: Alon Konchitsky
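
The harmonic deletion step in the caller voice abstract above can be illustrated by zeroing frequency bins at even or odd multiples of a fundamental. The abstract does not say how the bins are located, so this sketch assumes the fundamental is already known as a bin index (`f0_bin`, a hypothetical parameter) and that the spectrum is given as a list of bin values.

```python
def delete_harmonics(spectrum, f0_bin, parity):
    """Zero the bins at even or odd harmonics of the fundamental.

    spectrum: list of frequency-domain bin values
    f0_bin:   bin index of the fundamental (assumed known, must be >= 1)
    parity:   "even" or "odd", selecting which harmonics to delete
    """
    if f0_bin < 1:
        raise ValueError("f0_bin must be at least 1")
    if parity not in ("even", "odd"):
        raise ValueError("parity must be 'even' or 'odd'")

    out = list(spectrum)
    harmonic = 1  # harmonic 1 is the fundamental itself
    while harmonic * f0_bin < len(out):
        is_even = harmonic % 2 == 0
        if is_even == (parity == "even"):
            out[harmonic * f0_bin] = 0.0
        harmonic += 1
    return out
```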
Modular Speech Recognition Architecture

Publication number: 20110307250

Abstract: A speech recognition system is provided. The speech recognition system includes a speech recognition module; a plurality of domain specific dialog manager modules that communicate with the speech recognition module to perform speech recognition; and a speech interface module that communicates with the plurality of domain specific dialog manager modules to selectively enable the speech recognition.

Type: Application

Filed: June 10, 2010

Publication date: December 15, 2011

Applicant: GM Global Technology Operations, Inc.

Inventor: Robert D. Sims
ENCODER, DECODER, AND METHOD THEREFOR

Publication number: 20110307248

Abstract: Provided is an encoder which can effectively encode/decode spectrum data of a broad frequency signal in a high frequency range, can dramatically reduce the number of the arithmetic operations to be performed, and can improve the quality of the decoded signal. The encoder comprises a first layer coding unit (202) which encodes an input signal in a low frequency range below a predetermined frequency to generate first coded information, a first layer decoding unit (203) which decodes the first coded information to generate a decoded signal, and a second layer coding unit (206) which splits the input signal in a high frequency range above a predetermined frequency, into a plurality of sub-bands, presumes the respective sub-bands from the input signal or decoded signal, partially selects a spectrum component within each sub-band, and calculates an amplitude adjustment parameter used to adjust the amplitude of the selected spectrum component to thereby generate second coding information.

Type: Application

Filed: February 25, 2010

Publication date: December 15, 2011

Applicant: PANASONIC CORPORATION

Inventors: Tomofumi Yamanashi, Masahiro Oshikiri, Hiroyuki Ehara
MULTI-MODAL GENDER RECOGNITION

Publication number: 20110307260

Abstract: Gender recognition is performed using two or more modalities. For example, depth image data and one or more types of data other than depth image data are received. The data pertains to a person. The different types of data are fused together to automatically determine gender of the person. A computing system can subsequently interact with the person based on the determination of gender.

Type: Application

Filed: June 11, 2010

Publication date: December 15, 2011

Inventors: Zhengyou Zhang, Alex Aben-Athar Kipman
METHOD AND APPARATUS FOR ENCODING AND DECODING AUDIO SIGNAL USING ADAPTIVE SINUSOIDAL CODING

Publication number: 20110301961

Abstract: A method and an apparatus for encoding and decoding audio signals using adaptive sinusoidal coding are provided. The audio signal encoding method includes the steps of dividing a synthesized audio signal into a plurality of sub-bands, calculating the energy of each sub-band, selecting a predetermined number of sub-bands having a relatively large amount of energy from the sub-bands, and performing sinusoidal coding with regard to the selected sub-bands. Application of sinusoidal coding based on consideration of the amount of energy of each sub-band of the synthesized signal improves the quality of the synthesized signal more efficiently.

Type: Application

Filed: February 16, 2010

Publication date: December 8, 2011

Inventors: Mi-Suk Lee, Hyun-Joo Bae, Byung-Sun Lee
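
The sub-band selection steps in the adaptive sinusoidal coding abstract above (divide into sub-bands, compute each sub-band's energy, pick a predetermined number with the largest energy) can be sketched as follows. The function names and the equal-width band split are assumptions. The `peak_per_band` helper is only a simplified stand-in for sinusoidal coding: it keeps the single largest-magnitude coefficient of each selected band and is not the coding scheme of the application.

```python
def select_subbands(spectrum, num_bands, num_selected):
    """Return the indices of the num_selected highest-energy sub-bands, ascending."""
    size = len(spectrum) // num_bands
    energies = [
        sum(c * c for c in spectrum[i * size:(i + 1) * size])
        for i in range(num_bands)
    ]
    ranked = sorted(range(num_bands), key=lambda i: energies[i], reverse=True)
    return sorted(ranked[:num_selected])


def peak_per_band(spectrum, num_bands, selected):
    """For each selected band, return (coefficient index, amplitude) of its peak."""
    size = len(spectrum) // num_bands
    peaks = []
    for band in selected:
        start = band * size
        idx = max(range(start, start + size), key=lambda k: abs(spectrum[k]))
        peaks.append((idx, spectrum[idx]))
    return peaks
```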
System-Initiated Speech Interaction

Publication number: 20110301958

Abstract: Whenever an event occurs on a computing system which will accept a response from a user of the system, the system automatically determines whether or not to enable speech interaction with the system for the event response. Whenever speech interaction is enabled with the system for the event response, the system provides a notification to the user which informs the user of the event and their options for responding thereto, where these options include responding verbally. Whenever the user responds within a prescribed period of time via a voice command (VC), the system attempts to recognize the VC. Whenever the VC is successfully recognized, the system responds appropriately to the VC.

Type: Application

Filed: June 4, 2010

Publication date: December 8, 2011

Applicant: MICROSOFT CORPORATION

Inventors: Alice Jane Bernheim Brush, Paul Johns, Jen Anderson, Connie Missimer, Seung Yang, Jean Ku
