Miscellaneous Analysis Or Detection Of Speech Characteristics (EPO) Patents (Class 704/E11.001)
-
Publication number: 20090216538
Abstract: A computer implemented method facilitates a user interaction via a speech-based user interface. The method acquires spoken input from a user in the form of a phrase of one or more words. It further determines, using a plurality of different domains, whether the phrase is a query or a command. If the phrase is a query, the method retrieves and presents relevant items from a plurality of databases. If the phrase is a command, the method performs an operation.
Type: Application
Filed: February 25, 2008
Publication date: August 27, 2009
Inventors: Garrett Weinberg, Bhiksha Ramakrishnan, Bent Schmidt-Nielsen, Bret A. Harsham
-
Publication number: 20090204395
Abstract: A strained-rough-voice conversion unit (10) is included in a voice conversion device that can generate a “strained rough” voice produced in a part of a speech when speaking forcefully with excitement, nervousness, anger, or emphasis and thereby richly express vocal expression such as anger, excitement, or an animated or lively way of speaking, using voice quality change. The strained-rough-voice conversion unit (10) includes: a strained phoneme position designation unit (11) designating a phoneme to be uttered as a “strained rough” voice in a speech; and an amplitude modulation unit (14) performing modulation including periodic amplitude fluctuation on a speech waveform.
Type: Application
Filed: January 22, 2008
Publication date: August 13, 2009
Inventors: Yumiko Kato, Takahiro Kamai
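The periodic amplitude fluctuation that the abstract above describes can be sketched as a simple raised-cosine modulator. This is only an illustrative reading of the claim: the modulation frequency, depth, and function name below are assumptions, not values from the patent.

```python
import math

def strained_rough(samples, sample_rate, mod_freq=80.0, depth=0.4):
    """Apply periodic amplitude fluctuation to a span of speech samples.

    mod_freq (Hz) and depth (0..1) are illustrative defaults, not
    values taken from the patent text."""
    out = []
    for n, s in enumerate(samples):
        # raised-cosine gain that dips by `depth` once per modulation cycle
        gain = 1.0 - depth * 0.5 * (1.0 - math.cos(2.0 * math.pi * mod_freq * n / sample_rate))
        out.append(s * gain)
    return out
```

In the patent's scheme such modulation would be applied only to the phonemes picked out by the strained phoneme position designation unit, not to the whole waveform.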
-
Publication number: 20090199101
Abstract: A system (20) for inputting graphical data into a graphical input field includes a graphical input device (22) for inputting the graphical data into the graphical input field, and a processor-executable voice-form module (28) responsive to an initial presentation of graphical data to the graphical input device. The voice-form module (28) causes a determination of whether the inputting of the graphical data into the graphical input field is complete. A method for inputting graphical data into a graphical input field includes initiating an input of graphical data via a graphical input device into the graphical input field, and actuating a voice-form module in response to initiating the input of graphical data into the graphical input field.
Type: Application
Filed: January 30, 2009
Publication date: August 6, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Charles W. Cross, Jr., David Jaramillo, Marc White
-
Publication number: 20090192798
Abstract: A method for task execution improvement, the method includes: generating a baseline model for executing a task; recording a user executing the task; comparing the baseline model to the user's execution of the task; and providing feedback to the user based on the differences between the user's execution and the baseline model.
Type: Application
Filed: January 25, 2008
Publication date: July 30, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Sara H. Basson, Dimitri Kanevsky, Edward E. Kelley, Bhuvana Ramabhadran
-
Publication number: 20090192799
Abstract: Speech enhancement in a breathing apparatus is provided using a primary sensor mounted near a breathing mask user's mouth, at least one reference sensor mounted near a noise source, and a processor that combines the signals from these sensors to produce an output signal with an enhanced speech component. The reference sensor signal may be filtered and the result may be subtracted from the primary sensor signal to produce the output signal with an enhanced speech component. A method for detecting the exclusive presence of a low air alarm noise may be used to determine when to update the filter. A triple filter adaptive noise cancellation method may provide improved performance through reduction of filter maladaptation. The speech enhancement techniques may be employed as part of a communication system or a speech recognition system.
Type: Application
Filed: January 29, 2008
Publication date: July 30, 2009
Applicant: Digital Voice Systems, Inc.
Inventors: Daniel W. Griffin, John C. Hardwick
-
Publication number: 20090187399
Abstract: The invention allows phonetic text input without any knowledge of phonetics. As an assist to the user of computer text entry systems, the invention makes possible an alternative method of Chinese character entry by entering a Chinese character assumed by the user to be a homophone of the character the user desires to enter. Entry methods for such homophone alternative entry include non-phonetic entry of Chinese characters using keyboard stroke input and single stroke, cursive and semi-cursive entry on an electronic surface. Direct correction of some misspellings of Chinese characters during phonetic entry also is made possible. The invention is not only helpful for entry of difficult Chinese characters but also provides an approach to the use of supplementing input methods for most if not all written languages.
Type: Application
Filed: January 22, 2008
Publication date: July 23, 2009
Inventor: Robert B. O'Dell
-
Publication number: 20090164220
Abstract: A sound recording and playback apparatus and associated method, comprising: an audio storage medium; a microphone; a speaker; and a plurality of direct message access buttons, each direct message access button simultaneously associated both with a particular pre-recorded sound sequence stored in the storage medium, and with a particular new sound sequence capable of being recorded into the storage medium; wherein: when a particular direct message access button is depressed in a manner which respectively designates pre-recorded playback, new sound sequence recording, or new sound sequence playback, the particular pre-recorded sound sequence associated with the particular direct message access button is respectively audibly played over the speaker or recorded into the storage medium, as appropriate.
Type: Application
Filed: February 23, 2009
Publication date: June 25, 2009
Inventor: Howard M. Katz
-
Publication number: 20090144131
Abstract: In an apparatus that creates and distributes voice advertisements to end users, the apparatus includes a voice advertising portal and a network coupled to the voice advertising portal. A voice application server determines which portions of a voice advertisement will be cached locally at the voice advertising portal for subsequent local retrieval during user interaction with the voice advertising portal.
Type: Application
Filed: July 28, 2008
Publication date: June 4, 2009
Inventors: Leo Chiu, Donald R. Steul, Arumugam Appadurai
-
Publication number: 20090138269
Abstract: A method for enabling voice driven interactions among multiple interactive voice response (IVR) systems begins by receiving a telephone call from a user of a first IVR system to begin a transaction, and automatically contacting, by the first IVR system, at least one additional IVR system. Specifically, the contacting of the additional IVR system includes assigning tasks to the additional IVR system. The tasks require input from the user, and the additional IVR system is secure and separate from the first IVR system. Moreover, the tasks can include a transfer of currency and a transfer of local information.
Type: Application
Filed: November 28, 2007
Publication date: May 28, 2009
Inventors: Sheetal K. Agarwal, Dipanjan Chakraborty, Arun Kumar, Amit A. Nanavati, Nitendra Rajput
-
Publication number: 20090132256
Abstract: A first communication path for receiving a communication is established. The communication includes speech, which is processed. A speech pattern is identified as including a voice-command. A portion of the speech pattern is determined as including the voice-command. That portion of the speech pattern is separated from the speech pattern and compared with a second speech pattern. If the two speech patterns match or resemble each other, the portion of the speech pattern is accepted as the voice-command. An operation corresponding to the voice-command is determined and performed. The operation may perform an operation on a remote device, forward the voice-command to a remote device, or notify a user. The operation may create a second communication path that may allow a headset to join in a communication between another headset and a communication device, several headsets to communicate with each other, or a headset to communicate with several communication devices.
Type: Application
Filed: January 25, 2008
Publication date: May 21, 2009
Applicant: Embarq Holdings Company, LLC
Inventors: Erik Geldbach, Kelsyn D. Rooks, Sr., Shane M. Smith, Mark Wilmoth
-
Publication number: 20090119092
Abstract: A language package system that prevents undesirable behaviors resulting from an incompatibility between a core package of a software product and its language packages is provided. The language package system executes when a user starts the execution of the core package on a computing device. The language package system retrieves a language package version number from the core package that indicates the version number of compatible language packages and an indication of the preferred language of the user. The language package system then determines whether the computing device has a compatible language package that is available. When the computing device has a compatible language package, the software product uses that language package. When the computing device has no compatible language package, the language package system then performs processing that factors in the unavailability of a compatible language package.
Type: Application
Filed: November 1, 2007
Publication date: May 7, 2009
Applicant: Microsoft Corporation
Inventors: Balaji Balasubramanyan, Dmitri Davydok
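The selection step described above reduces to matching installed language-package versions against the version number the core package declares. A minimal sketch, assuming exact-match compatibility and a simple language-to-version map (both assumptions; function and field names are hypothetical):

```python
def pick_language_package(required_version, preferred_lang, installed):
    """Return the language whose installed package version matches the
    core package's required version, preferring the user's language;
    return None when no compatible package exists (the fallback path).

    `installed` maps language code -> installed package version."""
    # prefer the user's language when its package is compatible
    if installed.get(preferred_lang) == required_version:
        return preferred_lang
    # otherwise fall back to any compatible language (sorted for determinism)
    for lang, version in sorted(installed.items()):
        if version == required_version:
            return lang
    return None
```

A `None` result corresponds to the patent's "processing that factors in the unavailability of a compatible language package".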
-
Publication number: 20090112578
Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate compound text input. The device is able to assemble language objects in the memory to generate compound language solutions. The device is able to prioritize compound language solutions according to various criteria, including the degree of completeness of the text components of a compound language solution.
Type: Application
Filed: December 30, 2008
Publication date: April 30, 2009
Inventors: Vadim Fux, Michael G. Elizarov
-
Publication number: 20090112601
Abstract: A human interface device for assisting the verbally challenged to record custom messages and play back the custom and pre-recorded messages through a sequence of simple finger movements. A data glove containing Hall effect and bend resistor sensors is worn by the user and connected to the Voice Module. The data glove is designed to capture and translate the sequence of finger movements into actions, then transmits the actions to the Voice Module. When a pause is sensed in the actions, the Voice Module links these actions to the custom or pre-recorded messages. These messages are then played on the Voice Module, allowing people in close proximity to hear. A Remote Voice Monitor may also be wirelessly connected to the Voice Module to allow remote monitoring.
Type: Application
Filed: October 25, 2007
Publication date: April 30, 2009
Inventor: Larry Don Fullmer
-
Publication number: 20090099848
Abstract: The present invention is an innovative system and method for passive diagnosis of dementias. The disclosed invention enables early diagnosis of, and assessments of the efficacy of medications for, neural disorders which are characterized by progressive linguistic decline and circadian speech-rhythm disturbances. Clinical and psychometric indicators of dementias are automatically identified by longitudinal statistical measurements that track the nature of language change and/or changes in patient audio features using mathematical methods. According to embodiments of the present invention, the disclosed system and method include multi-layer processing units wherein initial processing of the recorded audio data is performed in a local unit. Processed and required raw data is also transferred to a central unit which performs in-depth analysis of the audio data.
Type: Application
Filed: October 16, 2007
Publication date: April 16, 2009
Inventors: Moshe Lerner, Ofer Bahar
-
Publication number: 20090089066
Abstract: A system and method for automatic user training in speech-to-speech translation includes integrating an automatic user response system configured to be responsive to a plurality of training items and selecting a training item from the plurality of training items. For the selected training item, in response to an utterance in a first language, the utterance is translated into a second language, and a response to the utterance in the second language is generated. A simulated action corresponding with the response in accordance with a user speaking the second language is also generated. The response and simulated action are output as a learning exercise for learning operations of the automatic user response system.
Type: Application
Filed: October 2, 2007
Publication date: April 2, 2009
Inventors: Yuqing Gao, Liang Gu, Wei Zhang
-
Publication number: 20090089062
Abstract: A public speaking self-evaluation tool that helps a user practice public speaking in terms of avoiding undesirable words or sounds, maintaining a desirable speech rhythm, and ensuring that the user is regularly glancing at the audience. The system provides a user interface through which the user is able to define the undesirable words or sounds that are to be avoided, as well as a maximum frequency of occurrence threshold to be used for providing warning signals based on detection of such filler or undesirable words or sounds. The user interface allows a user to define a speech rhythm, e.g. in terms of spoken syllables per minute, that is another maximum threshold for providing a visual warning indication. The disclosed system also provides a visual indication when the user fails to glance at the audience at least as often as defined by a predefined minimum threshold.
Type: Application
Filed: October 1, 2007
Publication date: April 2, 2009
Inventor: Fang Lu
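The filler-word warning described above is a frequency-of-occurrence threshold check. A minimal sketch, assuming a word-level transcript and a per-minute rate limit (the function name and signature are hypothetical):

```python
def filler_rate_exceeded(words, fillers, max_per_minute, duration_minutes):
    """Return True when the rate of user-defined filler words exceeds
    the user-defined maximum frequency threshold.

    words: transcript tokens; fillers: set of undesirable words;
    max_per_minute: threshold; duration_minutes: speech length."""
    count = sum(1 for w in words if w.lower() in fillers)
    return count / duration_minutes > max_per_minute
```

The syllables-per-minute rhythm warning in the same patent would follow the identical pattern, with syllable count in place of filler count.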
-
Publication number: 20090089050
Abstract: A device and a method for frame loss concealment are disclosed. A pitch period of a current lost frame is obtained on the basis of a pitch period of the last good frame before the current lost frame. An excitation signal of the current lost frame is recovered on the basis of the pitch period of the current lost frame and an excitation signal of the last good frame before the lost frame. Thereby, the hearing contrast of a receiver is reduced, and the quality of speech is improved. Further, in the present invention, a pitch period of continual lost frames is adjusted on the basis of the change trend of the pitch period of the last good frame before the lost frame. Therefore, a buzz effect produced by the continual lost frames is avoided, and the quality of speech is further improved.
Type: Application
Filed: December 8, 2008
Publication date: April 2, 2009
Applicant: Huawei Technologies Co., Ltd.
Inventors: Yunneng Mo, Yulong Li, Fanrong Tang
-
Publication number: 20090083029
Abstract: A word coinciding with a key word input by speech and a word related to the word are set as retrieval candidate words based on a word dictionary in which words representing formal names and aliases of the formal names are registered in association with a family attribute indicating a familial relation among the words. Content related to any one of retrieval words selected out of the retrieval candidate words, and a word related to the retrieval word, is retrieved.
Type: Application
Filed: February 29, 2008
Publication date: March 26, 2009
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Miwako Doi, Kaoru Suzuki, Toshiyuki Koga, Koichi Yamamoto
-
Publication number: 20090076827
Abstract: A system for controlling or operating a plurality of target systems via spoken commands is provided. The system includes a first plurality of target systems, a second plurality of controllers for controlling or operating target systems via spoken commands, and a speech recognition system that stores interface information that is specific to a target system or a group of target systems that are to be controlled or operated. A first controller in the second plurality of controllers includes a microphone for picking up audible signals in the vicinity of the first controller and a device for transmitting the audible signals to a speech recognition system. The speech recognition system is operable to analyze the interface information to recognize spoken commands issued for controlling or operating said target system.
Type: Application
Filed: September 11, 2008
Publication date: March 19, 2009
Inventors: Clemens Bulitta, Robert Kagermeier, Dietmar Sierk
-
Publication number: 20090076806
Abstract: A sound processor including a microphone (1), a pre-amplifier (2), a bank of N parallel filters (3), means for detecting short-duration transitions in the envelope signal of each filter channel, and means for applying gain to the outputs of these filter channels in which the gain is related to a function of the second-order derivative of the slow-varying envelope signal in each filter channel, to assist in perception of low-intensity short-duration speech features in said signal.
Type: Application
Filed: October 28, 2008
Publication date: March 19, 2009
Inventors: Andrew E. Vandali, Graeme M. Clark
-
Publication number: 20090076808
Abstract: A method for performing frame erasure concealment for a higher-band signal involves calculating a periodic intensity of the higher-band signal with respect to pitch period information of a lower-band signal, and comparing the periodic intensity to a preconfigured threshold. If the periodic intensity is greater than or equal to the preconfigured threshold, the frame erasure concealment is performed with a pitch period repetition based method; if the periodic intensity is less than the preconfigured threshold, the frame erasure concealment is performed with a previous frame data repetition based method. A device for performing a frame erasure concealment includes a periodic intensity calculation module, a pitch period repetition module, and a previous frame data repetition module.
Type: Application
Filed: November 18, 2008
Publication date: March 19, 2009
Applicant: Huawei Technologies Co., Ltd.
Inventors: Jianfeng Xu, Lei Miao, Chen Hu, Qing Zhang, Lijing Xu, Wei Li, Zhengzhong Du, Yi Yang, Fengyan Qi, Wuzhou Zhan, Dongqi Wang
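The decision rule above can be sketched directly. The patent does not fix the periodicity measure or the threshold, so the normalized autocorrelation and the 0.5 threshold below are assumptions chosen for illustration:

```python
def periodic_intensity(signal, pitch_lag):
    """Normalized autocorrelation of the signal at the lower-band pitch
    lag — one plausible periodicity measure (an assumption)."""
    n = len(signal) - pitch_lag
    num = sum(signal[i] * signal[i + pitch_lag] for i in range(n))
    den = sum(x * x for x in signal[pitch_lag:]) or 1e-9
    return num / den

def conceal_frame(history, pitch_lag, frame_len, threshold=0.5):
    """Choose between pitch-period repetition and previous-frame
    repetition, as the abstract describes; 0.5 is a placeholder."""
    if periodic_intensity(history, pitch_lag) >= threshold:
        cycle = history[-pitch_lag:]                   # last pitch cycle
        return [cycle[i % pitch_lag] for i in range(frame_len)]
    return history[-frame_len:]                        # repeat previous frame
```

For a strongly periodic higher band, repeating the last pitch cycle preserves the harmonic structure; for a noise-like band, repeating the previous frame avoids imposing a false pitch.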
-
Publication number: 20090063139
Abstract: For determining a long-term-prediction delay parameter characterizing a long term prediction in a technique using signal modification for digitally encoding a sound signal, the sound signal is divided into a series of successive frames, a feature of the sound signal is located in a previous frame, a corresponding feature of the sound signal is located in a current frame, and the long-term-prediction delay parameter is determined for the current frame while mapping, with the long term prediction, the signal feature of the previous frame with the corresponding signal feature of the current frame. In a signal modification method for implementation into a technique for digitally encoding a sound signal, the sound signal is divided into a series of successive frames, each frame of the sound signal is partitioned into a plurality of signal segments, and at least a part of the signal segments of the frame are warped while constraining the warped signal segments inside the frame.
Type: Application
Filed: October 21, 2008
Publication date: March 5, 2009
Inventors: Mikko Tammi, Milan Jelinek, Claude LaFlamme, Vesa Ruoppila
-
Publication number: 20090058611
Abstract: A wearable device is worn by a person participating in an event in which a plurality of other people are participating and wearing other wearable devices. The wearable device includes a request unit for transmitting a request signal to other wearable devices that are in a predetermined range, and receiving a response to the request signal from each of the other wearable devices, and a communication unit for determining, with use of the received responses, one or more of the other wearable devices to be a communication partner, and performing data communication with the determined one or more other wearable devices. The data received in the communication is data collected by the one or more other wearable devices determined to be communication partners, and the data is used as a profile component when creating a profile of the event.
Type: Application
Filed: February 21, 2007
Publication date: March 5, 2009
Inventors: Takashi Kawamura, Masayuki Misaki, Ryouichi Kawanishi, Masaki Yamauchi
-
Publication number: 20090063138
Abstract: Methods, digital systems, and computer readable media are provided for determining a predominant fundamental frequency of a frame of an audio signal by finding a maximum absolute signal value in history data for the frame, determining a number of bits for downshifting based on the maximum absolute signal value, computing autocorrelations for the frame using signal values downshifted by the number of bits, and determining the predominant fundamental frequency using the computed autocorrelations.
Type: Application
Filed: August 4, 2008
Publication date: March 5, 2009
Inventors: Atsuhiro Sakurai, Steven David Trautmann
-
Publication number: 20090063125
Abstract: Techniques are provided for globalizing handling of service management items. The techniques include obtaining a service management item in a language convenient to a first of two or more actors, translating the service management item into a language-neutral format to obtain a language-neutral service management item, applying one or more annotators to the service management item, translating the language-neutral service management item into a language convenient to a second of two or more actors acting on the service management item, and routing the translated service management item to the second of two or more actors. Techniques are also provided for generating a database of service management items in a language-neutral format.
Type: Application
Filed: August 28, 2007
Publication date: March 5, 2009
Applicant: International Business Machines Corporation
Inventors: Alexander Faisman, Genady Grabarnik, Jonathan Lenchner, Larisa Shwartz
-
Publication number: 20090055168
Abstract: Methods, systems, and apparatus, including computer program products, are provided in which data from web documents are partitioned into a training corpus and a development corpus. First word probabilities for words are determined for the training corpus, and second word probabilities for the words are determined for the development corpus. Uncertainty values based on the word probabilities for the training corpus and the development corpus are compared, and new words are identified based on the comparison.
Type: Application
Filed: August 23, 2007
Publication date: February 26, 2009
Applicant: GOOGLE INC.
Inventors: Jun Wu, Tang Xi Liu, Feng Hong, Yonggang Wang, Bo Yang, Lei Zhang
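One way to read the comparison above is as a log-probability gap between the two corpora: a word whose development-corpus probability is far above its training-corpus probability is a new-word candidate. The criterion, threshold, and smoothing below are assumptions for illustration, not the patent's actual uncertainty measure:

```python
import math

def new_word_candidates(train_counts, dev_counts, threshold=1.0):
    """Flag words whose development-corpus probability exceeds their
    training-corpus probability by more than `threshold` in log-space.
    The smoothing constant 0.5 for unseen words is an assumption."""
    t_total = sum(train_counts.values()) or 1
    d_total = sum(dev_counts.values()) or 1
    flagged = []
    for word, dev_count in dev_counts.items():
        p_dev = dev_count / d_total
        p_train = train_counts.get(word, 0.5) / t_total  # smoothed
        if math.log(p_dev / p_train) > threshold:
            flagged.append(word)
    return flagged
```

Words common to both corpora score near zero; words frequent only in the newer development corpus stand out sharply.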
-
Publication number: 20090043569
Abstract: There is provided a pitch lag predictor for use by a speech decoder to generate a predicted pitch lag parameter. The pitch lag predictor comprises a summation calculator configured to generate a first summation based on a plurality of previous pitch lag parameters, and a second summation based on a plurality of previous pitch lag parameters and a position of each of the plurality of previous pitch lag parameters with respect to the predicted pitch lag parameter; a coefficient calculator configured to generate a first coefficient using a first equation based on the first summation and the second summation, and a second coefficient using a second equation based on the first summation and the second summation, wherein the first equation is different than the second equation; and a predictor configured to generate the predicted pitch lag parameter based on the first coefficient and the second coefficient.
Type: Application
Filed: October 8, 2008
Publication date: February 12, 2009
Inventor: Yang Gao
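The two summations and two coefficients described above match the shape of a least-squares line fit over the previous pitch lags, extrapolated one position ahead. That reading is an assumption; the patent's actual equations may differ:

```python
def predict_pitch_lag(prev_lags):
    """Fit a line to the previous pitch lags (ordered oldest to newest,
    at positions 1..n) and extrapolate to position n+1 — one plausible
    realization of the two-summation / two-coefficient scheme."""
    n = len(prev_lags)
    positions = range(1, n + 1)
    s1 = sum(prev_lags)                                        # first summation
    s2 = sum(p * lag for p, lag in zip(positions, prev_lags))  # second summation
    sp = n * (n + 1) // 2                  # sum of positions
    spp = n * (n + 1) * (2 * n + 1) // 6   # sum of squared positions
    # slope and intercept from the normal equations (two different formulas)
    slope = (n * s2 - sp * s1) / (n * spp - sp * sp)
    intercept = (s1 - slope * sp) / n
    return intercept + slope * (n + 1)
```

A steadily gliding pitch extrapolates along its trend, while a constant pitch predicts itself, which is the behavior a decoder wants when a lag parameter is missing.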
-
Publication number: 20090043568
Abstract: An accent type is determined by outputting mora synchronized signals; extracting a pitch pattern, which is a variation pattern of voice height (fundamental frequency), from a speech signal entered by a user; generating a mora synchronized pattern from the pitch pattern and the mora synchronized signals; storing typical patterns for the respective accent types; collating the mora synchronized pattern with a reference accent pattern; calculating the matching of the mora synchronized pattern with respect to the respective accent types; and determining the accent type by referring to the matching.
Type: Application
Filed: February 20, 2008
Publication date: February 12, 2009
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventor: Takehiko Kagoshima
-
Publication number: 20090030694
Abstract: A gift card is provided which integrally combines a voice storage/playback unit with a stored value card on two separate portions of a base substrate, with the two portions separated by a releasable connection portion. Upon purchasing the gift card, a gift giver records a personal voice message and provides the gift card to the recipient. Upon receiving the gift card, a recipient may immediately play back the recorded personal voice message through simple manipulation of the card. Thereafter, the two parts of the card may be separated by manipulation of the connection portion, and the stored value portion may be used in the manner of conventional stored value cards. The voice storage/playback portion may be stored for safekeeping and played back by the gift recipient at will.
Type: Application
Filed: July 10, 2008
Publication date: January 29, 2009
Applicant: Voice Express Corporation
Inventor: Geoffrey S. Stern
-
Publication number: 20090024394
Abstract: A CPU of a speech ECU acquires vehicle position information. If it is determined from the position information and map data stored in a memory that the vehicle has moved between areas where different languages are spoken as dialects or official languages, the CPU determines a language corresponding to the vehicle position information and transmits a request signal to a speech information center to transmit speech information in the language. By receiving the speech information from the speech information center, the CPU updates speech information pre-stored in the memory with the speech information transmitted from the speech information center.
Type: Application
Filed: June 30, 2008
Publication date: January 22, 2009
Applicant: DENSO CORPORATION
Inventors: Kazuhiro Nakashima, Toshio Shimomura, Kenichi Ogino, Kentaro Teshima
-
Publication number: 20090018841
Abstract: A method and apparatus allow for remotely playing back recorded personalized and non-personalized audio messages to a listener. A first recorded message, which has been personalized for an intended listener, is stored in a first memory of an audio player. A second recorded message, which has not been personalized for an intended listener, is stored in a second memory of the audio player. Responsive to receiving a control command from a remote control device, the audio player plays the messages according to a predetermined arrangement, such as in a predetermined order or at predetermined intervals (e.g., the personalized message may be played a number of times before the non-personalized message is played). In one embodiment, the personalized message contains audio intended to be a calming and/or instructional influence on a small child. In another embodiment, the non-personalized message may include information associated with a sponsoring business.
Type: Application
Filed: July 8, 2008
Publication date: January 15, 2009
Inventors: Marshall T. Leeds, Susan M. Camacho, Steven C. Jacobs
-
Publication number: 20090015567
Abstract: A standalone real-time device to process handwritten text for further applications. The system includes a means for making visible markings on a writing surface, accompanied by a motion detector for detecting the handwritten text. It also comprises a microprocessor for storing appropriate data and commands, an enhanced memory to provide storage space for information and data, and a power supply. The system further includes a display to provide visual feedback of processed data, an audio reproduction device to provide audio feedback, and wired or wireless communication means to transmit data to targeted devices via a transmission link in real time.
Type: Application
Filed: March 20, 2007
Publication date: January 15, 2009
Inventors: Marwa Abdelbaki, Firas Zeineddine
-
Publication number: 20090012794
Abstract: System for giving intelligibility feedback to a speaker (1) speaking to an audience (2), comprising a first microphone (3) at the speaker's side and a second microphone (4) at the audience's side. Both microphones are connected to processing means (5) arranged to compute an intelligibility value based on both microphones' signals. Signalling means (6), preferably at the side of the audience, are arranged to generate an intelligibility feedback signal depending on the calculated intelligibility value, in an optical form visible to the speaker concerned. Wireless connection means (19) may interconnect the microphones, the processing means and the signalling means.
Type: Application
Filed: February 8, 2007
Publication date: January 8, 2009
Applicant: Nederlandse Organisatie voor toegepast-natuurwetenschappelijk Onderzoek TNO
Inventors: Sander Jeroen van Wijngaarden, Jan Adrianus Verhave
-
Publication number: 20090012780
Abstract: In a speech signal decoding method, information containing at least a sound source signal, gain, and filter coefficients is decoded from a received bit stream. Voiced speech and unvoiced speech of a speech signal are identified using the decoded information. Smoothing processing based on the decoded information is performed for at least either one of the decoded gain and decoded filter coefficients in the unvoiced speech. The speech signal is decoded by driving a filter having the decoded filter coefficients by an excitation signal obtained by multiplying the decoded sound source signal by the decoded gain using the result of the smoothing processing. A speech signal decoding apparatus is also disclosed.
Type: Application
Filed: August 27, 2008
Publication date: January 8, 2009
Inventor: Atsushi Murashima
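The smoothing step above, applied only during unvoiced speech, can be sketched as first-order recursive smoothing of the decoded gains. This is one plausible realization under assumed parameters; the patent does not specify the filter form or the constant:

```python
def smooth_gains(gains, voiced_flags, alpha=0.8):
    """Smooth decoded frame gains only in unvoiced segments.

    alpha is an assumed smoothing constant; voiced frames pass their
    gain through unchanged, as the abstract restricts smoothing to
    unvoiced speech."""
    out, prev = [], gains[0]
    for g, voiced in zip(gains, voiced_flags):
        value = g if voiced else alpha * prev + (1.0 - alpha) * g
        out.append(value)
        prev = value
    return out
```

Smoothing unvoiced gains suppresses frame-to-frame gain jitter that would otherwise be audible as fluttering noise, while leaving voiced dynamics intact.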
-
Publication number: 20080319763
Abstract: A dialog manager and spoken dialog service having a dialog manager generated according to a method comprising selecting a top level flow controller based on application type, selecting available reusable subdialogs for each application part, developing a subdialog for each application part not having an available subdialog, and testing and deploying the spoken dialog service using the selected top level flow controller, selected reusable subdialogs and developed subdialogs. The dialog manager is capable of handling context shifts in a spoken dialog with a user. Application dependencies are established in the top level flow controller, thus enabling the subdialogs to be reusable and to be capable of managing context shifts and mixed initiative dialogs.
Type: Application
Filed: August 29, 2008
Publication date: December 25, 2008
Applicant: AT&T Corp.
Inventors: Giuseppe Di Fabbrizio, Charles Alfred Lewis
-
Publication number: 20080319762
Abstract: The present invention discloses a system and a method for creating and editing speech-enabled WIKIs. A WIKI editor can be served to client-side Web browsers so that end-users can utilize WIKI editor functions, which include functions to create and edit speech-enabled WIKI applications. A WIKI server can serve speech-enabled WIKI applications created via the WIKI editor. Each of the speech-enabled WIKI applications can include a link to at least one speech processing engine located in a speech processing system remote from the WIKI server. The speech processing engine can provide a speech processing capability for the speech-enabled WIKI application when served by the WIKI server. In one embodiment, the speech-enabled applications can include an introspection document, an entry collection of documents, and a resource collection of documents in accordance with standards specified by an ATOM PUBLISHING PROTOCOL (APP).
Type: Application
Filed: June 20, 2007
Publication date: December 25, 2008
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: William V. Da Palma, Victor S. Moore, Wendi L. Nusbickel
-
Publication number: 20080319757 Abstract: A system can include a client, a speech for Web 2.0 system, and a speech processing system. The client can access a speech-enabled application using at least one Web 2.0 communication protocol. For example, a standard browser on the client can use a standard protocol to communicate with the speech-enabled application executing on the speech for Web 2.0 system. The speech for Web 2.0 system can access a data store within which user-specific speech parameters are included, and the user of the client is able to configure the speech parameters of the data store. Suitable ones of these speech parameters are utilized whenever the user interacts with the Web 2.0 system. The speech processing system can include one or more speech processing engines and can interact with the speech for Web 2.0 system to handle speech processing tasks associated with the speech-enabled application. Type: Application Filed: June 20, 2007 Publication date: December 25, 2008 Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors: William V. Da Palma, Victor S. Moore, Wendi L. Nusbickel
-
Publication number: 20080319761 Abstract: The present invention discloses a method of performing speech processing operations through Web 2.0 interfaces to speech engines. The method can include a step of interfacing with a Web 2.0 server from a standard browser. A speech-enabled application served by the Web 2.0 server can be accessed. The browser can render markup of the speech-enabled application. Speech input can be received from a user of the browser. A RESTful protocol, such as the ATOM Publishing Protocol (APP), can be utilized to access a remotely located speech engine. The speech engine can accept GET, PUT, POST, and DELETE commands. The speech processing engine can process the speech input and can provide results to the Web 2.0 server. The Web 2.0 server can perform a programmatic action based upon the provided results, which results in different content being presented in the browser. Type: Application Filed: June 20, 2007 Publication date: December 25, 2008 Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors: William V. Da Palma, Victor S. Moore, Wendi L. Nusbickel
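The RESTful pattern this abstract describes, exposing a remote speech engine through the four HTTP verbs, can be sketched as follows. This is an illustrative toy model only: the class, resource names, and payloads are invented for this sketch and do not come from the patent, which describes a networked APP-based interface rather than an in-memory store.

```python
# Hypothetical sketch: mapping the four RESTful verbs the abstract mentions
# (GET, PUT, POST, DELETE) onto speech-engine resources. An in-memory dict
# stands in for the remotely located speech engine's resource store.

class SpeechEngineResource:
    """Toy stand-in for a speech engine accessed via RESTful verbs."""

    def __init__(self):
        self._resources = {}
        self._next_id = 1

    def post(self, payload):
        """POST: create a new resource (e.g. submit audio for recognition)."""
        rid = str(self._next_id)
        self._next_id += 1
        self._resources[rid] = payload
        return rid

    def get(self, rid):
        """GET: retrieve a resource (e.g. poll for a recognition result)."""
        return self._resources.get(rid)

    def put(self, rid, payload):
        """PUT: replace a resource (e.g. update a grammar)."""
        self._resources[rid] = payload

    def delete(self, rid):
        """DELETE: remove a resource when the session is finished."""
        self._resources.pop(rid, None)


engine = SpeechEngineResource()
job = engine.post({"audio": b"\x00\x01", "grammar": "digits"})
result = engine.get(job)
engine.delete(job)
```

In the patent's arrangement the Web 2.0 server would issue these verbs over the network to the speech processing system and act on the returned results.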
-
Publication number: 20080319758 Abstract: The present invention discloses a speech-enabled application that includes two or more linked markup documents that together form a speech-enabled application served by a Web 2.0 server. The linked markup documents can conform to an ATOM PUBLISHING PROTOCOL (APP)-based protocol. Additionally, the linked markup documents can include an entry collection of documents and a resource collection of documents. The resource collection can include at least one speech resource associated with a speech engine disposed in a speech processing system remotely located from the Web 2.0 server. The speech resource can add a speech processing capability to the speech-enabled application. In one embodiment, end users of the speech-enabled application can be permitted to introspect, customize, replace, add, re-order, and remove at least a portion of the linked markup documents. Type: Application Filed: June 20, 2007 Publication date: December 25, 2008 Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors: William V. Da Palma, Victor S. Moore, Wendi L. Nusbickel
-
Publication number: 20080312933 Abstract: A method for interfacing an application server with a resource can include the step of associating a plurality of Enterprise Java Beans (EJBs) with a plurality of resources, where a one-to-one correspondence exists between each EJB and its resource. An application server can receive an application request and can determine a resource for handling the request. The EJB associated with the determined resource can interface the application server to the determined resource, and the request can be handled by the determined resource. Type: Application Filed: August 28, 2008 Publication date: December 18, 2008 Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION Inventors: Thomas E. Creamer, Victor S. Moore, Wendi L. Nusbickel, Ricardo Dos Santos, James J. Sliwa
-
Publication number: 20080312936 Abstract: Provided are an apparatus and method for estimating a voice data value corresponding to a silent period produced in a key resynchronization process, using the sine waveform characteristic of voice, when encrypted digital voice data is transmitted in a one-way wireless communication environment. The apparatus includes a transmitter that generates and transmits a key resynchronization frame containing key resynchronization information and vector information on the voice data, and a receiver that receives the key resynchronization frame from the transmitter, extracts the vector information inserted in the key resynchronization frame, and estimates a voice data value corresponding to the key resynchronization period. Based on the change ratio between slopes calculated from received voice data, it is possible to estimate the voice data corresponding to a silent period, which improves communication quality. Type: Application Filed: March 14, 2008 Publication date: December 18, 2008 Inventors: Taek Jun NAM, Byeong-Ho AHN, Seok RYU, Sang-Yi YI
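The idea of extrapolating missing samples from the slopes of the received samples can be illustrated with a simplified sketch. This is not the patent's exact math: the function name, the use of only the last three samples, and the damping scheme are all assumptions made for the illustration; the patent additionally uses transmitted vector information.

```python
def estimate_gap(samples, gap_len):
    """Estimate voice samples lost during a key-resynchronization period.

    Simplified sketch: extrapolate from the slope of the last received
    samples, letting the slope evolve by the ratio of the two most recent
    slopes (a toy stand-in for the abstract's 'change ratio between
    slopes', which tracks the sine-like shape of voiced speech).
    """
    s1 = samples[-2] - samples[-3]   # slope before last
    s2 = samples[-1] - samples[-2]   # most recent slope
    ratio = s2 / s1 if s1 != 0 else 1.0
    est, last, slope = [], samples[-1], s2
    for _ in range(gap_len):
        slope *= ratio               # slope keeps changing at the same rate
        last += slope
        est.append(last)
    return est
```

For a linear ramp the slope ratio is 1 and the extrapolation simply continues the ramp; for a curving (sine-like) segment the slope shrinks or grows each step.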
-
Publication number: 20080306743 Abstract: A system and method are disclosed for switching contexts within a spoken dialog between a user and a spoken dialog system. The spoken dialog system utilizes modular subdialogs that are invoked by at least one flow controller that is a finite-state model and that is associated with a dialog manager. The spoken dialog system includes a dialog manager with a flow controller and a reusable subdialog module. The method includes, while the spoken dialog is being controlled by the subdialog module that was invoked by the flow controller, receiving context-changing input associated with speech from a user that changes a dialog context, and comparing the context-changing input to at least one context shift. If any of the context shifts are activated by the comparing step, control of the spoken dialog is passed to the flow controller with a context-shift message and a destination state. Type: Application Filed: August 19, 2008 Publication date: December 11, 2008 Applicant: AT&T Corp. Inventors: Giuseppe Di Fabbrizio, Charles Alfred Lewis
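The control hand-off this abstract describes can be sketched in a few lines. Everything here is illustrative: the state names, shift phrases, and the date-collection subdialog are invented, and a real system would match recognized speech rather than exact strings.

```python
# Hypothetical sketch: a flow controller invokes a reusable subdialog; when
# user input matches an active context shift, the subdialog returns a
# (context-shift message, destination state) pair instead of finishing.

CONTEXT_SHIFTS = {"check my balance": "balance_state",
                  "talk to an agent": "agent_state"}

def collect_date_subdialog(user_inputs):
    """Reusable subdialog: collect a date, but yield on any context shift."""
    for text in user_inputs:
        if text in CONTEXT_SHIFTS:            # compare input to context shifts
            return ("context_shift", CONTEXT_SHIFTS[text])
        if text.count("/") == 2:              # toy test for a date utterance
            return ("done", text)
    return ("no_input", None)

class FlowController:
    """Finite-state controller that owns the dialog's state transitions."""

    def __init__(self):
        self.state = "collect_date"

    def run(self, user_inputs):
        outcome, value = collect_date_subdialog(user_inputs)
        if outcome == "context_shift":
            self.state = value                # jump to the destination state
        return outcome, value
```

Because the subdialog only reports the shift and destination, and the flow controller performs the jump, the subdialog stays reusable across applications, which is the point of the design.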
-
Publication number: 20080306739 Abstract: A system capable of separating sound source signals with high precision while improving convergence rate and convergence precision. A process of updating a current separation matrix Wk to a next separation matrix Wk+1, such that the next value J(Wk+1) of a cost function is closer to the minimum value J(W0) than the current value J(Wk), is iteratively performed. The update amount ΔWk of the separation matrix is increased as the current value J(Wk) of the cost function increases, and is decreased as the current gradient ∂J(Wk)/∂W of the cost function becomes steeper. On the basis of input signals x from a plurality of microphones Mi and an optimal separation matrix W0, it is possible to separate sound source signals y (= W0·x) with high precision while improving convergence rate and convergence precision. Type: Application Filed: June 5, 2008 Publication date: December 11, 2008 Applicant: HONDA MOTOR CO., LTD. Inventors: Hirofumi Nakajima, Kazuhiro Nakadai, Yuji Hasegawa, Hiroshi Tsujino
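The adaptive update rule, a step that grows with the current cost and shrinks when the gradient is steep, can be demonstrated on a toy problem. A scalar quadratic cost stands in for the real separation-matrix cost function; the base step size and constants are invented for this sketch.

```python
# Toy illustration of the abstract's update heuristic: step size is
# proportional to the current cost J(w) and inversely proportional to the
# gradient magnitude, so large residual cost speeds convergence while a
# steep gradient keeps the step cautious.

def minimize(w, steps=50, base=0.1, eps=1e-8):
    history = []
    for _ in range(steps):
        cost = (w - 3.0) ** 2                    # J(w), minimized at w0 = 3
        grad = 2.0 * (w - 3.0)                   # dJ/dw
        step = base * cost / (abs(grad) + eps)   # bigger cost -> bigger step,
        w -= step * (1 if grad > 0 else -1)      # steeper gradient -> smaller
        history.append(cost)
    return w, history

w_final, hist = minimize(0.0)   # cost shrinks toward J(w0) = 0
```

For this quadratic the rule reduces the distance to the optimum by a fixed fraction per step; the patent applies the same idea to a full separation matrix driven by multi-microphone input.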
-
Publication number: 20080290987 Abstract: In one embodiment, a method includes receiving a signal from a communication device via a communication channel and determining, based on the signal, a parameter value used for identification of a product of interest. The duration of the receiving is modified when a threshold condition, evaluated against a probability value calculated from the parameter value, is unsatisfied. The probability value is associated with identification of the product of interest. Type: Application Filed: April 22, 2008 Publication date: November 27, 2008 Inventor: Lehmann Li
-
Publication number: 20080288260 Abstract: Provided are an input/output apparatus based on voice recognition, and a method thereof. An object of the apparatus is to improve the user interface by making pointing input and command execution, such as application program control, possible according to a user's voice command, based on voice recognition technology and without an individual pointing input device such as a mouse or a touch pad. The apparatus includes: a voice recognizer for recognizing a voice command inputted from outside; a pointing controller for calculating a pointing location on a screen which corresponds to a voice recognition result transmitted from the voice recognizer; a displayer for displaying a screen; and a command controller for processing diverse commands related to a current pointing location. Type: Application Filed: September 11, 2006 Publication date: November 20, 2008 Inventors: Kwan-Hyun Cho, Mun-Sung Han, Jun-Seok Park, Young-Giu Jung
-
Publication number: 20080288246 Abstract: There is provided a method of using processing circuitry to select a preferential pitch lag value from a plurality of pitch lag values, including a first pitch lag value and a second pitch lag value, for coding an input speech signal. The method comprises determining a first timing relationship between a previous pitch lag value and at least one of the plurality of pitch lag values; determining a second timing relationship between the first pitch lag value and the second pitch lag value; favoring one of the first and second pitch lag values, based on the first and second timing relationships, to select it as the preferential pitch lag value; and converting the input speech signal into encoded speech using the preferential pitch lag value. Type: Application Filed: July 23, 2008 Publication date: November 20, 2008 Applicant: Mindspeed Technologies, Inc. Inventors: Huan-Yu Su, Yang Gao
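One plausible reading of "timing relationship" is continuity with the previous frame's pitch lag, since pitch normally evolves slowly. The sketch below implements only that simplified preference and is not the patent's actual selection rule; the function name and tie-breaking choice are assumptions.

```python
def select_pitch_lag(prev_lag, lag1, lag2):
    """Pick a preferential pitch lag from two candidates.

    Simplified sketch: favor the candidate closer to the previous frame's
    lag (pitch evolves slowly between frames), breaking ties toward the
    first candidate. A stand-in for the patent's timing-relationship logic.
    """
    d1 = abs(lag1 - prev_lag)
    d2 = abs(lag2 - prev_lag)
    return lag1 if d1 <= d2 else lag2
```

Such a preference helps a coder avoid pitch-doubling or pitch-halving errors, where a candidate at twice or half the true lag scores almost as well on the correlation alone.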
-
Publication number: 20080281586 Abstract: A "speech onset detector" provides a variable-length frame buffer in combination with either variable transmission rate or temporal speech compression for buffered signal frames. The variable-length buffer holds frames that are not clearly identified as either speech or non-speech during an initial analysis. Buffering of signal frames continues until a current frame is identified as either speech or non-speech. If the current frame is identified as non-speech, the buffered frames are encoded as non-speech frames. However, if the current frame is identified as a speech frame, the buffered frames are searched for the actual onset point of the speech. Once that onset point is identified, the signal is either transmitted in a burst, or a time-scale modification of the buffered signal is applied to compress the buffered frames beginning with the frame in which the onset point is detected. The compressed frames are then encoded as one or more speech frames. Type: Application Filed: July 28, 2008 Publication date: November 13, 2008 Applicant: MICROSOFT CORPORATION Inventors: Dinei A. Florencio, Philip A. Chou
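The buffering logic can be sketched with a toy energy-based classifier. The thresholds and the onset criterion here are invented for illustration; the patent's classifier and onset search are more sophisticated, and this sketch omits the burst-transmission / time-scale-compression step.

```python
# Sketch of the variable-length buffering the abstract describes. Frames
# whose energy falls between the two thresholds are ambiguous and are held
# until a clearly speech or clearly non-speech frame arrives; on a speech
# decision, the buffer is scanned for the onset point. Thresholds invented.

SPEECH_T, SILENCE_T, ONSET_T = 10.0, 1.0, 3.0

def energy(frame):
    return sum(x * x for x in frame) / len(frame)

def process(frames):
    buffer, out = [], []
    for frame in frames:
        e = energy(frame)
        if SILENCE_T < e < SPEECH_T:        # ambiguous: keep buffering
            buffer.append(frame)
            continue
        if e >= SPEECH_T:                   # speech confirmed: find onset
            onset = next((i for i, f in enumerate(buffer)
                          if energy(f) >= ONSET_T), len(buffer))
            out += [("non-speech", f) for f in buffer[:onset]]
            out += [("speech", f) for f in buffer[onset:]]
            out.append(("speech", frame))
        else:                               # silence: flush as non-speech
            out += [("non-speech", f) for f in buffer]
            out.append(("non-speech", frame))
        buffer = []
    return out
```

Deferring the decision this way lets the encoder label the onset frames as speech even though, taken alone, they looked ambiguous when they arrived.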
-
Publication number: 20080281599 Abstract: A method of processing audio data including: obtaining (202) audio data; analysing (206) the audio data to determine at least one characteristic of the audio data; generating (206) data describing the at least one characteristic of the analysed audio data; and/or modifying (412) an audio recording process based on the at least one characteristic of the analysed audio data. Type: Application Filed: April 15, 2008 Publication date: November 13, 2008 Inventor: Paul Rocca
-
Publication number: 20080275695 Abstract: A method and device for improving coding efficiency in audio coding. From the pitch values of a pitch contour of an audio signal, a plurality of simplified pitch contour segments are generated to approximate the pitch contour, based on one or more pre-selected criteria. The contour segments can be linear or non-linear, with each contour segment represented by a first end point and a second end point. If the contour segments are linear, then only the information regarding the end points, instead of the pitch values, is provided to a decoder for reconstructing the audio signal. A contour segment can have a fixed maximum length or a variable length, but the deviation between a contour segment and the pitch values in that segment is limited to a maximum value. Type: Application Filed: April 25, 2008 Publication date: November 6, 2008 Inventors: Anssi Ramo, Jani Nurminen, Sakari Himanen, Ari Heikkinen
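A piecewise-linear approximation with a bounded deviation, as the abstract describes for the linear case, can be sketched with a greedy segmentation. The greedy strategy and the deviation bound value are assumptions for this sketch; the patent leaves the segmentation criteria open.

```python
def segment_contour(pitch, max_dev=2.0):
    """Greedy piecewise-linear approximation of a pitch contour.

    Each segment is represented only by its two end-point indices; a segment
    is grown while every pitch value inside it stays within max_dev of the
    straight line between the end points. Simplified sketch of the idea.
    """
    segments, start = [], 0
    for end in range(1, len(pitch)):
        ok = True
        for i in range(start + 1, end):  # check interior points vs the line
            t = (i - start) / (end - start)
            interp = pitch[start] + t * (pitch[end] - pitch[start])
            if abs(pitch[i] - interp) > max_dev:
                ok = False
                break
        if not ok:
            segments.append((start, end - 1))  # close at the previous point
            start = end - 1
    segments.append((start, len(pitch) - 1))
    return segments
```

Only the end points of each segment need to be transmitted, so a long, smoothly varying contour collapses to a handful of values, which is the source of the coding-efficiency gain.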
-
Publication number: 20080275697 Abstract: An audio processing apparatus for processing two sampled audio signals to detect the temporal position of one of the audio signals with respect to the other. The apparatus detects audio power characteristics of each signal in respect of successive continuous temporal portions of each of the two signals, the portions having identical lengths and each portion including at least two audio samples, and correlates the detected audio power characteristics of the two audio signals to establish a most likely temporal offset between them. Type: Application Filed: October 27, 2006 Publication date: November 6, 2008 Applicant: SONY UNITED KINGDOM LIMITED Inventors: William Edmund Cranstoun Kentish, Nicolas John Haynes
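The two stages this abstract describes, per-portion power measurement followed by correlation over candidate offsets, can be sketched directly. The function names, frame length, and brute-force search range are choices made for this sketch, not details from the patent.

```python
def frame_power(signal, frame_len):
    """Average power of successive equal-length portions of the signal."""
    return [sum(x * x for x in signal[i:i + frame_len]) / frame_len
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def best_offset(sig_a, sig_b, frame_len, max_shift):
    """Most likely temporal offset (in frames) of sig_b relative to sig_a.

    Simplified sketch: brute-force correlate the two power profiles over
    candidate shifts and return the shift with the highest score.
    """
    pa, pb = frame_power(sig_a, frame_len), frame_power(sig_b, frame_len)
    best, best_score = 0, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        score = sum(pa[i] * pb[i + shift]
                    for i in range(len(pa))
                    if 0 <= i + shift < len(pb))
        if score > best_score:
            best, best_score = shift, score
    return best
```

Correlating coarse power profiles instead of raw samples makes the alignment cheap and robust to sample-level differences (e.g. different codecs) between the two recordings of the same material.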