Miscellaneous Analysis Or Detection Of Speech Characteristics (epo) Patents (Class 704/E11.001)
  • Publication number: 20090216538
    Abstract: A computer-implemented method facilitates user interaction via a speech-based user interface. The method acquires spoken input from a user in the form of a phrase of one or more words. It then determines, using a plurality of different domains, whether the phrase is a query or a command. If the phrase is a query, the method retrieves and presents relevant items from a plurality of databases. If the phrase is a command, the method performs an operation.
    Type: Application
    Filed: February 25, 2008
    Publication date: August 27, 2009
    Inventors: Garrett Weinberg, Bhiksha Ramakrishnan, Bent Schmidt-Nielsen, Bret A. Harsham
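The query-versus-command decision described in the abstract above can be sketched minimally as follows. This is an illustrative keyword-spotting stand-in, not the patent's multi-domain method; the domain names and command words are invented for the example:

```python
def classify_phrase(phrase, domains):
    """Decide whether a spoken phrase is a command or a query.

    domains maps a domain name to a set of command keywords (illustrative).
    A phrase whose first word matches a domain's command set is treated as
    a command for that domain; anything else is treated as a query.
    """
    first = phrase.strip().split()[0].lower()
    for name, commands in domains.items():
        if first in commands:
            return ("command", name)
    return ("query", None)
```

A command result would trigger an operation, while a query result would fan out to the plurality of databases.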
  • Publication number: 20090204395
    Abstract: A strained-rough-voice conversion unit (10) is included in a voice conversion device that can generate a “strained rough” voice produced in a part of a speech when speaking forcefully with excitement, nervousness, anger, or emphasis and thereby richly express vocal expression such as anger, excitement, or an animated or lively way of speaking, using voice quality change. The strained-rough-voice conversion unit (10) includes: a strained phoneme position designation unit (11) designating a phoneme to be uttered as a “strained rough” voice in a speech; and an amplitude modulation unit (14) performing modulation including periodic amplitude fluctuation on a speech waveform.
    Type: Application
    Filed: January 22, 2008
    Publication date: August 13, 2009
    Inventors: Yumiko Kato, Takahiro Kamai
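The amplitude modulation unit (14) in the abstract above performs "modulation including periodic amplitude fluctuation" on the waveform of the designated phoneme. A minimal sketch of that operation, assuming a sinusoidal modulator; the modulation frequency and depth are illustrative values, not taken from the patent:

```python
import math

def strained_rough_voice(samples, sample_rate, mod_freq_hz=80.0, depth=0.4):
    """Apply periodic amplitude fluctuation to the samples of the phoneme
    designated to be uttered as a 'strained rough' voice.

    mod_freq_hz and depth are assumed example values.
    """
    out = []
    for n, s in enumerate(samples):
        modulator = 1.0 + depth * math.sin(
            2.0 * math.pi * mod_freq_hz * n / sample_rate)
        out.append(s * modulator)
    return out
```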
  • Publication number: 20090199101
    Abstract: A system (20) for inputting graphical data into a graphical input field includes a graphical input device (22) for inputting the graphical data into the graphical input field, and a processor-executable voice-form module (28) responsive to an initial presentation of graphical data to the graphical input device. The voice-form module (28) causes a determination of whether the inputting of the graphical data into the graphical input field is complete. A method for inputting graphical data into a graphical input field includes initiating an input of graphical data via a graphical input device into the graphical input field, and actuating a voice-form module in response to initiating the input of graphical data into the graphical input field.
    Type: Application
    Filed: January 30, 2009
    Publication date: August 6, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Charles W. Cross, JR., David Jaramillo, Marc White
  • Publication number: 20090192798
    Abstract: A method for task execution improvement, the method includes: generating a baseline model for executing a task; recording a user executing a task; comparing the baseline model to the user's execution of the task; and providing feedback to the user based on the differences in the user's execution and the baseline model.
    Type: Application
    Filed: January 25, 2008
    Publication date: July 30, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sara H. Basson, Dimitri Kanevsky, Edward E. Kelley, Bhuvana Ramabhadran
  • Publication number: 20090192799
    Abstract: Speech enhancement in a breathing apparatus is provided using a primary sensor mounted near a breathing mask user's mouth, at least one reference sensor mounted near a noise source, and a processor that combines the signals from these sensors to produce an output signal with an enhanced speech component. The reference sensor signal may be filtered and the result may be subtracted from the primary sensor signal to produce the output signal with an enhanced speech component. A method for detecting the exclusive presence of a low air alarm noise may be used to determine when to update the filter. A triple filter adaptive noise cancellation method may provide improved performance through reduction of filter maladaptation. The speech enhancement techniques may be employed as part of a communication system or a speech recognition system.
    Type: Application
    Filed: January 29, 2008
    Publication date: July 30, 2009
    Applicant: Digital Voice Systems, Inc.
    Inventors: Daniel W. Griffin, John C. Hardwick
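The core operation in the abstract above, filtering the reference-sensor signal and subtracting the result from the primary-sensor signal, can be sketched with a plain LMS adaptive filter. This is a generic adaptive-noise-cancellation illustration, not the patent's triple-filter method, and the tap count and step size are assumed values:

```python
def lms_noise_cancel(primary, reference, taps=4, mu=0.02):
    """Subtract an adaptively filtered copy of the reference-sensor
    signal from the primary-sensor signal (LMS sketch).

    taps and mu are illustrative; real systems tune both carefully.
    """
    w = [0.0] * taps          # adaptive filter weights
    buf = [0.0] * taps        # recent reference samples
    out = []
    for p, r in zip(primary, reference):
        buf = [r] + buf[:-1]
        noise_est = sum(wi * bi for wi, bi in zip(w, buf))
        e = p - noise_est     # enhanced-speech output sample
        out.append(e)
        w = [wi + 2 * mu * e * bi for wi, bi in zip(w, buf)]
    return out
```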
  • Publication number: 20090187399
    Abstract: The invention allows phonetic text input without any knowledge of phonetics. As an assist to the user of computer text entry systems, the invention makes possible an alternative method of Chinese character entry by entering a Chinese character assumed by the user to be a homophone of the character the user desires to enter. Entry methods for such homophone alternative entry include non-phonetic entry of Chinese characters using keyboard stroke input and single stroke, cursive and semi-cursive entry on an electronic surface. Direct correction of some misspellings of Chinese characters during phonetic entry also is made possible. The invention is not only helpful for entry of difficult Chinese characters but also provides an approach to the use of supplementing input methods for most if not all written languages.
    Type: Application
    Filed: January 22, 2008
    Publication date: July 23, 2009
    Inventor: Robert B. O'Dell
  • Publication number: 20090164220
    Abstract: A sound recording and playback apparatus and associated method, comprising: an audio storage medium; a microphone; a speaker; and a plurality of direct message access buttons, each direct message access button simultaneously associated both with a particular pre-recorded sound sequence stored in the storage medium, and with a particular new sound sequence capable of being recorded into the storage medium; wherein: when a particular direct message access button is depressed in a manner which designates pre-recorded playback, new sound sequence recording, or new sound sequence playback, the associated pre-recorded or new sound sequence is respectively audibly played over the speaker or recorded into the storage medium, as appropriate.
    Type: Application
    Filed: February 23, 2009
    Publication date: June 25, 2009
    Inventor: Howard M. Katz
  • Publication number: 20090144131
    Abstract: An apparatus that creates and distributes voice advertisements to end users includes a voice advertising portal and a network coupled to the voice advertising portal. A voice application server determines which portions of a voice advertisement will be cached locally at the voice advertising portal for subsequent local retrieval during user interaction with the voice advertising portal.
    Type: Application
    Filed: July 28, 2008
    Publication date: June 4, 2009
    Inventors: Leo Chiu, Donald R. Steul, Arumugam Appadurai
  • Publication number: 20090138269
    Abstract: A method for enabling voice driven interactions among multiple interactive voice response (IVR) systems begins by receiving a telephone call from a user of a first IVR system to begin a transaction; and, automatically contacting, by the first IVR system, at least one additional IVR system. Specifically, the contacting of the additional IVR system includes assigning tasks to the additional IVR system. The tasks require input from the user and the additional IVR system is secure and separate from the first IVR system. Moreover, the tasks can include a transfer of currency and a transfer of local information.
    Type: Application
    Filed: November 28, 2007
    Publication date: May 28, 2009
    Inventors: Sheetal K. Agarwal, Dipanjan Chakraborty, Arun Kumar, Amit A. Nanavati, Nitendra Rajput
  • Publication number: 20090132256
    Abstract: A first communication path for receiving a communication is established. The communication includes speech, which is processed. A speech pattern is identified as including a voice-command. A portion of the speech pattern is determined as including the voice-command. That portion of the speech pattern is separated from the speech pattern and compared with a second speech pattern. If the two speech patterns match or resemble each other, the portion of the speech pattern is accepted as the voice-command. An operation corresponding to the voice-command is determined and performed. The operation may perform an operation on a remote device, forward the voice-command to a remote device, or notify a user. The operation may create a second communication path that may allow a headset to join in a communication between another headset and a communication device, several headsets to communicate with each other, or a headset to communicate with several communication devices.
    Type: Application
    Filed: January 25, 2008
    Publication date: May 21, 2009
    Applicant: Embarq Holdings Company, LLC
    Inventors: Erik Geldbach, Kelsyn D. Rooks, SR., Shane M. Smith, Mark Wilmoth
  • Publication number: 20090119092
    Abstract: A language package system that prevents undesirable behaviors resulting from an incompatibility between a core package of a software product and its language packages is provided. The language package system executes when a user starts the execution of the core package on a computing device. The language package system retrieves a language package version number from the core package that indicates the version number of compatible language packages and an indication of the preferred language of the user. The language package system then determines whether the computing device has a compatible language package that is available. When the computing device has a compatible language package, the software product uses that language package. When the computing device has no compatible language package, the language package system then performs processing that factors in the unavailability of a compatible language package.
    Type: Application
    Filed: November 1, 2007
    Publication date: May 7, 2009
    Applicant: Microsoft Corporation
    Inventors: Balaji Balasubramanyan, Dmitri Davydok
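The compatibility check described in the abstract above can be sketched as follows. The field and function names are illustrative, not Microsoft's actual API; returning None stands in for the fallback processing the abstract mentions:

```python
def resolve_language_pack(core_pack, installed_packs):
    """Pick a language pack compatible with the core package.

    core_pack: {'lang_pack_version': ..., 'preferred_language': ...}
    installed_packs: {language: version} available on the device.
    Prefers the user's language; falls back to any compatible pack;
    None signals that fallback processing is needed.
    Field names are assumptions for this sketch.
    """
    required = core_pack["lang_pack_version"]
    preferred = core_pack["preferred_language"]
    if installed_packs.get(preferred) == required:
        return preferred
    for lang, version in installed_packs.items():
        if version == required:
            return lang
    return None
```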
  • Publication number: 20090112578
    Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate compound text input. The device is able to assemble language objects in the memory to generate compound language solutions. The device is able to prioritize compound language solutions according to various criteria, including the degree of completeness of the text components of a compound language solution.
    Type: Application
    Filed: December 30, 2008
    Publication date: April 30, 2009
    Inventors: Vadim Fux, Michael G. Elizarov
  • Publication number: 20090112601
    Abstract: A human interface device for assisting the verbally challenged to record custom messages and play back the custom and pre-recorded messages through a sequence of simple finger movements. A data glove containing Hall Effect and bend resistor sensors is worn by the user and connected to the Voice Module. The data glove is designed to capture and translate the sequence of finger movements into actions, then transmits the actions to the Voice Module. When a pause is sensed in the actions, the Voice Module links these actions to the custom or pre-recorded messages. These messages are then played on the Voice Module, allowing people in close proximity to hear them. A Remote Voice Monitor may also be wirelessly connected to the Voice Module to allow remote monitoring.
    Type: Application
    Filed: October 25, 2007
    Publication date: April 30, 2009
    Inventor: Larry Don Fullmer
  • Publication number: 20090099848
    Abstract: The present invention is an innovative system and method for passive diagnosis of dementias. The disclosed invention enables early diagnosis of, and assessment of the efficacy of medications for, neural disorders characterized by progressive linguistic decline and circadian speech-rhythm disturbances. Clinical and psychometric indicators of dementias are automatically identified by longitudinal statistical measurements that track the nature of language change and/or changes in patient audio features using mathematical methods. According to embodiments of the present invention, the disclosed system and method include multi-layer processing units wherein initial processing of the recorded audio data is performed in a local unit. Processed and required raw data is also transferred to a central unit which performs in-depth analysis of the audio data.
    Type: Application
    Filed: October 16, 2007
    Publication date: April 16, 2009
    Inventors: Moshe Lerner, Ofer Bahar
  • Publication number: 20090089066
    Abstract: A system and method for automatic user training in speech-to-speech translation includes integrating an automatic user response system configured to be responsive to a plurality of training items and selecting a training item from the plurality of training items. For the selected training item, in response to an utterance in a first language, the utterance is translated into a second language, and a response to the utterance in the second language is generated. A simulated action corresponding with the response in accordance with a user speaking the second language is also generated. The response and simulated action are output as a learning exercise for learning operations of the automatic user response system.
    Type: Application
    Filed: October 2, 2007
    Publication date: April 2, 2009
    Inventors: Yuqing Gao, Liang Gu, Wei Zhang
  • Publication number: 20090089062
    Abstract: A public speaking self-evaluation tool that helps a user practice public speaking in terms of avoiding undesirable words or sounds, maintaining a desirable speech rhythm, and ensuring that the user is regularly glancing at the audience. The system provides a user interface through which the user is able to define the undesirable words or sounds that are to be avoided, as well as a maximum frequency of occurrence threshold to be used for providing warning signals based on detection of such filler or undesirable words or sounds. The user interface allows a user to define a speech rhythm, e.g. in terms of spoken syllables per minute, that is another maximum threshold for providing a visual warning indication. The disclosed system also provides a visual indication when the user fails to glance at the audience at least as often as defined by a predefined minimum threshold.
    Type: Application
    Filed: October 1, 2007
    Publication date: April 2, 2009
    Inventor: Fang Lu
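The threshold checks described in the abstract above reduce to comparing measured rates against user-defined maxima. A minimal sketch, with illustrative default thresholds (the patent leaves these to the user's configuration):

```python
def speaking_warnings(syllables, minutes, filler_counts,
                      max_syllables_per_min=260.0, max_filler_per_min=2.0):
    """Return warning labels when measured speaking rates exceed the
    user-defined maxima.

    filler_counts maps each user-defined filler word/sound to its count.
    Default threshold values are illustrative assumptions.
    """
    warnings = []
    if syllables / minutes > max_syllables_per_min:
        warnings.append("rhythm")
    for word, count in filler_counts.items():
        if count / minutes > max_filler_per_min:
            warnings.append("filler:" + word)
    return warnings
```

A gaze-frequency check against a minimum threshold would follow the same pattern with the inequality reversed.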
  • Publication number: 20090089050
    Abstract: A device and a method for frame lost concealment are disclosed. A pitch period of a current lost frame is obtained on the basis of a pitch period of the last good frame before the current lost frame. An excitation signal of the current lost frame is recovered on the basis of the pitch period of the current lost frame and an excitation signal of the last good frame before the lost frame. Thereby, the hearing contrast of a receiver is reduced, and the quality of speech is improved. Further, in the present invention, a pitch period of continual lost frames is adjusted on the basis of the change trend of the pitch period of the last good frame before the lost frame. Therefore, a buzz effect produced by the continual lost frames is avoided, and the quality of speech is further improved.
    Type: Application
    Filed: December 8, 2008
    Publication date: April 2, 2009
    Applicant: Huawei Technologies Co., Ltd.
    Inventors: Yunneng Mo, Yulong Li, Fanrong Tang
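The excitation recovery in the abstract above, repeating material from the last good frame at the estimated pitch period, can be sketched as follows. This is a minimal illustration of pitch-period repetition, not the patent's exact algorithm; the attenuation factor is an assumed value:

```python
def conceal_lost_frame(last_good_excitation, pitch_period, frame_len,
                       attenuation=0.9):
    """Recover a lost frame's excitation by repeating the last pitch
    cycle of the last good frame, slightly attenuated.

    attenuation is an illustrative value; for runs of consecutive lost
    frames the pitch period would also be adjusted per its recent trend.
    """
    cycle = last_good_excitation[-pitch_period:]
    return [attenuation * cycle[i % pitch_period] for i in range(frame_len)]
```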
  • Publication number: 20090083029
    Abstract: A word coinciding with a key word input by speech, and words related to that word, are set as retrieval candidate words based on a word dictionary in which words representing formal names and aliases of the formal names are registered in association with a family attribute indicating a familiar relation among the words. Content related to a retrieval word selected from the retrieval candidate words, or to a word related to that retrieval word, is then retrieved.
    Type: Application
    Filed: February 29, 2008
    Publication date: March 26, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Miwako Doi, Kaoru Suzuki, Toshiyuki Koga, Koichi Yamamoto
  • Publication number: 20090076827
    Abstract: A system for controlling or operating a plurality of target systems via spoken commands is provided. The system includes a first plurality of target systems, a second plurality of controllers for controlling or operating target systems via spoken commands, a speech recognition system that stores interface information that is specific to a target system or a group of target systems that are to be controlled or operated. A first controller in the second plurality of controllers includes a microphone for picking up audible signals in the vicinity of the first controller and a device for transmitting the audible signals to a speech recognition system. The speech recognition system is operable to analyze the interface information to recognize spoken commands issued for controlling or operating said target system.
    Type: Application
    Filed: September 11, 2008
    Publication date: March 19, 2009
    Inventors: Clemens Bulitta, Robert Kagermeier, Dietmar Sierk
  • Publication number: 20090076806
    Abstract: A sound processor including a microphone (1), a pre-amplifier (2), a bank of N parallel filters (3), means for detecting short-duration transitions in the envelope signal of each filter channel, and means for applying gain to the outputs of these filter channels, in which the gain is related to a function of the second-order derivative of the slow-varying envelope signal in each filter channel, to assist in perception of low-intensity short-duration speech features in said signal.
    Type: Application
    Filed: October 28, 2008
    Publication date: March 19, 2009
    Inventors: Andrew E. Vandali, Graeme M. Clark
  • Publication number: 20090076808
    Abstract: A method for performing frame erasure concealment for a higher-band signal involves calculating a periodic intensity of the higher-band signal with respect to pitch period information of a lower-band signal and comparing the periodic intensity to a preconfigured threshold. If the periodic intensity is greater than or equal to the preconfigured threshold, the frame erasure concealment is performed with a pitch-period-repetition based method; if the periodic intensity is less than the preconfigured threshold, it is performed with a previous-frame data-repetition based method. A device for performing frame erasure concealment includes a periodic intensity calculation module, a pitch period repetition module, and a previous frame data repetition module.
    Type: Application
    Filed: November 18, 2008
    Publication date: March 19, 2009
    Applicant: Huawei Technologies Co., Ltd.
    Inventors: Jianfeng Xu, Lei Miao, Chen Hu, Qing Zhang, Lijing Xu, Wei Li, Zhengzhong Du, Yi Yang, Fengyan Qi, Wuzhou Zhan, Dongqi Wang
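The threshold decision in the abstract above can be sketched with a normalized autocorrelation at the lower-band pitch lag serving as the periodic intensity. This is an illustrative measure, not necessarily the patent's formula, and the threshold value is assumed:

```python
import math

def periodic_intensity(x, lag):
    """Normalized autocorrelation of the higher-band signal at the
    pitch lag taken from the lower band (illustrative measure)."""
    num = sum(x[i] * x[i - lag] for i in range(lag, len(x)))
    den = math.sqrt(sum(v * v for v in x[lag:]) *
                    sum(v * v for v in x[:len(x) - lag]))
    return num / den if den else 0.0

def concealment_method(x, lag, threshold=0.5):
    """Choose the concealment method; threshold is an assumed value."""
    if periodic_intensity(x, lag) >= threshold:
        return "pitch_period_repetition"
    return "previous_frame_repetition"
```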
  • Publication number: 20090063139
    Abstract: For determining a long-term-prediction delay parameter characterizing a long term prediction in a technique using signal modification for digitally encoding a sound signal, the sound signal is divided into a series of successive frames, a feature of the sound signal is located in a previous frame, a corresponding feature of the sound signal is located in a current frame, and the long-term-prediction delay parameter is determined for the current frame while mapping, with the long term prediction, the signal feature of the previous frame with the corresponding signal feature of the current frame. In a signal modification method for implementation into a technique for digitally encoding a sound signal, the sound signal is divided into a series of successive frames, each frame of the sound signal is partitioned into a plurality of signal segments, and at least a part of the signal segments of the frame are warped while constraining the warped signal segments inside the frame.
    Type: Application
    Filed: October 21, 2008
    Publication date: March 5, 2009
    Inventors: Mikko Tammi, Milan Jelinek, Claude LaFlamme, Vesa Ruoppila
  • Publication number: 20090058611
    Abstract: A wearable device is worn by a person participating in an event in which a plurality of other people are participating and wearing other wearable devices. The wearable device includes a request unit for transmitting a request signal to other wearable devices that are in a predetermined range, and receiving a response to the request signal from each of the other wearable devices, and a communication unit for determining, with use of the received responses, one or more of the other wearable devices to be a communication partner, and performing data communication with the determined one or more other wearable devices. The data received in the communication is data collected by the one or more other wearable devices determined to be communication partners, and the data is used as a profile component when creating a profile of the event.
    Type: Application
    Filed: February 21, 2007
    Publication date: March 5, 2009
    Inventors: Takashi Kawamura, Masayuki Misaki, Ryouichi Kawanishi, Masaki Yamauchi
  • Publication number: 20090063138
    Abstract: Methods, digital systems, and computer readable media are provided for determining a predominant fundamental frequency of a frame of an audio signal by finding a maximum absolute signal value in history data for the frame, determining a number of bits for downshifting based on the maximum absolute signal value, computing autocorrelations for the frame using signal values downshifted by the number of bits, and determining the predominant fundamental frequency using the computed autocorrelations.
    Type: Application
    Filed: August 4, 2008
    Publication date: March 5, 2009
    Inventors: Atsuhiro Sakurai, Steven David Trautmann
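The downshift-then-autocorrelate pipeline in the abstract above can be sketched as follows. The overflow bound and accumulator width are illustrative assumptions; the patent's exact bit-allocation rule may differ:

```python
import math

def required_downshift(max_abs, frame_len, acc_bits=32):
    """Bits to right-shift samples so that a sum of frame_len products
    of two shifted samples cannot overflow a signed acc_bits accumulator
    (a conservative illustrative bound)."""
    if max_abs == 0:
        return 0
    needed = 2 * max_abs.bit_length() + math.ceil(math.log2(frame_len))
    return max(0, math.ceil((needed - (acc_bits - 1)) / 2))

def predominant_f0(samples, sample_rate, min_lag, max_lag):
    """Estimate the predominant fundamental frequency via the lag that
    maximizes the autocorrelation of the downshifted samples."""
    shift = required_downshift(max(abs(s) for s in samples), len(samples))
    x = [s >> shift for s in samples]
    best_lag, best_r = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        r = sum(x[i] * x[i - lag] for i in range(lag, len(x)))
        if r > best_r:
            best_r, best_lag = r, lag
    return sample_rate / best_lag
```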
  • Publication number: 20090063125
    Abstract: Techniques are provided for globalizing handling of service management items. The techniques include obtaining a service management item in a language convenient to a first of two or more actors, translating the service management item into a language-neutral format to obtain a language-neutral service management item, applying one or more annotators to the service management item, translating the language-neutral service management item into a language convenient to a second of two or more actors acting on the service management item, and routing the translated service management item to the second of two or more actors. Techniques are also provided for generating a database of service management items in a language-neutral format.
    Type: Application
    Filed: August 28, 2007
    Publication date: March 5, 2009
    Applicant: International Business Machines Corporation
    Inventors: Alexander Faisman, Genady Grabarnik, Jonathan Lenchner, Larisa Shwartz
  • Publication number: 20090055168
    Abstract: Methods, systems, and apparatus, including computer program products, are provided in which data from web documents are partitioned into a training corpus and a development corpus. First word probabilities for words are determined for the training corpus, and second word probabilities for the words are determined for the development corpus. Uncertainty values based on the word probabilities for the training corpus and the development corpus are compared, and new words are identified based on the comparison.
    Type: Application
    Filed: August 23, 2007
    Publication date: February 26, 2009
    Applicant: GOOGLE INC.
    Inventors: Jun Wu, Tang Xi Liu, Feng Hong, Yonggang Wang, Bo Yang, Lei Zhang
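The corpus comparison in the abstract above can be sketched by flagging words whose development-corpus probability greatly exceeds their training-corpus probability. This substitutes a simple log-ratio test for the patent's uncertainty-value comparison; the floor and threshold are assumed values:

```python
import math

def unigram_probs(tokens):
    """Maximum-likelihood unigram probabilities over a token list."""
    counts = {}
    for t in tokens:
        counts[t] = counts.get(t, 0) + 1
    total = len(tokens)
    return {w: c / total for w, c in counts.items()}

def new_word_candidates(train_tokens, dev_tokens, floor=1e-6, log_ratio=2.0):
    """Flag words far more probable in the development corpus than in
    the training corpus (log-ratio stand-in for the patent's
    uncertainty comparison; floor and log_ratio are assumptions)."""
    p_train = unigram_probs(train_tokens)
    p_dev = unigram_probs(dev_tokens)
    return sorted(w for w, p in p_dev.items()
                  if math.log(p / p_train.get(w, floor)) >= log_ratio)
```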
  • Publication number: 20090043569
    Abstract: There is provided a pitch lag predictor for use by a speech decoder to generate a predicted pitch lag parameter. The pitch lag predictor comprises a summation calculator configured to generate a first summation based on a plurality of previous pitch lag parameters, and a second summation based on a plurality of previous pitch lag parameters and a position of each of the plurality of previous pitch lag parameters with respect to the predicted pitch lag parameter; a coefficient calculator configured to generate a first coefficient using a first equation based on the first summation and the second summation, and a second coefficient using a second equation based on the first summation and the second summation, wherein the first equation is different than the second equation; and a predictor configured to generate the predicted pitch lag parameter based on the first coefficient and the second coefficient.
    Type: Application
    Filed: October 8, 2008
    Publication date: February 12, 2009
    Inventor: Yang Gao
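The two-coefficient predictor in the abstract above, built from summations over previous pitch lag values and their positions, matches the shape of a least-squares line fit. A sketch under that reading (the patent's exact equations may differ):

```python
def predict_pitch_lag(prev_lags):
    """Predict the next pitch lag by fitting lag = a + b * position
    over the previous lags (positions 0..n-1) and extrapolating to n.

    The two coefficients come from summations over the lag values and
    their positions, mirroring the structure in the abstract.
    """
    n = len(prev_lags)
    sum_x = sum(range(n))
    sum_y = sum(prev_lags)
    sum_xx = sum(x * x for x in range(n))
    sum_xy = sum(x * y for x, y in enumerate(prev_lags))
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x * sum_x)
    a = (sum_y - b * sum_x) / n
    return a + b * n
```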
  • Publication number: 20090043568
    Abstract: An accent type is determined by outputting mora-synchronized signals; extracting a pitch pattern, which is the variation pattern of voice height (fundamental frequency), from a speech signal entered by a user; generating a mora-synchronized pattern from the pitch pattern and the mora-synchronized signals; storing typical patterns for the respective accent types; collating the mora-synchronized pattern with the reference accent patterns; calculating the degree of matching of the mora-synchronized pattern with respect to each accent type; and determining the accent type by referring to the matching results.
    Type: Application
    Filed: February 20, 2008
    Publication date: February 12, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventor: Takehiko Kagoshima
  • Publication number: 20090030694
    Abstract: A gift card is provided which integrally combines a voice storage/playback unit with a stored value card on two separate portions of a base substrate, with the two portions separated by a releasable connection portion. Upon purchasing the gift card, a gift giver records a personal voice message and provides the gift card to the recipient. Upon receiving the gift card, the recipient may immediately play back the recorded personal voice message through simple manipulation of the card. Thereafter, the two parts of the card may be separated by manipulation of the connection portion, and the stored value portion may be used in the manner of conventional stored value cards. The voice storage/playback portion may be stored for safekeeping and played back by the gift recipient at will.
    Type: Application
    Filed: July 10, 2008
    Publication date: January 29, 2009
    Applicant: Voice Express Corporation
    Inventor: Geoffrey S. Stern
  • Publication number: 20090024394
    Abstract: A CPU of a speech ECU acquires vehicle position information. If it is determined from the position information and map data stored in a memory that the vehicle has moved between areas where different languages are spoken as dialects or official languages, the CPU determines a language corresponding to the vehicle position information and transmits a request signal to a speech information center to transmit speech information in the language. By receiving the speech information from the speech information center, the CPU updates speech information pre-stored in the memory with the speech information transmitted from the speech information center.
    Type: Application
    Filed: June 30, 2008
    Publication date: January 22, 2009
    Applicant: DENSO CORPORATION
    Inventors: Kazuhiro Nakashima, Toshio Shimomura, Kenichi Ogino, Kentaro Teshima
  • Publication number: 20090018841
    Abstract: A method and apparatus allow for remotely playing back recorded personalized and non-personalized audio messages to a listener. A first recorded message, which has been personalized for an intended listener, is stored in a first memory of an audio player. A second recorded message, which has not been personalized for an intended listener, is stored in a second memory of the audio player. Responsive to receiving a control command from a remote control device, the audio player plays the messages according to a predetermined arrangement, such as in a predetermined order or at predetermined intervals (e.g., the personalized message may be played a number of times before the non-personalized message is played). In one embodiment, the personalized message contains audio intended to be a calming and/or instructional influence on a small child. In another embodiment, the non-personalized message may include information associated with a sponsoring business.
    Type: Application
    Filed: July 8, 2008
    Publication date: January 15, 2009
    Inventors: Marshall T. Leeds, Susan M. Camacho, Steven C. Jacobs
  • Publication number: 20090015567
    Abstract: A standalone real-time device to process handwritten text for further applications. The system includes a means for making visible markings on a writing surface, accompanied by a motion detector for detecting the handwritten text. It also comprises a microprocessor for storing appropriate data and commands, an enhanced memory to provide storage space for information and data, and a power supply. The system further includes a display to provide visual feedback of processed data, an audio reproduction device to provide audio feedback, and wired or wireless communication means to transmit data to targeted devices via a transmission link in real time.
    Type: Application
    Filed: March 20, 2007
    Publication date: January 15, 2009
    Inventors: Marwa Abdelbaki, Firas Zeineddine
  • Publication number: 20090012794
    Abstract: System for giving intelligibility feedback to a speaker (1) speaking for an audience (2), comprising a first microphone (3) at the speaker's side and a second microphone (4) at the audience's side. Both microphones are connected to processing means (5) which are arranged to compute an intelligibility value based on both microphones' signals. Signalling means (6), preferably at the side of the audience, are arranged to generate an intelligibility feedback signal depending on the calculated intelligibility value; the signalling means generate said intelligibility feedback signal in an optical form, visible to the speaker concerned. Wireless connection means (19) may interconnect the microphones, the processing means and the signalling means.
    Type: Application
    Filed: February 8, 2007
    Publication date: January 8, 2009
    Applicant: Nederlandse Organisatie voor toegepast-natuurwetenschappelijk Onderzoek TNO
    Inventors: Sander Jeroen van Wijngaarden, Jan Adrianus Verhave
  • Publication number: 20090012780
    Abstract: In a speech signal decoding method, information containing at least a sound source signal, gain, and filter coefficients is decoded from a received bit stream. Voiced speech and unvoiced speech of a speech signal are identified using the decoded information. Smoothing processing based on the decoded information is performed for at least either one of the decoded gain and decoded filter coefficients in the unvoiced speech. The speech signal is decoded by driving a filter having the decoded filter coefficients by an excitation signal obtained by multiplying the decoded sound source signal by the decoded gain using the result of the smoothing processing. A speech signal decoding apparatus is also disclosed.
    Type: Application
    Filed: August 27, 2008
    Publication date: January 8, 2009
    Inventor: Atsushi Murashima
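The smoothing of decoded gains in unvoiced segments described in the abstract above can be sketched with a first-order recursive smoother. This is an illustrative form, not the patent's specific smoothing processing; the smoothing coefficient is an assumed value:

```python
def smooth_unvoiced_gains(gains, voiced_flags, alpha=0.8):
    """Apply first-order recursive smoothing to decoded gains only in
    unvoiced frames; voiced gains pass through and reset the smoother.

    alpha is an illustrative smoothing coefficient.
    """
    out, state = [], None
    for g, voiced in zip(gains, voiced_flags):
        if voiced or state is None:
            state = g
        else:
            state = alpha * state + (1.0 - alpha) * g
        out.append(state)
    return out
```

The smoothed gains would then scale the decoded sound source signal driving the synthesis filter.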
  • Publication number: 20080319763
    Abstract: A dialog manager and spoken dialog service having a dialog manager generated according to a method comprising: selecting a top-level flow controller based on application type, selecting available reusable subdialogs for each application part, developing a subdialog for each application part not having an available subdialog, and testing and deploying the spoken dialog service using the selected top-level flow controller, selected reusable subdialogs and developed subdialogs. The dialog manager is capable of handling context shifts in a spoken dialog with a user. Application dependencies are established in the top-level flow controller, thus enabling the subdialogs to be reusable and capable of managing context shifts and mixed-initiative dialogs.
    Type: Application
    Filed: August 29, 2008
    Publication date: December 25, 2008
    Applicant: AT&T Corp.
    Inventors: Giuseppe Di Fabbrizio, Charles Alfred Lewis
  • Publication number: 20080319762
    Abstract: The present invention discloses a system and a method for creating and editing speech-enabled WIKIs. A WIKI editor can be served to client-side Web browsers so that end-users can utilize WIKI editor functions, which include functions to create and edit speech-enabled WIKI applications. A WIKI server can serve speech-enabled WIKI applications created via the WIKI editor. Each of the speech-enabled WIKI applications can include a link to at least one speech processing engine located in a speech processing system remote from the WIKI server. The speech processing engine can provide a speech processing capability for the speech-enabled WIKI application when served by the WIKI server. In one embodiment, the speech-enabled applications can include an introspection document, an entry collection of documents, and a resource collection of documents in accordance with standards specified by an ATOM PUBLISHING PROTOCOL (APP).
    Type: Application
    Filed: June 20, 2007
    Publication date: December 25, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: WILLIAM V. DA PALMA, VICTOR S. MOORE, WENDI L. NUSBICKEL
  • Publication number: 20080319757
    Abstract: A speech processing system can include a client, a speech for Web 2.0 system, and a speech processing system. The client can access a speech-enabled application using at least one Web 2.0 communication protocol. For example, a standard browser of the client can use a standard protocol to communicate with the speech-enabled application executing on the speech for Web 2.0 system. The speech for Web 2.0 system can access a data store within which user specific speech parameters are included, wherein a user of the client is able to configure the specific speech parameters of the data store. Suitable ones of these speech parameters are utilized whenever the user interacts with the Web 2.0 system. The speech processing system can include one or more speech processing engines. The speech processing system can interact with the speech for Web 2.0 system to handle speech processing tasks associated with the speech-enabled application.
    Type: Application
    Filed: June 20, 2007
    Publication date: December 25, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: William V. Da Palma, Victor S. Moore, Wendi L. Nusbickel
  • Publication number: 20080319761
    Abstract: The present invention discloses a method of performing speech processing operations based upon Web 2.0 type interfaces with speech engines. The method can include a step of interfacing with a Web 2.0 server from a standard browser. A speech-enabled application served by the Web 2.0 server can be accessed. The browser can render markup of the speech-enabled application. Speech input can be received from a user of the browser. A RESTful protocol, such as the ATOM Publishing Protocol (APP), can be utilized to access a remotely located speech engine. The speech engine can accept GET, PUT, POST, and DELETE commands. The speech processing engine can process the speech input and can provide results to the Web 2.0 server. The Web 2.0 server can perform a programmatic action based upon the provided results, which results in different content being presented in the browser.
    Type: Application
    Filed: June 20, 2007
    Publication date: December 25, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: William V. Da Palma, Victor S. Moore, Wendi L. Nusbickel
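The GET/PUT/POST/DELETE interaction with a remote speech engine, as in the abstract above, can be sketched as an in-memory resource. Class and payload names here are illustrative assumptions; a real deployment would speak HTTP to an APP endpoint:

```python
class SpeechEngineResource:
    """In-memory sketch of a RESTful (APP-style) speech-engine resource
    that accepts GET, PUT, POST and DELETE requests."""

    def __init__(self):
        self._entries = {}
        self._next_id = 1

    def post(self, audio):
        """Submit speech input; returns the id of the new recognition job."""
        job_id = self._next_id
        self._next_id += 1
        self._entries[job_id] = {"audio": audio, "result": None}
        return job_id

    def put(self, job_id, result):
        """Engine writes back its recognition result for the job."""
        self._entries[job_id]["result"] = result

    def get(self, job_id):
        """Read the job entry, including any result so far."""
        return self._entries[job_id]

    def delete(self, job_id):
        """Remove the job once the Web 2.0 server has consumed the result."""
        del self._entries[job_id]
```

Mapping the speech-engine lifecycle onto the four standard verbs is what lets a plain Web 2.0 server drive the engine without any speech-specific protocol.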
  • Publication number: 20080319758
    Abstract: The present invention discloses a speech-enabled application that includes two or more linked markup documents that together form a speech-enabled application served by a Web 2.0 server. The linked markup documents can conform to an ATOM PUBLISHING PROTOCOL (APP) based protocol. Additionally, the linked markup documents can include an entry collection of documents and a resource collection of documents. The resource collection can include at least one speech resource associated with a speech engine disposed in a speech processing system remotely located from the Web 2.0 server. The speech resource can add a speech processing capability to the speech-enabled application. In one embodiment, end-users of the speech-enabled application can be permitted to introspect, customize, replace, add, re-order, and remove at least a portion of the linked markup documents.
    Type: Application
    Filed: June 20, 2007
    Publication date: December 25, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: William V. Da Palma, Victor S. Moore, Wendi L. Nusbickel
  • Publication number: 20080312933
    Abstract: A method for interfacing an application server with a resource can include the step of associating a plurality of Enterprise Java Beans (EJBs) to a plurality of resources, where a one-to-one correspondence exists between EJB and resource. An application server can receive an application request and can determine a resource for handling the request. An EJB associated with the determined resource can interface the application server to the determined resource. The request can be handled with the determined resource.
    Type: Application
    Filed: August 28, 2008
    Publication date: December 18, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Thomas E. Creamer, Victor S. Moore, Wendi L. Nusbickel, Ricardo Dos Santos, James J. Sliwa
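The one-to-one EJB-to-resource correspondence amounts to a dispatch table. This sketch (all names assumed) shows the routing idea, with each adapter object standing in for one EJB:

```python
class ResourceAdapter:
    """Stands in for an EJB bound to exactly one back-end resource."""

    def __init__(self, resource_name):
        self.resource_name = resource_name

    def handle(self, request):
        """Interface the request to this adapter's dedicated resource."""
        return f"{self.resource_name} handled {request}"

# One-to-one correspondence between adapter and resource, per the abstract.
adapters = {name: ResourceAdapter(name) for name in ("tts", "asr")}

def route(request, resource_name):
    """Application server determines the resource, then dispatches the
    request through the adapter associated with that resource."""
    return adapters[resource_name].handle(request)
```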
  • Publication number: 20080312936
    Abstract: Provided are an apparatus and method for estimating a voice data value corresponding to a silent period produced in a key resynchronization process, using the sine-waveform characteristic of voice, when encrypted digital voice data is transmitted in a one-way wireless communication environment. The apparatus includes a transmitter that generates a key resynchronization frame containing key resynchronization information and vector information on voice data inserted thereinto and transmits the key resynchronization frame, and a receiver that receives the key resynchronization frame from the transmitter, extracts the vector information inserted in the key resynchronization frame, and estimates a voice data value corresponding to the key resynchronization period. Based on the change ratio between slopes calculated from received voice data, it is possible to estimate the voice data corresponding to a silent period, which improves communication quality.
    Type: Application
    Filed: March 14, 2008
    Publication date: December 18, 2008
    Inventors: Taek Jun NAM, Byeong-Ho AHN, Seok RYU, Sang-Yi YI
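A much-simplified version of the estimation step uses straight-line extrapolation from the last received samples. The patent's slope-change-ratio computation is richer than this sketch:

```python
def estimate_silent_samples(history, count):
    """Extrapolate voice samples across a key-resynchronization gap from
    the slope of the most recent received samples (simplified stand-in
    for the patent's slope-change-ratio estimation)."""
    slope = history[-1] - history[-2]
    last = history[-1]
    return [last + slope * (i + 1) for i in range(count)]
```

Because speech is locally sinusoidal, even a short extrapolation across the gap sounds far better than inserting silence.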
  • Publication number: 20080306743
    Abstract: A system and method are disclosed for switching contexts within a spoken dialog between a user and a spoken dialog system. The spoken dialog system utilizes modular subdialogs that are invoked by at least one flow controller that is a finite state model associated with a dialog manager. The spoken dialog system includes a dialog manager with a flow controller and a reusable subdialog module. The method includes, while the spoken dialog is being controlled by the subdialog module that was invoked by the flow controller, receiving context-changing input associated with speech from a user that changes a dialog context and comparing the context-changing input to at least one context shift. And, if any of the context shifts are activated by the comparing step, then passing control of the spoken dialog to the flow controller with a context shift message and destination state.
    Type: Application
    Filed: August 19, 2008
    Publication date: December 11, 2008
    Applicant: AT&T Corp.
    Inventors: Giuseppe Di Fabbrizio, Charles Alfred Lewis
  • Publication number: 20080306739
    Abstract: A system capable of separating sound source signals with high precision while improving convergence rate and convergence precision. A process of updating a current separation matrix Wk to a next separation matrix Wk+1, such that the next value J(Wk+1) of a cost function is closer to the minimum value J(W0) than the current value J(Wk), is iteratively performed. The update amount ΔWk of the separation matrix is increased as the current value J(Wk) of the cost function increases, and is decreased as the current gradient ∂J(Wk)/∂W of the cost function steepens. On the basis of input signals x from a plurality of microphones Mi and an optimal separation matrix W0, it is possible to separate sound source signals y (= W0·x) with high precision while improving convergence rate and convergence precision.
    Type: Application
    Filed: June 5, 2008
    Publication date: December 11, 2008
    Applicant: HONDA MOTOR CO., LTD.
    Inventors: Hirofumi Nakajima, Kazuhiro Nakadai, Yuji Hasegawa, Hiroshi Tsujino
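The adaptive update rule (larger steps when the cost is high, smaller steps when the gradient is steep) can be illustrated on a scalar cost function. The constant k and the scalar setting are assumptions; the patent operates on a full separation matrix:

```python
def adaptive_step_minimize(cost, grad, w0, iters=200, k=0.1):
    """Gradient descent whose step size grows with the current cost
    value and shrinks as the gradient steepens, echoing the update
    rule in the abstract (scalar sketch; k is assumed)."""
    w = w0
    for _ in range(iters):
        g = grad(w)
        if abs(g) < 1e-12:
            break  # already at a stationary point
        step = k * cost(w) / abs(g)
        w -= step * (1.0 if g > 0 else -1.0)
    return w
```

Scaling the step this way takes large strides far from the minimum without overshooting on steep slopes, which is exactly the convergence-rate/precision trade-off the abstract targets.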
  • Publication number: 20080290987
    Abstract: In one embodiment, a method includes receiving a signal from a communication device via a communication channel. The method also includes determining, based on the signal, a parameter value used for identification of a product of interest. A duration of the receiving is modified when a threshold condition is unsatisfied based on a probability value calculated based on the parameter value. The probability value is associated with identification of the product of interest.
    Type: Application
    Filed: April 22, 2008
    Publication date: November 27, 2008
    Inventor: Lehmann Li
  • Publication number: 20080288260
    Abstract: Provided are an input/output apparatus based on voice recognition, and a method thereof. An object of the apparatus is to improve the user interface by making pointing input and command execution, such as application program control, possible according to a voice command of the user, based on voice recognition technology and without an individual pointing input device such as a mouse or a touch pad. The apparatus includes: a voice recognizer for recognizing a voice command inputted from outside; a pointing controller for calculating a pointing location on a screen which corresponds to a voice recognition result transmitted from the voice recognizer; a displayer for displaying a screen; and a command controller for processing diverse commands related to a current pointing location.
    Type: Application
    Filed: September 11, 2006
    Publication date: November 20, 2008
    Inventors: Kwan-Hyun Cho, Mun-Sung Han, Jun-Seok Park, Young-Giu Jung
  • Publication number: 20080288246
    Abstract: There is provided a method of using a processing circuitry for selecting a preferential pitch lag value from a plurality of pitch lag values, including a first pitch lag value and a second pitch lag value, for coding an input speech signal. The method comprises determining a first timing relationship between a previous pitch lag value and at least one of the plurality of pitch lag values; determining a second timing relationship between the first pitch lag value and the second pitch lag value; favoring one of the first pitch lag value and the second pitch lag value based on the first timing relationship and the second timing relationship to select one of the first pitch lag value and the second pitch lag value as the preferential pitch lag value; and converting the input speech signal into an encoded speech using the preferential pitch lag value.
    Type: Application
    Filed: July 23, 2008
    Publication date: November 20, 2008
    Applicant: Mindspeed Technologies, Inc.
    Inventors: Huan-Yu Su, Yang Gao
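One plausible reading of "favoring" a candidate lag is to prefer the one that keeps the pitch trajectory smooth relative to the previous frame. This sketch collapses the patent's two timing relationships into a single distance test:

```python
def select_pitch_lag(previous_lag, lag1, lag2):
    """Favor whichever candidate pitch lag is closer to the previous
    frame's lag, keeping the pitch trajectory smooth (simplified
    stand-in for the patent's timing-relationship tests)."""
    if abs(lag1 - previous_lag) <= abs(lag2 - previous_lag):
        return lag1
    return lag2
```

Pitch evolves slowly in natural speech, so penalizing a jump away from the previous lag suppresses pitch-doubling and pitch-halving errors in the encoder.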
  • Publication number: 20080281586
    Abstract: A “speech onset detector” provides a variable length frame buffer in combination with either variable transmission rate or temporal speech compression for buffered signal frames. The variable length buffer buffers frames that are not clearly identified as either speech or non-speech frames during an initial analysis. Buffering of signal frames continues until a current frame is identified as either speech or non-speech. If the current frame is identified as non-speech, buffered frames are encoded as non-speech frames. However, if the current frame is identified as a speech frame, buffered frames are searched for the actual onset point of the speech. Once that onset point is identified, the signal is either transmitted in a burst, or a time-scale modification of the buffered signal is applied for compressing buffered frames beginning with the frame in which onset point is detected. The compressed frames are then encoded as one or more speech frames.
    Type: Application
    Filed: July 28, 2008
    Publication date: November 13, 2008
    Applicant: MICROSOFT CORPORATION
    Inventors: Dinei A. Florencio, Philip A. Chou
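The variable-length buffer can be sketched as follows. The frame classifiers are supplied as callbacks, and the search for the exact onset point within the buffer is omitted for brevity:

```python
def classify_frames(frames, is_speech, is_nonspeech):
    """Hold ambiguous frames in a variable-length buffer until a frame
    is clearly speech or clearly non-speech, then label the whole
    buffer accordingly (sketch; classifiers are assumed callbacks)."""
    buffered, labeled = [], []
    for frame in frames:
        buffered.append(frame)
        if is_speech(frame):
            labeled.extend((f, "speech") for f in buffered)
            buffered.clear()
        elif is_nonspeech(frame):
            labeled.extend((f, "non-speech") for f in buffered)
            buffered.clear()
    return labeled, buffered  # still-ambiguous frames remain buffered
```

In the patented detector, a speech decision additionally triggers a search back through the buffer for the true onset point, after which the buffered frames are burst-transmitted or time-compressed.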
  • Publication number: 20080281599
    Abstract: A method of processing audio data including obtaining (202) audio data; analysing (206) the audio data to determine at least one characteristic of the audio data; generating (206) data describing the at least one characteristic of the analysed audio data, and/or modifying (412) an audio recording process based on the at least one characteristic of the analysed audio data.
    Type: Application
    Filed: April 15, 2008
    Publication date: November 13, 2008
    Inventor: Paul Rocca
  • Publication number: 20080275695
    Abstract: A method and device for improving coding efficiency in audio coding. From the pitch values of a pitch contour of an audio signal, a plurality of simplified pitch contour segments are generated to approximate the pitch contour, based on one or more pre-selected criteria. The contour segments can be linear or non-linear, with each contour segment represented by a first end point and a second end point. If the contour segments are linear, then only the information regarding the end points, instead of the pitch values, is provided to a decoder for reconstructing the audio signal. A contour segment can have a fixed maximum length or a variable length, but the deviation between a contour segment and the pitch values in that segment is limited by a maximum value.
    Type: Application
    Filed: April 25, 2008
    Publication date: November 6, 2008
    Inventors: Anssi Ramo, Jani Nurminen, Sakari Himanen, Ari Heikkinen
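A greedy way to build the simplified contour segments is to extend each straight line until some pitch value deviates by more than the allowed maximum. The greedy strategy is an assumption; the abstract only bounds the deviation:

```python
def segment_contour(pitch, max_dev):
    """Greedy piecewise-linear approximation of a pitch contour: extend
    each segment while every interior pitch value stays within max_dev
    of the line between the segment's end points. Returns (start, end)
    index pairs; only the end-point values need be sent to the decoder."""
    segments, start = [], 0
    while start < len(pitch) - 1:
        end = start + 1
        while end + 1 < len(pitch):
            a, b = pitch[start], pitch[end + 1]
            n = end + 1 - start
            within = all(
                abs(pitch[start + i] - (a + (b - a) * i / n)) <= max_dev
                for i in range(1, n)
            )
            if not within:
                break
            end += 1
        segments.append((start, end))
        start = end
    return segments
```

The fewer segments survive, the fewer end points must be coded, which is where the claimed efficiency gain comes from.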
  • Publication number: 20080275697
    Abstract: An audio processing apparatus for processing two sampled audio signals to detect a temporal position of one of the audio signals with respect to the other. The apparatus detects audio power characteristics of each signal in respect of successive continuous temporal portions of each of the two signals, the portions having identical lengths and each portion including at least two audio samples, and correlates the detected audio power characteristics in respect of the two audio signals to establish a most likely temporal offset between the two audio signals.
    Type: Application
    Filed: October 27, 2006
    Publication date: November 6, 2008
    Applicant: SONY UNITED KINGDOM LIMITED
    Inventors: William Edmund Cranstoun Kentish, Nicolas John Haynes
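The power-correlation approach can be sketched in two steps: compute the per-block power of each signal, then scan candidate offsets for the best correlation. The block length and normalization here are assumptions:

```python
def block_power(samples, block_len):
    """Average power of successive, equal-length blocks of samples."""
    return [
        sum(s * s for s in samples[i:i + block_len]) / block_len
        for i in range(0, len(samples) - block_len + 1, block_len)
    ]

def best_offset(power_a, power_b, max_shift):
    """Most likely temporal offset (in blocks) of b relative to a,
    found by maximizing the correlation of the two power sequences."""
    best, best_score = 0, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        pairs = [
            (power_a[i], power_b[i + shift])
            for i in range(len(power_a))
            if 0 <= i + shift < len(power_b)
        ]
        if not pairs:
            continue
        score = sum(x * y for x, y in pairs) / len(pairs)
        if score > best_score:
            best, best_score = shift, score
    return best
```

Correlating coarse power envelopes instead of raw samples keeps the search cheap and robust to differences in recording level between the two signals.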