Patents Examined by Daniel Nolan
-
Patent number: 6832196
Abstract: A method of dynamically formatting a speech menu construct can include a series of steps. A markup language document containing a reference to a server-side program can be provided. The server-side program can be programmed to dynamically format data using a voice-enabled markup language. A database can be accessed using the server-side program. The database can have a plurality of data items. Using the voice-enabled markup language, the selected data items can be formatted thereby creating speech menu items. The speech menu items can specify a speech menu construct resulting in a menu interface that is dynamically generated from data in a data store, rather than being written by a programmer, and allows the user to “speak to the data.”
Type: Grant
Filed: March 30, 2001
Date of Patent: December 14, 2004
Assignee: International Business Machines Corporation
Inventor: David E. Reich
-
Patent number: 6826527
Abstract: A decoder for code excited LP encoded frames with both adaptive and fixed codebooks; erased frame concealment uses muted repetitive excitation, threshold-adapted bandwidth expanded repetitive synthesis filter, and jittered repetitive pitch lag.
Type: Grant
Filed: November 3, 2000
Date of Patent: November 30, 2004
Assignee: Texas Instruments Incorporated
Inventor: Takahiro Unno
-
Patent number: 6820052
Abstract: A low-bit-rate coding technique for unvoiced segments of speech includes the steps of extracting high-time-resolution energy coefficients from a frame of speech, quantizing the energy coefficients, generating a high-time-resolution energy envelope from the quantized energy coefficients, and reconstituting a residue signal by shaping a randomly generated noise vector with quantized values of the energy envelope. The energy envelope may be generated with a linear interpolation technique. A post-processing measure may be obtained and compared with a predefined threshold to determine whether the coding algorithm is performing adequately.
Type: Grant
Filed: July 17, 2002
Date of Patent: November 16, 2004
Assignee: Qualcomm Incorporated
Inventors: Amitava Das, Sharath Manjunath
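The steps named in this abstract (energy coefficients, quantization, interpolated envelope, noise shaping) can be illustrated with a minimal Python sketch. It is not the patented coder: the use of RMS levels as "energy coefficients", the uniform quantizer and its step size, the Gaussian noise source, and every function name below are assumptions made only for illustration.

```python
import math
import random

def subframe_energies(frame, n_sub):
    """RMS level of each of n_sub equal sub-frames (assumed energy measure)."""
    size = len(frame) // n_sub
    return [
        math.sqrt(sum(s * s for s in frame[i * size:(i + 1) * size]) / size)
        for i in range(n_sub)
    ]

def quantize(values, step=0.5):
    """Uniform scalar quantizer; the step size is an arbitrary choice."""
    return [round(v / step) * step for v in values]

def linear_envelope(levels, n_samples):
    """Linearly interpolate sub-frame levels placed at sub-frame centres."""
    size = n_samples / len(levels)
    last = len(levels) - 1
    env = []
    for n in range(n_samples):
        # Position of sample n on the sub-frame axis, clamped at both ends.
        pos = min(max((n + 0.5) / size - 0.5, 0.0), float(last))
        lo = int(pos)
        hi = min(lo + 1, last)
        env.append(levels[lo] + (pos - lo) * (levels[hi] - levels[lo]))
    return env

def shaped_noise(envelope, seed=0):
    """Scale a random Gaussian noise vector by the envelope, sample by sample."""
    rng = random.Random(seed)
    return [e * rng.gauss(0.0, 1.0) for e in envelope]

# Example: an 8-sample frame split into 2 sub-frames.
frame = [1.0] * 8
levels = quantize(subframe_energies(frame, 2))
residue = shaped_noise(linear_envelope(levels, len(frame)))
```

Only the quantized levels would need to be transmitted in such a scheme; the decoder regenerates the noise locally, which is why the approach suits unvoiced, noise-like segments.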
-
Patent number: 6804649
Abstract: Voice synthesis with improved expressivity is obtained in a voice synthesiser of source-filter type by making use of a library of source sound categories in the source module. Each source sound category corresponds to a particular morphological category and is derived from analysis of real vocal sounds, by inverse filtering so as to subtract the effect of the vocal tract. The library may be parametrical, that is, the stored data corresponds not to the inverse-filtered sounds themselves but to synthesis coefficients for resynthesising the inverse-filtered sounds using any suitable re-synthesis technique, such as the phase vocoder technique. The coefficients are derived by Short Time Fourier Transform (STFT) analysis.
Type: Grant
Filed: June 1, 2001
Date of Patent: October 12, 2004
Assignee: Sony France S.A.
Inventor: Eduardo Reck Miranda
-
Patent number: 6799161
Abstract: A speech coding apparatus having a speech input unit for receiving input speech, a speech coding rate selector for selecting an appropriate speech coding rate according to the power of the input speech, a speech analyzer for processing the input speech to estimate a transfer function of the speaker's oral cavity, and a speech coding unit forming a synthesis filter based on the transfer function of the oral cavity. The speech coding unit also codes an excitation signal of the synthesis filter on the basis of an estimation result supplied by the speech analyzer. A gain suppressor interposed between the speech input unit and the speech coding unit suppresses the gain of a signal supplied from the speech input unit to the speech coding unit during an unvoiced period according to information from the speech coding rate selector.
Type: Grant
Filed: January 15, 2002
Date of Patent: September 28, 2004
Assignee: Oki Electric Industry Co., Ltd.
Inventor: Atsushi Yokoyama
-
Patent number: 6785644
Abstract: With respect to data having periodicity to be compressed, windows of the same size are set for every two sections according to an interval of peaks appearing substantially periodically and processing for sorting sample data alternately among the set windows of the same size is sequentially performed, whereby a frequency of data having periodicity is replaced with an approximately half frequency without damaging reproducibility to original data at all to make it possible to apply compression processing to data of the replaced low frequency. If this sorting processing is applied to compression processing having a characteristic that a compression ratio is not increased in a high-frequency region, it becomes possible to improve a compression ratio without damaging a quality of reproduced data by decompression at all.
Type: Grant
Filed: December 16, 2002
Date of Patent: August 31, 2004
Assignee: Yasue Sakai
Inventor: Yukio Koyanagi
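One possible reading of the "sorting sample data alternately" step is a sample-by-sample interleave of two adjacent windows of equal size, sketched below in Python. This is an interpretation, not the patented procedure: the fixed `period` argument stands in for the peak-interval detection the abstract describes, and both function names are hypothetical.

```python
def interleave_windows(samples, period):
    """Interleave each pair of adjacent windows of length `period`.

    Two similar periods a0..aN-1 and b0..bN-1 become a0,b0,a1,b1,...,
    so one stretched cycle spans 2N samples (roughly half the frequency).
    Samples that do not fill a complete pair of windows are copied as-is.
    """
    block = 2 * period
    full = len(samples) - len(samples) % block
    out = []
    for start in range(0, full, block):
        first = samples[start:start + period]
        second = samples[start + period:start + block]
        for a, b in zip(first, second):
            out.extend((a, b))
    out.extend(samples[full:])
    return out

def deinterleave_windows(samples, period):
    """Exact inverse of interleave_windows for the same `period`."""
    block = 2 * period
    full = len(samples) - len(samples) % block
    out = []
    for start in range(0, full, block):
        chunk = samples[start:start + block]
        out.extend(chunk[0::2])  # samples of the first window
        out.extend(chunk[1::2])  # samples of the second window
    out.extend(samples[full:])
    return out

# Example: two identical 4-sample periods.
wave = [0, 1, 2, 1, 0, 1, 2, 1]
slow = interleave_windows(wave, 4)
restored = deinterleave_windows(slow, 4)
```

Because the reordering is a pure permutation, it is fully reversible, which matches the abstract's claim that reproducibility of the original data is not damaged.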
-
Patent number: 6775654
Abstract: A digital audio reproducing apparatus including a receiver receiving modulated data, a demodulator demodulating the modulated data received by the receiver, an audio decoder decoding, in a unit of a frame, digital audio information contained in the modulated data demodulated by the demodulator, and an audibility corrector for effecting audibility correction on failing digital audio information contained in a frame that failed to be decoded, when the audio decoder fails to decode the digital audio information.
Type: Grant
Filed: August 31, 1999
Date of Patent: August 10, 2004
Assignees: Fujitsu Limited, FFC Limited
Inventors: Hideaki Yokoyama, Kazuhisa Matsushima, Hiroshi Okubo, Tadayoshi Katoh, Takashi Saito
-
Patent number: 6766298
Abstract: A unified web-based voice messaging system provides voice application control between a web browser and an application server via a hypertext transport protocol (HTTP) connection on an Internet Protocol (IP) network. The web browser receives an HTML page from the application server having an XML element that defines data for an audio operation to be performed by an executable audio resource. The application server executes the voice-enabled web application by runtime execution of extensible markup language (XML) documents that define the voice-enabled web application to be executed. The application server, in response to receiving a user request from a user, accesses a selected XML page that defines at least a part of the voice application to be executed for the user. The application server then parses the XML page, and executes the operation described by the XML page.
Type: Grant
Filed: January 11, 2000
Date of Patent: July 20, 2004
Assignee: Cisco Technology, Inc.
Inventors: Lewis Dean Dodrill, Geetha Ravishankar, Satish Joshi, Keith M. Basil, Ryan Alan Danner, James Richard Grove, Jr., Steven J. Martin
-
Patent number: 6757649
Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.
Type: Grant
Filed: April 8, 2003
Date of Patent: June 29, 2004
Assignee: Mindspeed Technologies Inc.
Inventors: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su
-
Patent number: 6757656
Abstract: A method for concurrent presentation of multiple audio information sources. In the method, audio information from at least two audio information sources is concurrently presented, and a user speech selection of one of the audio information sources is accepted. At least one of the audio information sources can then be reconfigured. The reconfiguration audibly distinguishes the user selected audio information source from other audio information sources.
Type: Grant
Filed: June 15, 2000
Date of Patent: June 29, 2004
Assignee: International Business Machines Corporation
Inventors: Qing Gong, James R. Lewis, Ronald E. Vanbuskirk, Huifang Wang
-
Patent number: 6751589
Abstract: A preferred method for generating a document includes the steps of: providing an applicant with a visual representation, via a visual display device, of at least a portion of a document; prompting an applicant to provide first information corresponding to a first portion of the document; receiving the first information, as a first vocal response, from the applicant; converting the first vocal response to corresponding first textual data; providing the applicant with an updated visual representation, via the visual display device, of the first textual data appearing at the first portion of the document; and generating a printed document corresponding to the updated visual representation of the document. Systems and computer readable media also are provided.
Type: Grant
Filed: September 18, 2000
Date of Patent: June 15, 2004
Assignee: Hewlett-Packard Development Company, L.P.
Inventor: Gustavo M. Guillemin
-
Patent number: 6741964
Abstract: When recording digital data corresponding to a voice signal, a voice data recording and reproducing apparatus generates an error correction code and records this code together with the digital data in semiconductor memory. When transferring the digital data to the PC, a system control section in the voice data recording and reproducing apparatus transmits voice data including the error correction code without performing error correction. The system control section provides a lower data processing capability than that of a PC's CPU. The PC's CPU having a higher data processing capability performs error correction of the voice data by using the error correction code included in the received voice data.
Type: Grant
Filed: January 8, 2001
Date of Patent: May 25, 2004
Assignee: Olympus Optical Co., Ltd.
Inventor: Hideo Okano
-
Patent number: 6735567
Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.
Type: Grant
Filed: April 8, 2003
Date of Patent: May 11, 2004
Assignee: Mindspeed Technologies, Inc.
Inventors: Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-yu Su
-
Patent number: 6735564
Abstract: A method and arrangement for managing talk groups of a telecommunication system at a dispatcher station of the telecommunications system having one or more talk groups which may consist of one or more users and which are controlled by the dispatcher at the dispatcher station. The arrangement includes a two-channel or a multichannel sound reproducing system which is configured to create an artificial acoustic space at the dispatcher station, and reproduce voices of each talk group so that the voices are heard from a certain point of the acoustic space, which allows the dispatcher to recognize the talk group to which the voice belongs on the basis of the location of the voice.
Type: Grant
Filed: December 28, 2000
Date of Patent: May 11, 2004
Assignee: Nokia Networks Oy
Inventor: Pekka Puhakainen
-
Patent number: 6714909
Abstract: The invention provides a system and method for automatically indexing and retrieving multimedia content. The method may include separating a multimedia data stream into audio, visual and text components, segmenting the audio, visual and text components based on semantic differences, identifying at least one target speaker using the audio and visual components, identifying a topic of the multimedia event using the segmented text and topic category models, generating a summary of the multimedia event based on the audio, visual and text components, the identified topic and the identified target speaker, and generating a multimedia description of the multimedia event based on the identified target speaker, the identified topic, and the generated summary.
Type: Grant
Filed: November 21, 2000
Date of Patent: March 30, 2004
Assignee: AT&T Corp.
Inventors: David Crawford Gibbon, Qian Huang, Zhu Liu, Aaron Edward Rosenberg, Behzad Shahraray
-
Patent number: 6711545
Abstract: A device for processing a speech signal is disclosed, comprising a speech signal input and speech signal digitizer, a control signal input and control signal digitizer, a digital signal processor, a transmitter to transmit processed digital signals in a cordless fashion to a base station, and a storage means which is connected upstream of the transmitter and which, depending on the quality of the communication link between the device and the base station, buffers the processed digital signals to be supplied and to be transmitted between the device and the base station.
Type: Grant
Filed: October 11, 2000
Date of Patent: March 23, 2004
Assignee: Koninklijke Philips Electronics N.V.
Inventor: Manfred Hörndl
-
Patent number: 6708154
Abstract: A model is provided for formants found in human speech. Under one aspect of the invention, the model is used to synthesize speech. Under this aspect of the invention, the formant model is used to identify a most likely formant track for the synthesized speech. Based on this track, a series of resonators are used to introduce the formants into the speech signal.
Type: Grant
Filed: November 14, 2002
Date of Patent: March 16, 2004
Assignee: Microsoft Corporation
Inventor: Alejandro Acero
-
Patent number: 6704703
Abstract: The excitation in a CELP-like speech coder is recursively calculated. For a given bitrate and a given complexity, the recursive approach described lowers the complexity with minimum impact on speech quality. The excitation signal is a sum of at least three vector terms, each vector term being a product of a codebook vector $z_k$ and an associated gain term $g_k$. A first vector term $g_0 z_0$ is determined that is representative of a target excitation vector $x$. Each remaining vector term is recursively determined as a vector term $g_k z_k$ representative of the difference between the target excitation vector $x$ and the sum of previously determined vector terms, $\sum_{i=0}^{k-1} g_i z_i$.
Type: Grant
Filed: February 2, 2001
Date of Patent: March 9, 2004
Assignee: ScanSoft, Inc.
Inventors: Mohand Ferhaoul, Jean-Francois Rasaminjanahary, Stefaan Van Gerven, Abderrahman Essebbar
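The recursion in this abstract, where each new term approximates what is left of the target after the earlier terms are subtracted, can be sketched in Python as a greedy stage-by-stage search. This is an illustrative simplification, not the patented coder: it matches vectors directly with a least-squares gain and omits the perceptually weighted synthesis filtering a real CELP search would use. The function names and the one-codebook-per-stage layout are assumptions.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def recursive_excitation(target, codebooks):
    """Approximate `target` as a sum of gain * codebook-vector terms.

    `codebooks` holds one list of candidate vectors per stage. At stage k
    the residual is target minus the sum of the terms chosen so far, and
    the candidate with the largest projection energy is selected.
    Returns (terms, approx), where terms is a list of (index, gain).
    """
    residual = list(target)
    approx = [0.0] * len(target)
    terms = []
    for codebook in codebooks:
        best = None
        for idx, z in enumerate(codebook):
            energy = dot(z, z)
            if energy == 0.0:
                continue  # an all-zero vector cannot reduce the residual
            corr = dot(residual, z)
            score = corr * corr / energy  # error reduction for this choice
            if best is None or score > best[0]:
                best = (score, idx, corr / energy)  # least-squares gain
        if best is None:
            continue  # stage had no usable vector
        _, idx, gain = best
        z = codebook[idx]
        approx = [a + gain * v for a, v in zip(approx, z)]
        residual = [r - gain * v for r, v in zip(residual, z)]
        terms.append((idx, gain))
    return terms, approx

# Example: three stages sharing one toy codebook of unit vectors.
units = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
terms, approx = recursive_excitation([3.0, 1.0, 0.0], [units, units, units])
```

Searching one term at a time costs far less than a joint search over all index combinations, which is the complexity trade-off the abstract refers to.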
-
Patent number: 6704709
Abstract: A system and method for improving the accuracy of a speech recognition program. The system is based on a speech recognition program that automatically converts a pre-recorded audio file into a written text. The system parses the written text into segments, each of which can be corrected by the system and saved in a retrievable manner in association with the computer. The standard speech files are saved towards improving accuracy in speech-to-text conversion by the speech recognition program. The system further includes facilities to repetitively establish an independent instance of the written text from the pre-recorded audio file using the speech recognition program. This independent instance can then be broken into segments and each erroneous segment in said independent instance replaced with the corrected segment associated with that segment. In this manner, repetitive instruction of a speech recognition program can be facilitated.
Type: Grant
Filed: July 26, 2000
Date of Patent: March 9, 2004
Assignee: Custom Speech USA, Inc.
Inventors: Jonathan Kahn, Thomas P Flynn, Charles Qin, Nicholas A. Linden
-
Patent number: RE39336
Abstract: The concatenative speech synthesizer employs demi-syllable subword units to generate speech. The synthesizer is based on a source-filter model that uses source signals that correspond closely to the human glottal source and that uses filter parameters that correspond closely to the human vocal tract. Concatenation of the demi-syllable units is facilitated by two separate cross fade techniques, one applied in the time domain in the demi-syllable source signal waveforms, and one applied in the frequency domain by interpolating the corresponding filter parameters of the concatenated demi-syllables. The dual cross fade technique results in natural sounding synthesis that avoids time-domain glitches without degrading or smearing characteristic resonances in the filter domain.
Type: Grant
Filed: November 5, 2002
Date of Patent: October 10, 2006
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Steve Pearson, Nicholas Kibre, Nancy Niedzielski