Patents Examined by Tālivaldis Ivars Šmits

Proofreading with text to speech feedback

Patent number: 6490563

Abstract: A computer implemented system and method of proofreading text in a computer system includes receiving text from a user into a text editing module. At least a portion of the text is converted to an audio signal upon the detection of an indicator, the indicator defining a boundary in the text by either being embodied therein or comprising delays in receiving text. The audio signal is played through a speaker to the user to provide feedback.

Type: Grant

Filed: August 17, 1998

Date of Patent: December 3, 2002

Assignee: Microsoft Corporation

Inventors: Hsiao-Wuen Hon, Dong Li, Xuedong Huang, Yun-Chen Ju, Xianghui Sean Zhang
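
The abstract above names two kinds of indicator that trigger speech feedback: a boundary embodied in the text, and a delay in receiving text. A minimal sketch of that trigger logic follows. The function name, the punctuation set, and the one-second pause threshold are illustrative assumptions and are not taken from the patent. The sketch only decides which text segments would be handed to a speech synthesizer; it does not produce audio.

```python
from typing import Iterable, List, Tuple

# Assumed values for illustration only; the abstract does not specify them.
BOUNDARY_CHARS = {".", "!", "?"}
PAUSE_SECONDS = 1.0


def segments_to_speak(
    keystrokes: Iterable[Tuple[float, str]],
    boundary_chars=BOUNDARY_CHARS,
    pause_seconds: float = PAUSE_SECONDS,
) -> List[str]:
    """Return the text segments that would be sent to text-to-speech.

    Each keystroke is a (timestamp_in_seconds, character) pair.
    """
    spoken: List[str] = []
    pending = ""
    last_time = None
    for timestamp, char in keystrokes:
        # Indicator type 2: a delay in receiving text flushes pending text.
        if (
            last_time is not None
            and pending
            and timestamp - last_time >= pause_seconds
        ):
            spoken.append(pending)
            pending = ""
        pending += char
        # Indicator type 1: a boundary character embodied in the text.
        if char in boundary_chars:
            spoken.append(pending)
            pending = ""
        last_time = timestamp
    # Text typed after the last indicator stays pending and is not spoken yet.
    return spoken


keys = [(0.0, "H"), (0.1, "i"), (0.2, "."), (0.3, "o"), (0.4, "k"), (2.0, "y")]
print(segments_to_speak(keys))
```

In this sketch the trailing "y" is not returned because no indicator has followed it yet.
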
Enhanced quantization method for spectral frequency coding

Patent number: 6487527

Abstract: A linear predictive speech encoding method combines vector quantization with the search for roots of LSP polynomials. Under this method, a code book searchable using line spectral pair (LSP) values is created from a line spectral frequency (LSF) code book, thus ensuring linear distortion performance without the costly run-time complexity of finding roots to high-order LSP polynomials in the LSF domain.

Type: Grant

Filed: May 9, 2000

Date of Patent: November 26, 2002

Assignee: Seda Solutions Corp.

Inventor: Rahmin Soheili
Method and apparatus for voice communication

Patent number: 6480827

Abstract: A telecommunication system and method having a transmitter and receiver for voice encoded signals. The receiver has a speech post-processor connected as an element before conversion of the speech from digital form and delivery of the speech to a listener. The speech post-processor processes select sequences of speech signals of a predetermined duration, and obtains the most likely estimation of a speech sequence that contains unrecognized phonemes. The speech post-processor has a recognizer and parser that receives speech signals, parses them into corresponding phonemes or unrecognized phonemes. Speech sequences of preselected duration are selected, and processed through an execution trellis implemented by a Viterbi algorithm to obtain a most likely sequence estimation for sequences which contain unrecognized phonemes, and determined phonemes replace the unrecognized phonemes. Only speech sequences with unrecognized phonemes are directed to the execution trellis.

Type: Grant

Filed: March 7, 2000

Date of Patent: November 12, 2002

Assignee: Motorola, Inc.

Inventor: Oliver F. McDonald
Method and apparatus for hybrid coding of speech at 4KBPS having phase alignment between mode-switched frames

Patent number: 6475245

Abstract: A method and apparatus for encoding speech for communication to a decoder for reproduction of the speech where the speech signal is classified into steady state voiced (harmonic), stationary unvoiced, and “transitory” or “transition” speech, and a particular type of coding scheme is used for each class. Harmonic coding is used for steady state voiced speech, “noise-like” coding is used for stationary unvoiced speech, and a special coding mode is used for transition speech, designed to capture the location, the structure, and the strength of the local time events that characterize the transition portions of the speech. The compression schemes can be applied to the speech signal or to the LP residual signal.

Type: Grant

Filed: February 5, 2001

Date of Patent: November 5, 2002

Assignee: The Regents of the University of California

Inventors: Allen Gersho, Eyal Shlomot, Vladimir Cuperman, Chunyan Li
Real-time lossless encoding and decoding system by moving excess data amounts, and a method therefor

Patent number: 6477501

Abstract: A lossless encoding apparatus encodes audio data and a lossless decoding apparatus restores the losslessly compression encoded audio data on a real-time basis, and a method therefor. The lossless encoding apparatus includes a lossless compression unit which losslessly compression encodes the audio data stored in an input buffer in units of predetermined data and outputs the encoded data in sequence, and an output buffer which stores the encoded audio data output from the lossless compression unit.

Type: Grant

Filed: June 7, 2000

Date of Patent: November 5, 2002

Assignee: Samsung Electronics Co., Ltd.

Inventor: Jae-Hoon Heo
Text independent speaker recognition with simultaneous speech recognition for transparent command ambiguity resolution and continuous access control

Patent number: 6477500

Abstract: Feature vectors representing each of a plurality of overlapping frames of an arbitrary, text independent speech signal are computed and compared to vector parameters and variances stored as codewords in one or more codebooks corresponding to each of one or more enrolled users to provide speaker dependent information for speech recognition and/or ambiguity resolution. Other information such as aliases and preferences of each enrolled user may also be enrolled and stored, for example, in a database. Correspondence of the feature vectors may be ranked by closeness of correspondence to a codeword entry and the number of frames corresponding to each codebook are accumulated or counted to identify a potential enrolled speaker. The differences between the parameters of the feature vectors and codewords in the codebooks can be used to identify a new speaker and an enrollment procedure can be initiated.

Type: Grant

Filed: April 12, 2000

Date of Patent: November 5, 2002

Assignee: International Business Machines Corporation

Inventor: Stephane Herman Maes
Constant bitrate real-time lossless audio decoder restoring moved excess coded data amounts

Patent number: 6473736

Abstract: A lossless decoding apparatus restores the losslessly compression encoded audio data on a real-time basis, and a method therefor. The lossless decoding apparatus includes a restorer which restores losslessly compression encoded audio data. The encoded audio data has been divided into first data having a data amount exceeding the maximum bitrate and second data having a data amount less than the maximum bitrate, where the first data is divided into third data being the encoded audio data having a data amount of the maximum bitrate and fourth data being the encoded data of the portion exceeding the maximum bitrate, and where the fourth data is output together with the second data. A bitrate controller controls the input and output of the fourth data into or out of a buffer to combine the fourth data with the corresponding third data to be output to the restorer to be restored with the remaining restored audio data in sequence.

Type: Grant

Filed: July 19, 2001

Date of Patent: October 29, 2002

Assignee: Samsung Electronics Co., Ltd.

Inventor: Jae-Hoon Heo
Speech synthesis apparatus having prosody generator with user-set speech-rate- or adjusted phoneme-duration-dependent selective vowel devoicing

Patent number: 6470316

Abstract: The speech synthesis apparatus according to the present invention includes a text analyzer operable to generate a phonetic and prosodic symbol string from text information of an input text; a word dictionary storing a reading and accent of a word; a voice segment dictionary storing a phoneme that is a basic unit of speech; a prosody generator operable to generate synthesizing parameters including at least a phoneme, a duration of the phoneme and a fundamental frequency for the phonetic and prosodic symbol string, the prosody generator including a vowel devoicing determining means operable to determine whether or not a vowel devoicing process is to be performed and a duration modifying means operable to modify the duration of the phoneme depending on a speech rate set by a user, the vowel devoicing determining means determining that the vowel devoicing process is not performed when the set speech rate is slower than a predetermined rate; and a waveform generator operable to generate a synthesized waveform by m

Type: Grant

Filed: March 3, 2000

Date of Patent: October 22, 2002

Assignee: Oki Electric Industry Co., Ltd.

Inventor: Keiichi Chihara
Electrotactile vocoder using handset with stimulating electrodes

Patent number: 6466911

Abstract: An electrotactile vocoder includes a handset (3) carrying stimulating electrodes (9) positioned adjacent openings (8) in the handset and electrically contacting the fingers when the handset is worn to cause stimulation of the digital nerves of the fingers, a speech processor/stimulator unit (2) for producing electrical stimuli at the electrodes (9) based on incoming speech and other information received by a microphone (1), the stimulator unit including circuit means for applying stimulating currents to the electrodes (9), the speech processor unit including means for encoding the presence of unvoiced speech components or for encoding information to a first formant F1 in addition to information relating to a second formant F2 and for applying the stimulating currents to selected pairs of electrodes.

Type: Grant

Filed: February 2, 2000

Date of Patent: October 15, 2002

Assignee: The University of Melbourne

Inventors: Robert S. C. Cowan, Karyn L. Galvin, Bich D. Lu, Rodney E. Millard
Audiophile encoding of digital audio data using 2-bit polarity/magnitude indicator and 8-bit scale factor for each subband

Patent number: 6463405

Abstract: A method, system and product are provided for loss-less encoding of a digital signal representing an audible sound. The method includes dividing the digital signal into a plurality of frames, dividing each of the plurality of frames into a plurality of subbands, and assigning each of the plurality of subbands an indicator selected from the group consisting of positive, zero, and negative, wherein the indicator selected is based on a polarity and a magnitude of the subband. The method further includes assigning each of the plurality of subbands one of a plurality of scale factors, wherein each scale factor represents a sound level range of at most two decibels, and generating a digital word for each of the plurality of frames, each digital word having a scale factor section including the scale factors for the plurality of subbands in the frame, and a sample data section including the indicators for the plurality of subbands in the frame.

Type: Grant

Filed: December 20, 1996

Date of Patent: October 8, 2002

Inventor: Eliot M. Case
Audio decoding device for decoding coded audio information with multiple channels

Patent number: 6460016

Abstract: An audio decoding device for decoding audio information with multiple channels includes a coded information memory section for storing the coded audio information; an information transmission section for reading the coded audio information stored at an arbitrary position in the coded information memory section; and an audio decoding section for decoding the coded audio information read by the information transmission section and outputting the resultant audio information in accordance with a time axis, wherein the information transmission section includes a buffer memory for retaining an address of an actual pointer for reading the coded audio information in the coded information memory section so as not to be reread, an address of a temporary pointer for reading the coded audio information in the coded information memory section so as to be reread, actual pointer data read by the actual pointer, and temporary pointer data read by the temporary pointer.

Type: Grant

Filed: October 10, 2000

Date of Patent: October 1, 2002

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Masahiro Sueyoshi, Shuji Miyasaka, Tukuru Ishito, Takeshi Fujita, Takashi Katayama, Masaharu Matsumoto, Tuyoshi Nakamura, Eiji Otomura, Akihisa Kawamura
Method of determining model-specific factors for pattern recognition, in particular for speech patterns

Patent number: 6456969

Abstract: A method for recognizing a pattern that comprises a set of physical stimuli, said method comprising the steps of: providing a set of training observations and through applying a plurality of association models ascertaining various measuring values p_j(k|x), j = 1, ..., M, that each pertain to assigning a particular training observation to one or more associated pattern classes; setting up a log/linear association distribution by combining all association models of the plurality according to respective weight factors, and joining thereto a normalization quantity to produce a compound association distribution; optimizing said weight factors for thereby minimizing a detected error rate of the actual assigning to said compound distribution; recognizing target observations representing a target pattern with the help of said compound distribution.

Type: Grant

Filed: August 10, 1999

Date of Patent: September 24, 2002

Assignee: U.S. Philips Corporation

Inventor: Peter Beyerlein
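
The log/linear combination described in the abstract above can be sketched as follows, assuming the common form in which each model's score for class k is raised to its weight factor, the results are multiplied, and a normalization quantity Z(x) makes the combined values sum to one. The function name and data layout are illustrative assumptions. The weight-optimization step that minimizes the error rate is not shown, and every model score is assumed to be strictly positive.

```python
import math
from typing import Dict, List


def combine_log_linear(
    model_scores: List[Dict[str, float]], weights: List[float]
) -> Dict[str, float]:
    """Combine per-model class scores p_j(k|x) into one distribution.

    Computes exp(sum_j weight_j * log p_j(k|x)) / Z(x) for every class k,
    where Z(x) is the sum of the unnormalized values over all classes.
    """
    if not model_scores or len(model_scores) != len(weights):
        raise ValueError("need one weight per model and at least one model")
    classes = list(model_scores[0].keys())
    unnormalized = {
        k: math.exp(
            sum(w * math.log(scores[k]) for scores, w in zip(model_scores, weights))
        )
        for k in classes
    }
    z = sum(unnormalized.values())  # normalization quantity
    return {k: value / z for k, value in unnormalized.items()}


# Two hypothetical models scoring the same observation over classes "a" and "b".
p1 = {"a": 0.5, "b": 0.5}
p2 = {"a": 0.8, "b": 0.2}
combined = combine_log_linear([p1, p2], weights=[1.0, 1.0])
print(combined)
```

With equal weights of 1.0 the uninformative first model leaves the second model's preference for class "a" unchanged; other weights shift how much each model contributes.
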
Task automation user interface with text-to-speech output

Patent number: 6456973

Abstract: In a computer system adapted for text-to-speech playback, a method for instructing a user in performing a task having a plurality of steps can include retrieving a textual instruction from a location in an electronic storage device of the computer system. The textual instruction can correspond to one or more of the steps in the task. The textual instruction can be displayed in a task automation user interface, and a text-to-speech (TTS) conversion of the textual instruction can be executed. The steps can be repeated until all textual instructions corresponding to each step in the task have been retrieved and TTS converted.

Type: Grant

Filed: October 12, 1999

Date of Patent: September 24, 2002

Assignee: International Business Machines Corp.

Inventors: Frank Fado, Peter J. Guasti, Amado Nassiff, Harvey Ruback, Ronald E. VanBuskirk
Method and device for recognizing at least one keyword in spoken speech using a computer

Patent number: 6453293

Abstract: In order to recognize a keyword in spoken speech, the keyword is subdivided, just like a test pattern to be recognized, into segments. The individual segments of the keyword and of the test pattern are mapped onto one another and a comparison measurement is made and is accumulated over the segments. The keyword is stored repeatedly in a training phase, and a plurality of reference features determined for each segment of the keyword are stored. During recognition, a reference feature which best fits the relevant segment of the test pattern is assigned in a segment-wise fashion.

Type: Grant

Filed: November 15, 2000

Date of Patent: September 17, 2002

Assignee: Siemens Aktiengesellschaft

Inventor: Bernhard Kämmerer
Intelligent text-to-speech synthesis

Patent number: 6446040

Abstract: A method and an apparatus of synthesizing speech from a piece of input text 104. In one embodiment, the method includes the steps of retrieving the input text 104 entered into a computing system, and transforming the input text 104 based on the semantics 152 of at least one word of the input text 104 to generate a formatted text 108 for speech synthesis. The transforming includes adding an audio rendering effect to the input text based on the semantics of at least one word, the audio rendering effect comprising background music, special effect sounds, and context-sensitive sounds. In another embodiment, the transformation also depends on at least one characteristic of the person listening to the speech output 118. In yet another embodiment, the transformation further depends on at least one characteristic of the hardware employed by the user to listen to the speech output 118. The transformed text can be further modified to fit a text-to-speech engine to generate the speech output 118.

Type: Grant

Filed: June 17, 1998

Date of Patent: September 3, 2002

Assignee: Yahoo! Inc.

Inventors: Gudrun Socher, Mohan Vishwanath, Anurag Mendhekar
Method for the auditory navigation of text

Patent number: 6442523

Abstract: A method of navigating textual information via auditory indicators is provided. Through the pervasive and immediate articulation of virtually all textual elements in response to a comparatively passive action, such as a selection device rollover, a user may quickly peruse titles, headings, list items, and so on, as well as emphasized text, paragraphs, captions, and virtually any unit of visually contiguous text. A user may also hear a particular selected word via a slightly more active action, such as clicking a mouse button. Via use of this method, a child, or other user who may understand a language, but not be able to recognize its orthography, may successfully and easily navigate textual documents.

Type: Grant

Filed: May 16, 2000

Date of Patent: August 27, 2002

Inventor: Steven H. Siegel
Scalable audio coding/decoding method and apparatus

Patent number: 6438525

Abstract: A scalable audio coding/decoding method and apparatus are provided. The coding method includes the steps of (a) signal-processing input audio signals and quantizing the same for each predetermined coding band, (b) coding the quantized data corresponding to the base layer within a predetermined layer size, (c) coding the quantized data corresponding to the next enhancement layer of the coded base layer and the remaining quantized data uncoded and belonging to the enhancement layer, within a predetermined layer size, and (d) sequentially performing the layer coding steps for all layers, wherein the steps (b), (c) and (d) each include the steps of (e) representing the quantized data corresponding to a layer to be coded by digits of a predetermined same number, and (f) coding the most significant digit sequences composed of most significant digits of the magnitude data composing the represented digital data.

Type: Grant

Filed: July 7, 2000

Date of Patent: August 20, 2002

Assignee: Samsung Electronics Co., Ltd.

Inventor: Sung-hee Park
Multi-stage pitch and mixed voicing estimation for harmonic speech coders

Patent number: 6438517

Abstract: A “multi-stage” method of estimating pitch in a speech encoder (FIG. 2). In a first stage of the method, a set of candidate pitch values is selected, such as by using a cost function that operates on said speech signal (steps 21-23). In a second stage of the method, a best candidate is selected. Specifically, in the second stage, pitch values calculated from previous speech segments are used to calculate an average pitch value (step 25). Then, depending on whether the average pitch value is short or long, one of two different analysis-by-synthesis (ABS) processes is then repeated for each candidate, such that for each iteration, a synthesized signal is derived from that pitch candidate and compared to a reference signal to provide an error value. A time domain ABS process is used if the average pitch is short (step 27), whereas a frequency domain ABS process is used if the average pitch is long (step 28).

Type: Grant

Filed: April 27, 2000

Date of Patent: August 20, 2002

Assignee: Texas Instruments Incorporated

Inventor: Suat Yeldener
System and method for referencing object instances and invoking methods on those object instances from within a speech recognition grammar

Patent number: 6434529

Abstract: A system and method for referencing object instances of an application program, and invoking methods on those object instances from within a recognition grammar. A mapping is maintained between at least one string formed using characters in the character set of the recognition grammar and instances of objects in the application program. During operation of the disclosed system, when either the application program or script within a recognition grammar creates an application object instance, a reference to the object instance is added to the mapping table, together with an associated unique string. The unique string may then be used within scripting language in tags of the rule grammar, in order to refer to the object instance that has been “registered” by the application program in this way. A tags parser program may be used to interpret such object instance names while interpreting the scripting language contained in tags included in a recognition result object.

Type: Grant

Filed: February 16, 2000

Date of Patent: August 13, 2002

Assignee: Sun Microsystems, Inc.

Inventors: William D. Walker, Andrew J. Hunt, Stuart J. Adams
Automatically determining words for updating in a pronunciation dictionary in a speech recognition system

Patent number: 6434521

Abstract: An approach for automatically determining the accuracy of a pronunciation dictionary in a speech recognition system involves comparing an expected pronunciation representation for a particular word from a pronunciation dictionary to one or more actual pronunciations of the particular word. An accuracy score for each of the phonemes that constitute the pronunciation of the particular word is determined from the comparison of the expected and actual pronunciations for the particular word. The accuracy score is evaluated against specified accuracy criteria to determine whether the expected pronunciation for the particular word satisfies the specified accuracy criteria. If the expected pronunciation does not satisfy the specified accuracy criteria for the particular word, then the expected pronunciation for the particular word in the pronunciation dictionary is identified as requiring updating.

Type: Grant

Filed: June 24, 1999

Date of Patent: August 13, 2002

Assignee: SpeechWorks International, Inc.

Inventor: Etienne Barnard
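
The comparison described in the abstract above (per-phoneme accuracy scores checked against accuracy criteria) can be illustrated with a small sketch. Everything specific here is an assumption for illustration: the function names, the position-by-position comparison (a real system would presumably align the phoneme sequences first), the phoneme symbols, and the 0.5 threshold.

```python
from typing import List


def phoneme_accuracy(
    expected: List[str], actual_pronunciations: List[List[str]]
) -> List[float]:
    """Score each expected phoneme by the fraction of actual pronunciations
    that contain the same phoneme at the same position."""
    if not actual_pronunciations:
        raise ValueError("at least one actual pronunciation is required")
    scores = []
    for i, phoneme in enumerate(expected):
        matches = sum(
            1
            for actual in actual_pronunciations
            if i < len(actual) and actual[i] == phoneme
        )
        scores.append(matches / len(actual_pronunciations))
    return scores


def needs_update(
    expected: List[str],
    actual_pronunciations: List[List[str]],
    min_accuracy: float = 0.5,  # assumed accuracy criterion
) -> bool:
    """Flag the dictionary entry when any phoneme falls below the criterion."""
    scores = phoneme_accuracy(expected, actual_pronunciations)
    return any(score < min_accuracy for score in scores)


# Hypothetical dictionary entry and three observed pronunciations.
expected = ["t", "ah", "m", "ey", "t", "ow"]
observed = [
    ["t", "ah", "m", "ey", "t", "ow"],
    ["t", "ah", "m", "aa", "t", "ow"],
    ["t", "ah", "m", "aa", "t", "ow"],
]
print(phoneme_accuracy(expected, observed))
print(needs_update(expected, observed))
```

Here only the fourth phoneme disagrees with most observations, so the entry is flagged as requiring an update.
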
