Patents by Inventor Chi-Lin Shih

Chi-Lin Shih has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method and apparatus for interactive language instruction

Patent number: 7149690

Abstract: A method and apparatus for interactive language instruction is provided that displays text files for processing, provides key features and functions for interactive learning, displays facial animation, and provides a workspace for language building functions. The system includes a stored set of language rules as part of the text-to-speech sub-system, as well as another stored set of rules as applied to the process of learning a language. The method implemented by the system includes digitally converting text to audible speech, providing the audible speech to a user or student (with the aid of an animated image in selected circumstances), prompting the student to replicate the audible speech, comparing the student's replication with the audible speech provided by the system, and providing feedback and reinforcement to the student by, for example, selectively recording or playing back the audible speech and the student's replication.

Type: Grant

Filed: September 9, 1999

Date of Patent: December 12, 2006

Assignee: Lucent Technologies Inc.

Inventors: Katherine Grace August, Nadine Blackwood, Qi P. Li, Michelle McNerney, Chi-Lin Shih, Arun Chandrasekaran Surendran, Jialin Zhong, Qiru Zhou

Methods and apparatus for text to speech processing using language independent prosody markup

Patent number: 6856958

Abstract: Techniques are described for employing a set of tags to model phenomena which are smooth and subject to constraints. Tags may be used to model, for example, muscular movement producing speech. In one advantageous application, a set of tags defining prosodic characteristics is developed, and selected tags are placed in appropriate locations of a body of text. Each tag defines a constraint on the prosodic characteristics of speech produced by processing the text. Processing of the body of speech and the tags produces a set of equations which are solved to produce a curve defining prosodic characteristics over the scope of a phrase, and a further set of equations which are solved to produce a curve defining prosodic characteristics of individual words within a phrase. The data defined by the curves is used with the text to produce speech having the prosodic characteristics defined by the tags.

Type: Grant

Filed: April 30, 2001

Date of Patent: February 15, 2005

Assignee: Lucent Technologies Inc.

Inventors: Gregory P. Kochanski, Chi-Lin Shih

Methods and apparatus for speaker specific durational adaptation

Patent number: 6813604

Abstract: A text to speech system modeling durational characteristics of a target speaker is addressed herein. A body of target speaker training text is selected having maximum possible information about speaker specific characteristics. The body of target speaker training text is read by a target speaker to produce a target speaker training corpus. A previously generated source model reflecting characteristics of a source model is retrieved and the target speaker training corpus is processed to produce modification parameters reflecting differences between durational characteristics of the target speaker and those predicted by the source model. The modification parameters are applied to the source model to produce a target model. Text inputs are processed using the target model to produce speech outputs reflecting durational characteristics of the target speaker.

Type: Grant

Filed: November 13, 2000

Date of Patent: November 2, 2004

Assignee: Lucent Technologies Inc.

Inventors: Chi-Lin Shih, Jan Pieter Hendrik van Santen
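
The adaptation flow in the abstract above (source model, target speaker corpus, modification parameters, target model) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the abstract does not say what form the modification parameters take, so this sketch uses a simple per-phoneme scale factor, and the function names, phoneme labels, and durations are hypothetical.

```python
from collections import defaultdict


def estimate_modification(source_model, target_corpus):
    """Estimate per-phoneme scale factors (an assumed parameter form).

    source_model: dict mapping phoneme -> predicted duration in ms.
    target_corpus: list of (phoneme, observed duration in ms) pairs
    measured from the target speaker's recordings.
    """
    observed = defaultdict(list)
    for phoneme, duration_ms in target_corpus:
        observed[phoneme].append(duration_ms)
    # Ratio of the target speaker's mean duration to the source prediction.
    return {
        phoneme: (sum(durations) / len(durations)) / source_model[phoneme]
        for phoneme, durations in observed.items()
        if phoneme in source_model
    }


def adapt(source_model, modification, default_scale=1.0):
    """Apply the scale factors to the source model to get a target model.

    Phonemes not seen in the target corpus keep the source prediction.
    """
    return {
        phoneme: duration * modification.get(phoneme, default_scale)
        for phoneme, duration in source_model.items()
    }


# Hypothetical data for illustration only.
source_model = {"AH": 80.0, "S": 100.0, "T": 60.0}
target_corpus = [("AH", 100.0), ("AH", 92.0), ("S", 90.0)]

modification = estimate_modification(source_model, target_corpus)
target_model = adapt(source_model, modification)
```

In this toy data the target speaker lengthens "AH" and shortens "S", while "T" falls back to the source prediction because it never appears in the target corpus.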

Method and apparatus for controlling a speech synthesis system to provide multiple styles of speech

Patent number: 6810378

Abstract: A method and apparatus for synthesizing speech from text whereby the speech may be generated in a manner so as to effectively convey a particular, selectable style. Repeated patterns of one or more prosodic features—such as, for example, pitch, amplitude, spectral tilt, and/or duration—occurring at characteristic locations in the synthesized speech, are advantageously used to convey a particular chosen style. For example, one or more of such feature patterns may be used to define a particular speaking style, and an illustrative text-to-speech system then makes use of such a defined style to adjust the specified parameter or parameters of the synthesized speech in a non-uniform manner (i.e., in accordance with the defined feature pattern or patterns).

Type: Grant

Filed: September 24, 2001

Date of Patent: October 26, 2004

Assignee: Lucent Technologies Inc.

Inventors: Gregory P. Kochanski, Chi-Lin Shih

Method and apparatus for filtering e-mail infected with a previously unidentified computer virus

Publication number: 20030204569

Abstract: E-mail which may be infected by a computer virus is advantageously filtered by incorporating a “Reverse Turing Test” to verify that the source of a potentially infected e-mail is human and not a machine, and that the message was intentionally transmitted by the apparent sender. Such a test may, for example, involve asking a question which will be easy for a human to answer correctly but quite difficult for a machine to do so. The e-mail may be deemed to be potentially infected based on an analysis of executable code which is attached to the e-mail, or merely based on the fact that executable code is attached. The e-mail may also be deemed to be potentially infected based on additional factors, such as, for example, the identity of the sender and past experiences therewith. Spam e-mail may also be advantageously filtered together with virus-containing e-mail with use of a single common filtering system.

Type: Application

Filed: April 29, 2002

Publication date: October 30, 2003

Inventors: Michael R. Andrews, Gregory P. Kochanski, Daniel Philip Lopresti, Chi-Lin Shih

Method and apparatus for performing text-to-speech conversion in a client/server environment

Patent number: 6625576

Abstract: A method and apparatus for performing text-to-speech conversion in a client/server environment partitions an otherwise conventional text-to-speech conversion algorithm into two portions: a first “text analysis” portion, which generates from an original input text an intermediate representation thereof; and a second “speech synthesis” portion, which synthesizes speech waveforms from the intermediate representation generated by the first portion (i.e., the text analysis portion). The text analysis portion of the algorithm is executed exclusively on a server while the speech synthesis portion is executed exclusively on a client which may be associated therewith. The client may comprise a hand-held device such as, for example, a cell phone, and the intermediate representation of the input text advantageously comprises at least a sequence of phonemes representative of the input text.

Type: Grant

Filed: January 29, 2001

Date of Patent: September 23, 2003

Assignee: Lucent Technologies Inc.

Inventors: Gregory P. Kochanski, Joseph Philip Olive, Chi-Lin Shih
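
The server/client split described in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy lexicon, the JSON payload format, the function names, and the placeholder tone generator are all assumptions chosen only to show text analysis producing a phoneme sequence on one side and waveform synthesis consuming it on the other.

```python
import json
import math

# Hypothetical toy lexicon; a real system would use a full pronunciation model.
TOY_LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}


def server_text_analysis(text: str) -> str:
    """Server side: turn input text into a serialized phoneme sequence."""
    phonemes = []
    for word in text.lower().split():
        word = word.strip(".,!?")
        # Illustrative fallback: unknown words are spelled out letter by letter.
        phonemes.extend(TOY_LEXICON.get(word, list(word.upper())))
    return json.dumps({"phonemes": phonemes})


def client_speech_synthesis(payload: str, sample_rate: int = 8000) -> list:
    """Client side: render a placeholder waveform from the phoneme sequence."""
    phonemes = json.loads(payload)["phonemes"]
    samples = []
    samples_per_phoneme = sample_rate // 20  # 50 ms per phoneme
    for phoneme in phonemes:
        # Placeholder synthesis: one sine tone per phoneme, with a pitch
        # derived from the symbol. Real synthesis is far more involved.
        freq = 200.0 + 10.0 * (sum(map(ord, phoneme)) % 20)
        samples.extend(
            math.sin(2 * math.pi * freq * i / sample_rate)
            for i in range(samples_per_phoneme)
        )
    return samples


# Only the compact phoneme payload crosses the (simulated) network boundary.
payload = server_text_analysis("Hello, world!")
waveform = client_speech_synthesis(payload)
```

The design point the sketch tries to show is that the payload sent to the client is a short phoneme list rather than audio, which suits a low-bandwidth hand-held client.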

Method and apparatus for controlling a speech synthesis system to provide multiple styles of speech

Publication number: 20030078780

Abstract: A method and apparatus for synthesizing speech from text whereby the speech may be generated in a manner so as to effectively convey a particular, selectable style. Repeated patterns of one or more prosodic features—such as, for example, pitch, amplitude, spectral tilt, and/or duration—occurring at characteristic locations in the synthesized speech, are advantageously used to convey a particular chosen style. For example, one or more of such feature patterns may be used to define a particular speaking style, and an illustrative text-to-speech system then makes use of such a defined style to adjust the specified parameter or parameters of the synthesized speech in a non-uniform manner (i.e., in accordance with the defined feature pattern or patterns).

Type: Application

Filed: September 24, 2001

Publication date: April 24, 2003

Inventors: Gregory P. Kochanski, Chi-Lin Shih

Method and apparatus for interactive language instruction

Publication number: 20030028378

Abstract: A method and apparatus for interactive language instruction is provided that displays text files for processing, provides key features and functions for interactive learning, displays facial animation, and provides a workspace for language building functions. The system includes a stored set of language rules as part of the text-to-speech sub-system, as well as another stored set of rules as applied to the process of learning a language. The method implemented by the system includes digitally converting text to audible speech, providing the audible speech to a user or student (with the aid of an animated image in selected circumstances), prompting the student to replicate the audible speech, comparing the student's replication with the audible speech provided by the system, and providing feedback and reinforcement to the student by, for example, selectively recording or playing back the audible speech and the student's replication.

Type: Application

Filed: September 9, 1999

Publication date: February 6, 2003

Inventors: Katherine Grace August, Nadine Blackwood, Qi P. Li, Michelle McNerney, Chi-Lin Shih, Arun Chandrasekaran Surendran, Jialin Zhong, Qiru Zhou

Methods and apparatus for text to speech processing using language independent prosody markup

Publication number: 20030009338

Abstract: Techniques are described for employing a set of tags to model phenomena which are smooth and subject to constraints. Tags may be used to model, for example, muscular movement producing speech. In one advantageous application, a set of tags defining prosodic characteristics is developed, and selected tags are placed in appropriate locations of a body of text. Each tag defines a constraint on the prosodic characteristics of speech produced by processing the text. Processing of the body of speech and the tags produces a set of equations which are solved to produce a curve defining prosodic characteristics over the scope of a phrase, and a further set of equations which are solved to produce a curve defining prosodic characteristics of individual words within a phrase. The data defined by the curves is used with the text to produce speech having the prosodic characteristics defined by the tags.

Type: Application

Filed: April 30, 2001

Publication date: January 9, 2003

Inventors: Gregory P. Kochanski, Chi-Lin Shih

Method and apparatus for performing text-to-speech conversion in a client/server environment

Publication number: 20020103646

Abstract: A method and apparatus for performing text-to-speech conversion in a client/server environment partitions an otherwise conventional text-to-speech conversion algorithm into two portions: a first “text analysis” portion, which generates from an original input text an intermediate representation thereof; and a second “speech synthesis” portion, which synthesizes speech waveforms from the intermediate representation generated by the first portion (i.e., the text analysis portion). The text analysis portion of the algorithm is executed exclusively on a server while the speech synthesis portion is executed exclusively on a client which may be associated therewith. The client may comprise a hand-held device such as, for example, a cell phone, and the intermediate representation of the input text advantageously comprises at least a sequence of phonemes representative of the input text.

Type: Application

Filed: January 29, 2001

Publication date: August 1, 2002

Inventors: Gregory P. Kochanski, Joseph Philip Olive, Chi-Lin Shih

Method and apparatus for assembling a prediction list of name pronunciation variations for use during speech recognition

Patent number: 6272464

Abstract: Multiple, yet plausible, pronunciations of a proper name are generated based on one or more potential language origins of the name, and based further on the context in which the name is being spoken—namely, on characteristics of the population of potential speakers. Conventional techniques may be employed to identify likely candidates for the language origin of the name, and the characteristics of the speaker population on which the generation of the pronunciations is further based may comprise, for example, the national origin of the speakers, the purpose of the speech, the geographical location of the speakers, or the general level of sophistication of the speaker population.

Type: Grant

Filed: March 27, 2000

Date of Patent: August 7, 2001

Assignee: Lucent Technologies Inc.

Inventors: George A. Kiraz, Joseph Philip Olive, Chi-Lin Shih