Patents by Inventor Chi-Lin Shih
Chi-Lin Shih has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7149690
Abstract: A method and apparatus for interactive language instruction is provided that displays text files for processing, provides key features and functions for interactive learning, displays facial animation, and provides a workspace for language building functions. The system includes a stored set of language rules as part of the text-to-speech sub-system, as well as another stored set of rules as applied to the process of learning a language. The method implemented by the system includes digitally converting text to audible speech, providing the audible speech to a user or student (with the aid of an animated image in selected circumstances), prompting the student to replicate the audible speech, comparing the student's replication with the audible speech provided by the system, and providing feedback and reinforcement to the student by, for example, selectively recording or playing back the audible speech and the student's replication.
Type: Grant
Filed: September 9, 1999
Date of Patent: December 12, 2006
Assignee: Lucent Technologies Inc.
Inventors: Katherine Grace August, Nadine Blackwood, Qi P. Li, Michelle McNerney, Chi-Lin Shih, Arun Chandrasekaran Surendran, Jialin Zhong, Qiru Zhou
-
Patent number: 6856958
Abstract: Techniques are described for employing a set of tags to model phenomena which are smooth and subject to constraints. Tags may be used to model, for example, muscular movement producing speech. In one advantageous application, a set of tags defining prosodic characteristics is developed, and selected tags are placed in appropriate locations of a body of text. Each tag defines a constraint on the prosodic characteristics of speech produced by processing the text. Processing of the body of speech and the tags produces a set of equations which are solved to produce a curve defining prosodic characteristics over the scope of a phrase, and a further set of equations which are solved to produce a curve defining prosodic characteristics of individual words within a phrase. The data defined by the curves is used with the text to produce speech having the prosodic characteristics defined by the tags.
Type: Grant
Filed: April 30, 2001
Date of Patent: February 15, 2005
Assignee: Lucent Technologies Inc.
Inventors: Gregory P. Kochanski, Chi-Lin Shih
-
Patent number: 6813604
Abstract: A text to speech system modeling durational characteristics of a target speaker is addressed herein. A body of target speaker training text is selected having maximum possible information about speaker specific characteristics. The body of target speaker training text is read by a target speaker to produce a target speaker training corpus. A previously generated source model reflecting characteristics of a source model is retrieved and the target speaker training corpus is processed to produce modification parameters reflecting differences between durational characteristics of the target speaker and those predicted by the source model. The modification parameters are applied to the source model to produce a target model. Text inputs are processed using the target model to produce speech outputs reflecting durational characteristics of the target speaker.
Type: Grant
Filed: November 13, 2000
Date of Patent: November 2, 2004
Assignee: Lucent Technologies Inc.
Inventors: Chi-Lin Shih, Jan Pieter Hendrik van Santen
-
Patent number: 6810378
Abstract: A method and apparatus for synthesizing speech from text whereby the speech may be generated in a manner so as to effectively convey a particular, selectable style. Repeated patterns of one or more prosodic features—such as, for example, pitch, amplitude, spectral tilt, and/or duration—occurring at characteristic locations in the synthesized speech, are advantageously used to convey a particular chosen style. For example, one or more of such feature patterns may be used to define a particular speaking style, and an illustrative text-to-speech system then makes use of such a defined style to adjust the specified parameter or parameters of the synthesized speech in a non-uniform manner (i.e., in accordance with the defined feature pattern or patterns).
Type: Grant
Filed: September 24, 2001
Date of Patent: October 26, 2004
Assignee: Lucent Technologies Inc.
Inventors: Gregory P. Kochanski, Chi-Lin Shih
-
Publication number: 20030204569
Abstract: E-mail which may be infected by a computer virus is advantageously filtered by incorporating a “Reverse Turing Test” to verify that the source of a potentially infected e-mail is human and not a machine, and that the message was intentionally transmitted by the apparent sender. Such a test may, for example, involve asking a question which will be easy for a human to answer correctly but quite difficult for a machine to do so. The e-mail may be deemed to be potentially infected based on an analysis of executable code which is attached to the e-mail, or merely based on the fact that executable code is attached. The e-mail may also be deemed to be potentially infected based on additional factors, such as, for example, the identity of the sender and past experiences therewith. Spam e-mail may also be advantageously filtered together with virus-containing e-mail with use of a single common filtering system.
Type: Application
Filed: April 29, 2002
Publication date: October 30, 2003
Inventors: Michael R. Andrews, Gregory P. Kochanski, Daniel Philip Lopresti, Chi-Lin Shih
-
Patent number: 6625576
Abstract: A method and apparatus for performing text-to-speech conversion in a client/server environment partitions an otherwise conventional text-to-speech conversion algorithm into two portions: a first “text analysis” portion, which generates from an original input text an intermediate representation thereof; and a second “speech synthesis” portion, which synthesizes speech waveforms from the intermediate representation generated by the first portion (i.e., the text analysis portion). The text analysis portion of the algorithm is executed exclusively on a server while the speech synthesis portion is executed exclusively on a client which may be associated therewith. The client may comprise a hand-held device such as, for example, a cell phone, and the intermediate representation of the input text advantageously comprises at least a sequence of phonemes representative of the input text.
Type: Grant
Filed: January 29, 2001
Date of Patent: September 23, 2003
Assignee: Lucent Technologies Inc.
Inventors: Gregory P. Kochanski, Joseph Philip Olive, Chi-Lin Shih
-
Publication number: 20030078780
Abstract: A method and apparatus for synthesizing speech from text whereby the speech may be generated in a manner so as to effectively convey a particular, selectable style. Repeated patterns of one or more prosodic features—such as, for example, pitch, amplitude, spectral tilt, and/or duration—occurring at characteristic locations in the synthesized speech, are advantageously used to convey a particular chosen style. For example, one or more of such feature patterns may be used to define a particular speaking style, and an illustrative text-to-speech system then makes use of such a defined style to adjust the specified parameter or parameters of the synthesized speech in a non-uniform manner (i.e., in accordance with the defined feature pattern or patterns).
Type: Application
Filed: September 24, 2001
Publication date: April 24, 2003
Inventors: Gregory P. Kochanski, Chi-Lin Shih
-
Publication number: 20030028378
Abstract: A method and apparatus for interactive language instruction is provided that displays text files for processing, provides key features and functions for interactive learning, displays facial animation, and provides a workspace for language building functions. The system includes a stored set of language rules as part of the text-to-speech sub-system, as well as another stored set of rules as applied to the process of learning a language. The method implemented by the system includes digitally converting text to audible speech, providing the audible speech to a user or student (with the aid of an animated image in selected circumstances), prompting the student to replicate the audible speech, comparing the student's replication with the audible speech provided by the system, and providing feedback and reinforcement to the student by, for example, selectively recording or playing back the audible speech and the student's replication.
Type: Application
Filed: September 9, 1999
Publication date: February 6, 2003
Inventors: Katherine Grace August, Nadine Blackwood, Qi P. Li, Michelle McNerney, Chi-Lin Shih, Arun Chandrasekaran Surendran, Jialin Zhong, Qiru Zhou
-
Publication number: 20030009338
Abstract: Techniques are described for employing a set of tags to model phenomena which are smooth and subject to constraints. Tags may be used to model, for example, muscular movement producing speech. In one advantageous application, a set of tags defining prosodic characteristics is developed, and selected tags are placed in appropriate locations of a body of text. Each tag defines a constraint on the prosodic characteristics of speech produced by processing the text. Processing of the body of speech and the tags produces a set of equations which are solved to produce a curve defining prosodic characteristics over the scope of a phrase, and a further set of equations which are solved to produce a curve defining prosodic characteristics of individual words within a phrase. The data defined by the curves is used with the text to produce speech having the prosodic characteristics defined by the tags.
Type: Application
Filed: April 30, 2001
Publication date: January 9, 2003
Inventors: Gregory P. Kochanski, Chi-Lin Shih
-
Publication number: 20020103646
Abstract: A method and apparatus for performing text-to-speech conversion in a client/server environment partitions an otherwise conventional text-to-speech conversion algorithm into two portions: a first “text analysis” portion, which generates from an original input text an intermediate representation thereof; and a second “speech synthesis” portion, which synthesizes speech waveforms from the intermediate representation generated by the first portion (i.e., the text analysis portion). The text analysis portion of the algorithm is executed exclusively on a server while the speech synthesis portion is executed exclusively on a client which may be associated therewith. The client may comprise a hand-held device such as, for example, a cell phone, and the intermediate representation of the input text advantageously comprises at least a sequence of phonemes representative of the input text.
Type: Application
Filed: January 29, 2001
Publication date: August 1, 2002
Inventors: Gregory P. Kochanski, Joseph Philip Olive, Chi-Lin Shih
-
Patent number: 6272464
Abstract: Multiple, yet plausible, pronunciations of a proper name are generated based on one or more potential language origins of the name, and based further on the context in which the name is being spoken—namely, on characteristics of the population of potential speakers. Conventional techniques may be employed to identify likely candidates for the language origin of the name, and the characteristics of the speaker population on which the generation of the pronunciations is further based may comprise, for example, the national origin of the speakers, the purpose of the speech, the geographical location of the speakers, or the general level of sophistication of the speaker population.
Type: Grant
Filed: March 27, 2000
Date of Patent: August 7, 2001
Assignee: Lucent Technologies Inc.
Inventors: George A. Kiraz, Joseph Philip Olive, Chi-Lin Shih