Patents by Inventor Chang-Ning Huang
Chang-Ning Huang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7974963
Abstract: A method, computer readable medium and system are provided which retrieve confirming sentences from a sentence database in response to a query. A search engine retrieves confirming sentences from the sentence database in response to the query. In retrieving the confirming sentences, the search engine defines indexing units based upon the query, with the indexing units including both lemma from the query and extended indexing units associated with the query. The search engine then retrieves a plurality of sentences from the sentence database using the defined indexing units as search parameters. A similarity between each of the plurality of retrieved sentences and the query is determined by the search engine, wherein each similarity is determined as a function of a linguistic weight of a term in the query. The search engine then ranks the plurality of retrieved sentences based upon the determined similarities.
Type: Grant
Filed: July 22, 2005
Date of Patent: July 5, 2011
Inventors: Ming Zhou, Hua Wu, Yue Zhang, Jianfeng Gao, Chang-Ning Huang
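The retrieve-then-rank flow described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the tokenizer stands in for lemmatization, extended indexing units are omitted, and the names `tokenize`, `build_index`, `retrieve_and_rank` and the weight values are all hypothetical. Similarity here is simply the weighted share of query terms found in a sentence, which is one possible "function of a linguistic weight of a term in the query".

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real lemmatization."""
    tokens = (tok.strip(".,;:!?").lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def build_index(sentences):
    """Map each indexing unit (here, a token) to the ids of sentences containing it."""
    index = defaultdict(set)
    for sid, sentence in enumerate(sentences):
        for unit in tokenize(sentence):
            index[unit].add(sid)
    return index


def retrieve_and_rank(query, sentences, index, weights, default_weight=1.0):
    """Retrieve sentences sharing a unit with the query, then rank by weighted overlap."""
    units = set(tokenize(query))
    candidates = set()
    for unit in units:
        candidates |= index.get(unit, set())
    total = sum(weights.get(u, default_weight) for u in units)
    scored = []
    for sid in candidates:
        sent_units = set(tokenize(sentences[sid]))
        matched = sum(weights.get(u, default_weight) for u in units & sent_units)
        scored.append((matched / total, sentences[sid]))
    # Highest similarity first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


sentences = [
    "She made a decision quickly.",
    "He made a cake.",
    "The decision was final.",
]
# Hypothetical linguistic weights: content words count more than function words.
weights = {"decision": 3.0, "made": 2.0, "a": 0.5}
index = build_index(sentences)
ranked = retrieve_and_rank("made a decision", sentences, index, weights)
for score, sentence in ranked:
    print(f"{score:.2f}  {sentence}")
```

With these assumed weights, the sentence containing "decision" alone outranks the one containing "made a", because the heavier term dominates the score.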
-
Patent number: 7672832
Abstract: A method is disclosed for providing a chunking utility that supports robust natural language processing. A corpus is chunked in accordance with a draft chunking specification. Chunk inconsistencies in the corpus are automatically flagged for resolution, and a chunking utility is provided in which at least some of the flagged inconsistencies are resolved. The chunking utility provides a single, consistent global chunking standard, ensuring compatibility among various applications. The chunking utility is particularly advantageous for non-alphabetic languages, such as Chinese.
Type: Grant
Filed: February 1, 2006
Date of Patent: March 2, 2010
Assignee: Microsoft Corporation
Inventors: Chang-Ning Huang, Hong-Qiao Li, Jianfeng Gao
-
Patent number: 7496501
Abstract: A system and method identify base noun phrases (baseNP) in a linguistic input. A part-of-speech tagger identifies N-best part-of-speech tag sequences corresponding to the linguistic input. A baseNP identifier identifies baseNPs in the linguistic input using a unified statistical model that identifies the baseNPs, given the N-best POS sequences.
Type: Grant
Filed: October 29, 2004
Date of Patent: February 24, 2009
Assignee: Microsoft Corporation
Inventors: Endong Xun, Ming Zhou, Chang-Ning Huang
-
Patent number: 7493251
Abstract: A method and apparatus for segmenting text is provided that identifies a sequence of entity types from a sequence of characters and thereby identifies a segmentation for the sequence of characters. Under the invention, the sequence of entity types is identified using probabilistic models that describe the likelihood of a sequence of entities and the likelihood of sequences of characters given particular entities. Under one aspect of the invention, organization name entities are identified from a first sequence of identified entities to form a final sequence of identified entities.
Type: Grant
Filed: May 30, 2003
Date of Patent: February 17, 2009
Assignee: Microsoft Corporation
Inventors: Jianfeng Gao, Mu Li, Chang-Ning Huang, Jian Sun, Lei Zhang, Ming Zhou
-
Patent number: 7343280
Abstract: The present invention deals with noisy data not by eliminating low frequency dependency structures, but rather by weighting the dependency structures. The dependency structures are weighted to give less weight to dependency structures which are more likely incorrect and to give more weight to dependency structures which are more likely correct.
Type: Grant
Filed: July 1, 2003
Date of Patent: March 11, 2008
Assignee: Microsoft Corporation
Inventors: Hua Wu, Ming Zhou, Chang-Ning Huang
-
Publication number: 20070282592
Abstract: A method is disclosed for providing a chunking utility that supports robust natural language processing. A corpus is chunked in accordance with a draft chunking specification. Chunk inconsistencies in the corpus are automatically flagged for resolution, and a chunking utility is provided in which at least some of the flagged inconsistencies are resolved. The chunking utility provides a single, consistent global chunking standard, ensuring compatibility among various applications. The chunking utility is particularly advantageous for non-alphabetic languages, such as Chinese.
Type: Application
Filed: February 1, 2006
Publication date: December 6, 2007
Applicant: Microsoft Corporation
Inventors: Chang-Ning Huang, Hong-Qiao Li, Jianfeng Gao
-
Publication number: 20070078644
Abstract: Segmentation error candidates are detected using segmentation variations found in an annotated corpus.
Type: Application
Filed: September 30, 2005
Publication date: April 5, 2007
Applicant: Microsoft Corporation
Inventors: Chang-Ning Huang, Jianfeng Gao, Mu Li
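The abstract does not say how segmentation variations are found, so the following is only an illustrative sketch under one assumption: a "variation" is a character string that the annotated corpus segments in more than one way. The function name `find_variations` and the `max_words` limit are hypothetical.

```python
from collections import defaultdict


def find_variations(corpus, max_words=3):
    """Return strings that are segmented in more than one way in the corpus.

    corpus is a list of sentences, each a list of words (the annotated
    segmentation). Spans of up to max_words adjacent words are compared.
    """
    seen = defaultdict(set)  # surface string -> set of segmentations (word tuples)
    for sentence in corpus:
        for start in range(len(sentence)):
            last = min(start + max_words, len(sentence))
            for end in range(start + 1, last + 1):
                span = tuple(sentence[start:end])
                seen["".join(span)].add(span)
    # Keep only strings with at least two distinct segmentations.
    return {s: segs for s, segs in seen.items() if len(segs) > 1}


corpus = [
    ["他", "是", "研究生"],
    ["研究", "生", "物"],
]
variations = find_variations(corpus)
for surface, segmentations in variations.items():
    print(surface, sorted(segmentations))
```

Each string returned is only a candidate: one of its segmentations may be an annotation error, or both may be correct in their contexts, so a later review step would still be needed.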
-
Patent number: 7194455
Abstract: A method, computer readable medium and system are provided which retrieve confirming sentences from a sentence database in response to a query. A search engine retrieves confirming sentences from the sentence database in response to the query. In retrieving the confirming sentences, the search engine defines indexing units based upon the query, with the indexing units including both lemma from the query and extended indexing units associated with the query. The search engine then retrieves a plurality of sentences from the sentence database using the defined indexing units as search parameters. A similarity between each of the plurality of retrieved sentences and the query is determined by the search engine, wherein each similarity is determined as a function of a linguistic weight of a term in the query. The search engine then ranks the plurality of retrieved sentences based upon the determined similarities.
Type: Grant
Filed: September 19, 2002
Date of Patent: March 20, 2007
Assignee: Microsoft Corporation
Inventors: Ming Zhou, Hua Wu, Yue Zhang, Jianfeng Gao, Chang-Ning Huang
-
Publication number: 20050273318
Abstract: A method, computer readable medium and system are provided which retrieve confirming sentences from a sentence database in response to a query. A search engine retrieves confirming sentences from the sentence database in response to the query. In retrieving the confirming sentences, the search engine defines indexing units based upon the query, with the indexing units including both lemma from the query and extended indexing units associated with the query. The search engine then retrieves a plurality of sentences from the sentence database using the defined indexing units as search parameters. A similarity between each of the plurality of retrieved sentences and the query is determined by the search engine, wherein each similarity is determined as a function of a linguistic weight of a term in the query. The search engine then ranks the plurality of retrieved sentences based upon the determined similarities.
Type: Application
Filed: July 22, 2005
Publication date: December 8, 2005
Applicant: Microsoft Corporation
Inventors: Ming Zhou, Hua Wu, Yue Zhang, Jianfeng Gao, Chang-Ning Huang
-
Patent number: 6904402
Abstract: A method for optimizing a language model is presented comprising developing an initial language model from a lexicon and segmentation derived from a received corpus using a maximum match technique, and iteratively refining the initial language model by dynamically updating the lexicon and re-segmenting the corpus according to statistical principles until a threshold of predictive capability is achieved.
Type: Grant
Filed: June 30, 2000
Date of Patent: June 7, 2005
Assignee: Microsoft Corporation
Inventors: Hai-Feng Wang, Chang-Ning Huang, Kai-Fu Lee, Shuo Di, Jianfeng Gao, Dong-Feng Cai, Lee-Feng Chien
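The "maximum match technique" named in this abstract is commonly implemented as greedy forward maximum matching. The sketch below shows only that initial segmentation step, assuming a small hand-made lexicon; the function name `max_match` and the sample lexicon are hypothetical, and the iterative refinement of the language model is not covered.

```python
def max_match(text, lexicon, max_len=None):
    """Greedy forward maximum matching: take the longest lexicon word at each position."""
    if max_len is None:
        # Longest lexicon entry bounds the search window (at least 1 character).
        max_len = max(1, max((len(w) for w in lexicon), default=1))
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


lexicon = {"研究", "研究生", "生命", "起源", "的"}
segments = max_match("研究生命的起源", lexicon)
print(segments)  # ['研究生', '命', '的', '起源']
```

The greedy choice of "研究生" here blocks the reading "研究 / 生命 / 的 / 起源", which shows why an initial maximum-match segmentation may need re-segmentation later.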
-
Publication number: 20050071149
Abstract: A system and method identify base noun phrases (baseNP) in a linguistic input. A part-of-speech tagger identifies N-best part-of-speech tag sequences corresponding to the linguistic input. A baseNP identifier identifies baseNPs in the linguistic input using a unified statistical model that identifies the baseNPs, given the N-best POS sequences.
Type: Application
Filed: October 29, 2004
Publication date: March 31, 2005
Applicant: Microsoft Corporation
Inventors: Endong Xun, Ming Zhou, Chang-Ning Huang
-
Publication number: 20050071148
Abstract: The present invention relates to a corpus for use in training a language model. The corpus includes a plurality of characters and a plurality of morphological tags associated with a plurality of sequences of characters. The plurality of morphological tags indicate a morphological type of an associated sequence of characters and a combination of parts forming a morphological subtype.
Type: Application
Filed: September 15, 2003
Publication date: March 31, 2005
Applicant: Microsoft Corporation
Inventors: Chang-Ning Huang, Jianfeng Gao, Mu Li, Ashley Chang
-
Patent number: 6859771
Abstract: A system and method identify base noun phrases (baseNP) in a linguistic input. A part-of-speech tagger identifies N-best part-of-speech tag sequences corresponding to the linguistic input. A baseNP identifier identifies baseNPs in the linguistic input using a unified statistical model that identifies the baseNPs, given the N-best POS sequences.
Type: Grant
Filed: June 4, 2001
Date of Patent: February 22, 2005
Assignee: Microsoft Corporation
Inventors: Endong Xun, Ming Zhou, Chang-Ning Huang
-
Publication number: 20050004790
Abstract: The present invention deals with noisy data not by eliminating low frequency dependency structures, but rather by weighting the dependency structures. The dependency structures are weighted to give less weight to dependency structures which are more likely incorrect and to give more weight to dependency structures which are more likely correct.
Type: Application
Filed: July 1, 2003
Publication date: January 6, 2005
Applicant: Microsoft Corporation
Inventors: Hua Wu, Ming Zhou, Chang-Ning Huang
-
Publication number: 20040243408
Abstract: A method and apparatus for segmenting text is provided that identifies a sequence of entity types from a sequence of characters and thereby identifies a segmentation for the sequence of characters. Under the invention, the sequence of entity types is identified using probabilistic models that describe the likelihood of a sequence of entities and the likelihood of sequences of characters given particular entities. Under one aspect of the invention, organization name entities are identified from a first sequence of identified entities to form a final sequence of identified entities.
Type: Application
Filed: May 30, 2003
Publication date: December 2, 2004
Applicant: Microsoft Corporation
Inventors: Jianfeng Gao, Mu Li, Chang-Ning Huang, Jian Sun, Lei Zhang, Ming Zhou
-
Publication number: 20040210434
Abstract: A method for optimizing a language model is presented comprising developing an initial language model from a lexicon and segmentation derived from a received corpus using a maximum match technique, and iteratively refining the initial language model by dynamically updating the lexicon and re-segmenting the corpus according to statistical principles until a threshold of predictive capability is achieved.
Type: Application
Filed: May 10, 2004
Publication date: October 21, 2004
Applicant: Microsoft Corporation
Inventors: Hai-Feng Wang, Chang-Ning Huang, Kai-Fu Lee, Shuo Di, Jianfeng Gao, Dong-Feng Cai, Lee-Feng Chien
-
Publication number: 20040059718
Abstract: A method, computer readable medium and system are provided which retrieve confirming sentences from a sentence database in response to a query. A search engine retrieves confirming sentences from the sentence database in response to the query. In retrieving the confirming sentences, the search engine defines indexing units based upon the query, with the indexing units including both lemma from the query and extended indexing units associated with the query. The search engine then retrieves a plurality of sentences from the sentence database using the defined indexing units as search parameters. A similarity between each of the plurality of retrieved sentences and the query is determined by the search engine, wherein each similarity is determined as a function of a linguistic weight of a term in the query. The search engine then ranks the plurality of retrieved sentences based upon the determined similarities.
Type: Application
Filed: September 19, 2002
Publication date: March 25, 2004
Inventors: Ming Zhou, Hua Wu, Yue Zhang, Jianfeng Gao, Chang-Ning Huang
-
Publication number: 20040002848
Abstract: The present invention performs machine translation by matching fragments of a source language sentence to be translated to source language portions of an example in example base. When all relevant examples have been identified in the example base, the examples are subjected to phrase alignment in which fragments of the target language sentence in each example are aligned against the matched fragments of the source language sentence in the same example. A translation component then substitutes the aligned target language phrases from the matched examples for the matched fragments in the source language sentence.
Type: Application
Filed: June 28, 2002
Publication date: January 1, 2004
Inventors: Ming Zhou, Jin-Xia Huang, Chang Ning Huang, Wei Wang
-
Publication number: 20030014238
Abstract: A system and method identify base noun phrases (baseNP) in a linguistic input. A part-of-speech tagger identifies N-best part-of-speech tag sequences corresponding to the linguistic input. A baseNP identifier identifies baseNPs in the linguistic input using a unified statistical model that identifies the baseNPs, given the N-best POS sequences.
Type: Application
Filed: June 4, 2001
Publication date: January 16, 2003
Inventors: Endong Xun, Ming Zhou, Chang-Ning Huang