Patents by Inventor Chang-Ning Huang

Chang-Ning Huang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7974963
    Abstract: A method, computer readable medium and system are provided which retrieve confirming sentences from a sentence database in response to a query. A search engine retrieves confirming sentences from the sentence database in response to the query. In retrieving the confirming sentences, the search engine defines indexing units based upon the query, with the indexing units including both lemma from the query and extended indexing units associated with the query. The search engine then retrieves a plurality of sentences from the sentence database using the defined indexing units as search parameters. A similarity between each of the plurality of retrieved sentences and the query is determined by the search engine, wherein each similarity is determined as a function of a linguistic weight of a term in the query. The search engine then ranks the plurality of retrieved sentences based upon the determined similarities.
    Type: Grant
    Filed: July 22, 2005
    Date of Patent: July 5, 2011
    Inventors: Ming Zhou, Hua Wu, Yue Zhang, Jianfeng Gao, Chang-Ning Huang
  • Patent number: 7672832
    Abstract: A method is disclosed for providing a chunking utility that supports robust natural language processing. A corpus is chunked in accordance with a draft chunking specification. Chunk inconsistencies in the corpus are automatically flagged for resolution, and a chunking utility is provided in which at least some of the flagged inconsistencies are resolved. The chunking utility provides a single, consistent global chunking standard, ensuring compatibility among various applications. The chunking utility is particularly advantageous for non-alphabetic languages, such as Chinese.
    Type: Grant
    Filed: February 1, 2006
    Date of Patent: March 2, 2010
    Assignee: Microsoft Corporation
    Inventors: Chang-Ning Huang, Hong-Qiao Li, Jianfeng Gao
  • Patent number: 7496501
    Abstract: A system and method identify base noun phrases (baseNP) in a linguistic input. A part-of-speech tagger identifies N-best part-of-speech tag sequences corresponding to the linguistic input. A baseNP identifier identifies baseNPs in the linguistic input using a unified statistical model that identifies the baseNPs, given the N-best POS sequences.
    Type: Grant
    Filed: October 29, 2004
    Date of Patent: February 24, 2009
    Assignee: Microsoft Corporation
    Inventors: Endong Xun, Ming Zhou, Chang-Ning Huang
  • Patent number: 7493251
    Abstract: A method and apparatus for segmenting text is provided that identifies a sequence of entity types from a sequence of characters and thereby identifies a segmentation for the sequence of characters. Under the invention, the sequence of entity types is identified using probabilistic models that describe the likelihood of a sequence of entities and the likelihood of sequences of characters given particular entities. Under one aspect of the invention, organization name entities are identified from a first sequence of identified entities to form a final sequence of identified entities.
    Type: Grant
    Filed: May 30, 2003
    Date of Patent: February 17, 2009
    Assignee: Microsoft Corporation
    Inventors: Jianfeng Gao, Mu Li, Chang-Ning Huang, Jian Sun, Lei Zhang, Ming Zhou
  • Patent number: 7343280
    Abstract: The present invention deals with noisy data not by eliminating low frequency dependency structures, but rather by weighting the dependency structures. The dependency structures are weighted to give less weight to dependency structures which are more likely incorrect and to give more weight to dependency structures which are more likely correct.
    Type: Grant
    Filed: July 1, 2003
    Date of Patent: March 11, 2008
    Assignee: Microsoft Corporation
    Inventors: Hua Wu, Ming Zhou, Chang-Ning Huang
  • Publication number: 20070282592
    Abstract: A method is disclosed for providing a chunking utility that supports robust natural language processing. A corpus is chunked in accordance with a draft chunking specification. Chunk inconsistencies in the corpus are automatically flagged for resolution, and a chunking utility is provided in which at least some of the flagged inconsistencies are resolved. The chunking utility provides a single, consistent global chunking standard, ensuring compatibility among various applications. The chunking utility is particularly advantageous for non-alphabetic languages, such as Chinese.
    Type: Application
    Filed: February 1, 2006
    Publication date: December 6, 2007
    Applicant: Microsoft Corporation
    Inventors: Chang-Ning Huang, Hong-Qiao Li, Jianfeng Gao
  • Publication number: 20070078644
    Abstract: Segmentation error candidates are detected using segmentation variations found in an annotated corpus.
    Type: Application
    Filed: September 30, 2005
    Publication date: April 5, 2007
    Applicant: Microsoft Corporation
    Inventors: Chang-Ning Huang, Jianfeng Gao, Mu Li
  • Patent number: 7194455
    Abstract: A method, computer readable medium and system are provided which retrieve confirming sentences from a sentence database in response to a query. A search engine retrieves confirming sentences from the sentence database in response to the query. In retrieving the confirming sentences, the search engine defines indexing units based upon the query, with the indexing units including both lemma from the query and extended indexing units associated with the query. The search engine then retrieves a plurality of sentences from the sentence database using the defined indexing units as search parameters. A similarity between each of the plurality of retrieved sentences and the query is determined by the search engine, wherein each similarity is determined as a function of a linguistic weight of a term in the query. The search engine then ranks the plurality of retrieved sentences based upon the determined similarities.
    Type: Grant
    Filed: September 19, 2002
    Date of Patent: March 20, 2007
    Assignee: Microsoft Corporation
    Inventors: Ming Zhou, Hua Wu, Yue Zhang, Jianfeng Gao, Chang-Ning Huang
  • Publication number: 20050273318
    Abstract: A method, computer readable medium and system are provided which retrieve confirming sentences from a sentence database in response to a query. A search engine retrieves confirming sentences from the sentence database in response to the query. In retrieving the confirming sentences, the search engine defines indexing units based upon the query, with the indexing units including both lemma from the query and extended indexing units associated with the query. The search engine then retrieves a plurality of sentences from the sentence database using the defined indexing units as search parameters. A similarity between each of the plurality of retrieved sentences and the query is determined by the search engine, wherein each similarity is determined as a function of a linguistic weight of a term in the query. The search engine then ranks the plurality of retrieved sentences based upon the determined similarities.
    Type: Application
    Filed: July 22, 2005
    Publication date: December 8, 2005
    Applicant: Microsoft Corporation
    Inventors: Ming Zhou, Hua Wu, Yue Zhang, Jianfeng Gao, Chang-Ning Huang
  • Patent number: 6904402
    Abstract: A method for optimizing a language model is presented comprising developing an initial language model from a lexicon and segmentation derived from a received corpus using a maximum match technique, and iteratively refining the initial language model by dynamically updating the lexicon and re-segmenting the corpus according to statistical principles until a threshold of predictive capability is achieved.
    Type: Grant
    Filed: June 30, 2000
    Date of Patent: June 7, 2005
    Assignee: Microsoft Corporation
    Inventors: Hai-Feng Wang, Chang-Ning Huang, Kai-Fu Lee, Shuo Di, Jianfeng Gao, Dong-Feng Cai, Lee-Feng Chien
  • Publication number: 20050071149
    Abstract: A system and method identify base noun phrases (baseNP) in a linguistic input. A part-of-speech tagger identifies N-best part-of-speech tag sequences corresponding to the linguistic input. A baseNP identifier identifies baseNPs in the linguistic input using a unified statistical model that identifies the baseNPs, given the N-best POS sequences.
    Type: Application
    Filed: October 29, 2004
    Publication date: March 31, 2005
    Applicant: Microsoft Corporation
    Inventors: Endong Xun, Ming Zhou, Chang-Ning Huang
  • Publication number: 20050071148
    Abstract: The present invention relates to a corpus for use in training a language model. The corpus includes a plurality of characters and a plurality of morphological tags associated with a plurality of sequences of characters. The plurality of morphological tags indicate a morphological type of an associated sequence of characters and a combination of parts forming a morphological subtype.
    Type: Application
    Filed: September 15, 2003
    Publication date: March 31, 2005
    Applicant: Microsoft Corporation
    Inventors: Chang-Ning Huang, Jianfeng Gao, Mu Li, Ashley Chang
  • Patent number: 6859771
    Abstract: A system and method identify base noun phrases (baseNP) in a linguistic input. A part-of-speech tagger identifies N-best part-of-speech tag sequences corresponding to the linguistic input. A baseNP identifier identifies baseNPs in the linguistic input using a unified statistical model that identifies the baseNPs, given the N-best POS sequences.
    Type: Grant
    Filed: June 4, 2001
    Date of Patent: February 22, 2005
    Assignee: Microsoft Corporation
    Inventors: Endong Xun, Ming Zhou, Chang-Ning Huang
  • Publication number: 20050004790
    Abstract: The present invention deals with noisy data not by eliminating low frequency dependency structures, but rather by weighting the dependency structures. The dependency structures are weighted to give less weight to dependency structures which are more likely incorrect and to give more weight to dependency structures which are more likely correct.
    Type: Application
    Filed: July 1, 2003
    Publication date: January 6, 2005
    Applicant: Microsoft Corporation
    Inventors: Hua Wu, Ming Zhou, Chang-Ning Huang
  • Publication number: 20040243408
    Abstract: A method and apparatus for segmenting text is provided that identifies a sequence of entity types from a sequence of characters and thereby identifies a segmentation for the sequence of characters. Under the invention, the sequence of entity types is identified using probabilistic models that describe the likelihood of a sequence of entities and the likelihood of sequences of characters given particular entities. Under one aspect of the invention, organization name entities are identified from a first sequence of identified entities to form a final sequence of identified entities.
    Type: Application
    Filed: May 30, 2003
    Publication date: December 2, 2004
    Applicant: Microsoft Corporation
    Inventors: Jianfeng Gao, Mu Li, Chang-Ning Huang, Jian Sun, Lei Zhang, Ming Zhou
  • Publication number: 20040210434
    Abstract: A method for optimizing a language model is presented comprising developing an initial language model from a lexicon and segmentation derived from a received corpus using a maximum match technique, and iteratively refining the initial language model by dynamically updating the lexicon and re-segmenting the corpus according to statistical principles until a threshold of predictive capability is achieved.
    Type: Application
    Filed: May 10, 2004
    Publication date: October 21, 2004
    Applicant: Microsoft Corporation
    Inventors: Hai-Feng Wang, Chang-Ning Huang, Kai-Fu Lee, Shuo Di, Jianfeng Gao, Dong-Feng Cai, Lee-Feng Chien
  • Publication number: 20040059718
    Abstract: A method, computer readable medium and system are provided which retrieve confirming sentences from a sentence database in response to a query. A search engine retrieves confirming sentences from the sentence database in response to the query. In retrieving the confirming sentences, the search engine defines indexing units based upon the query, with the indexing units including both lemma from the query and extended indexing units associated with the query. The search engine then retrieves a plurality of sentences from the sentence database using the defined indexing units as search parameters. A similarity between each of the plurality of retrieved sentences and the query is determined by the search engine, wherein each similarity is determined as a function of a linguistic weight of a term in the query. The search engine then ranks the plurality of retrieved sentences based upon the determined similarities.
    Type: Application
    Filed: September 19, 2002
    Publication date: March 25, 2004
    Inventors: Ming Zhou, Hua Wu, Yue Zhang, Jianfeng Gao, Chang-Ning Huang
  • Publication number: 20040002848
    Abstract: The present invention performs machine translation by matching fragments of a source language sentence to be translated to source language portions of an example in an example base. When all relevant examples have been identified in the example base, the examples are subjected to phrase alignment in which fragments of the target language sentence in each example are aligned against the matched fragments of the source language sentence in the same example. A translation component then substitutes the aligned target language phrases from the matched examples for the matched fragments in the source language sentence.
    Type: Application
    Filed: June 28, 2002
    Publication date: January 1, 2004
    Inventors: Ming Zhou, Jin-Xia Huang, Chang-Ning Huang, Wei Wang
  • Publication number: 20030014238
    Abstract: A system and method identify base noun phrases (baseNP) in a linguistic input. A part-of-speech tagger identifies N-best part-of-speech tag sequences corresponding to the linguistic input. A baseNP identifier identifies baseNPs in the linguistic input using a unified statistical model that identifies the baseNPs, given the N-best POS sequences.
    Type: Application
    Filed: June 4, 2001
    Publication date: January 16, 2003
    Inventors: Endong Xun, Ming Zhou, Chang-Ning Huang
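
The abstract for patent 6904402 above mentions deriving an initial segmentation "using a maximum match technique". The listing does not say which variant is meant, so the following is only a minimal sketch of the textbook forward (greedy, longest-first) version, not the patented method. The names `max_match`, `lexicon`, and `max_len` are hypothetical and chosen for illustration.

```python
def max_match(text, lexicon, max_len=None):
    """Greedy forward maximum matching (illustrative sketch only).

    At each position, take the longest substring found in `lexicon`;
    if none matches, fall back to a single character.
    """
    if max_len is None:
        # Assumption: the longest lexicon entry bounds the search window.
        max_len = max((len(word) for word in lexicon), default=1)

    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


if __name__ == "__main__":
    # Toy lexicon, invented for this example.
    words = {"the", "table", "down", "there", "theta", "bled", "own"}
    print(max_match("thetabledownthere", words))
    # prints ['theta', 'bled', 'own', 'there']
```

The toy run shows the usual weakness of greedy matching: it prefers "theta" over "the" and so misses "the table down there". Errors of this kind are one plausible reason an initial segmentation would be refined iteratively afterwards, as the abstract describes.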