Patents by Inventor Tang Xi Liu

Tang Xi Liu has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8463598
    Abstract: Methods, systems, and apparatus, including computer program products, in which data from web documents are partitioned into a training corpus and a development corpus are provided. First word probabilities for words are determined for the training corpus, and second word probabilities for the words are determined for the development corpus. Uncertainty values based on the word probabilities for the training corpus and the development corpus are compared, and new words are identified based on the comparison.
    Type: Grant
    Filed: January 28, 2011
    Date of Patent: June 11, 2013
    Assignee: Google Inc.
    Inventors: Jun Wu, Tang Xi Liu, Feng Hong, YongGang Wang, Bo Yang, Lei Zhang
  • Patent number: 8386240
    Abstract: Methods, systems, and apparatus, including computer program products, to identify topic words in a collection of documents that includes topic documents related to a topic are disclosed. A reference topic word divergence value based on a document collection and the topic document collection is determined. A candidate topic word divergence value for a candidate topic word is determined based on the document collection and the topic document collection. The candidate topic word is determined to be a topic word if the candidate topic word divergence value is greater than the reference topic word divergence value.
    Type: Grant
    Filed: June 10, 2011
    Date of Patent: February 26, 2013
    Assignee: Google Inc.
    Inventors: Jun Wu, Tang Xi Liu, Feng Hong, Yong-Gang Wang, Bo Yang, Lei Zhang
  • Publication number: 20110238413
    Abstract: Methods, systems, and apparatus, including computer program products, to identify topic words in a collection of documents that includes topic documents related to a topic are disclosed. A reference topic word divergence value based on a document collection and the topic document collection is determined. A candidate topic word divergence value for a candidate topic word is determined based on the document collection and the topic document collection. The candidate topic word is determined to be a topic word if the candidate topic word divergence value is greater than the reference topic word divergence value.
    Type: Application
    Filed: June 10, 2011
    Publication date: September 29, 2011
    Applicant: Google Inc.
    Inventors: Jun Wu, Tang Xi Liu, Feng Hong, YongGang Wang, Bo Yang, Lei Zhang
  • Patent number: 7983902
    Abstract: Methods, systems, and apparatus, including computer program products, to identify topic words in a document corpus that includes topic documents related to a topic are disclosed. A reference topic word divergence value based on the document corpus and the topic document corpus is determined. A candidate topic word divergence value for a candidate topic word is determined based on the document corpus and the topic document corpus. The candidate topic word is determined to be a topic word if the candidate topic word divergence value is greater than the reference topic word divergence value.
    Type: Grant
    Filed: August 23, 2007
    Date of Patent: July 19, 2011
    Assignee: Google Inc.
    Inventors: Jun Wu, Tang Xi Liu, Feng Hong, Yonggang Wang, Bo Yang, Lei Zhang
  • Publication number: 20110137642
    Abstract: Methods, systems, and apparatus, including computer program products, in which data from web documents are partitioned into a training corpus and a development corpus are provided. First word probabilities for words are determined for the training corpus, and second word probabilities for the words are determined for the development corpus. Uncertainty values based on the word probabilities for the training corpus and the development corpus are compared, and new words are identified based on the comparison.
    Type: Application
    Filed: January 28, 2011
    Publication date: June 9, 2011
    Applicant: Google Inc.
    Inventors: Jun Wu, Tang Xi Liu, Feng Hong, Yonggang Wang, Bo Yang, Lei Zhang
  • Patent number: 7917355
    Abstract: Methods, systems, and apparatus, including computer program products, in which data from web documents are partitioned into a training corpus and a development corpus are provided. First word probabilities for words are determined for the training corpus, and second word probabilities for the words are determined for the development corpus. Uncertainty values based on the word probabilities for the training corpus and the development corpus are compared, and new words are identified based on the comparison.
    Type: Grant
    Filed: August 23, 2007
    Date of Patent: March 29, 2011
    Assignee: Google Inc.
    Inventors: Jun Wu, Tang Xi Liu, Feng Hong, Yonggang Wang, Bo Yang, Lei Zhang
  • Publication number: 20090055168
    Abstract: Methods, systems, and apparatus, including computer program products, in which data from web documents are partitioned into a training corpus and a development corpus are provided. First word probabilities for words are determined for the training corpus, and second word probabilities for the words are determined for the development corpus. Uncertainty values based on the word probabilities for the training corpus and the development corpus are compared, and new words are identified based on the comparison.
    Type: Application
    Filed: August 23, 2007
    Publication date: February 26, 2009
    Applicant: Google Inc.
    Inventors: Jun Wu, Tang Xi Liu, Feng Hong, Yonggang Wang, Bo Yang, Lei Zhang
  • Publication number: 20090055381
    Abstract: Methods, systems, and apparatus, including computer program products, to identify topic words in a document corpus that includes topic documents related to a topic are disclosed. A reference topic word divergence value based on the document corpus and the topic document corpus is determined. A candidate topic word divergence value for a candidate topic word is determined based on the document corpus and the topic document corpus. The candidate topic word is determined to be a topic word if the candidate topic word divergence value is greater than the reference topic word divergence value.
    Type: Application
    Filed: August 23, 2007
    Publication date: February 26, 2009
    Applicant: Google Inc.
    Inventors: Jun Wu, Tang Xi Liu, Feng Hong, Yonggang Wang, Bo Yang, Lei Zhang
  • Publication number: 20080319738
    Abstract: A word corpus is identified and a word probability value is associated with each word in the word corpus. A sentence is identified, candidate segmentations of the sentence are determined based on the word corpus, and the associated probability value for each word in the word corpus is iteratively adjusted based on the probability values associated with the words and the candidate segmentations.
    Type: Application
    Filed: October 10, 2007
    Publication date: December 25, 2008
    Inventors: Tang Xi Liu, Xianping Ge
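
Illustrative sketch (not taken from the filings): the abstract of publication 20080319738 describes assigning a probability to each word, enumerating candidate segmentations of a sentence, and iteratively adjusting the probabilities. One plausible reading is an expectation-maximization style update over a unigram model, sketched below in Python. The abstract does not state the update rule, so the function names (segmentations, em_step), the toy vocabulary, and the normalization are all assumptions made for illustration.

```python
from collections import defaultdict
from math import prod


def segmentations(sentence, vocab):
    """Yield every way to split `sentence` into words drawn from `vocab`."""
    if not sentence:
        yield []
        return
    for end in range(1, len(sentence) + 1):
        head = sentence[:end]
        if head in vocab:
            for rest in segmentations(sentence[end:], vocab):
                yield [head] + rest


def em_step(sentences, probs):
    """One EM-style update of word probabilities (assumed rule, see lead-in)."""
    expected = defaultdict(float)
    for sentence in sentences:
        candidates = list(segmentations(sentence, probs))
        # Score each candidate segmentation as the product of its word probabilities.
        weights = [prod(probs[w] for w in seg) for seg in candidates]
        total = sum(weights)
        if total == 0:
            continue  # no usable segmentation under the current probabilities
        # Credit each word with the posterior weight of the segmentations it appears in.
        for seg, weight in zip(candidates, weights):
            for word in seg:
                expected[word] += weight / total
    norm = sum(expected.values())
    if norm == 0:
        return dict(probs)
    # Renormalize expected counts so the probabilities sum to 1.
    return {word: expected[word] / norm for word in probs}


# Hypothetical toy data: three "words" and three short "sentences".
probs = {"a": 0.4, "b": 0.4, "ab": 0.2}
for _ in range(10):
    probs = em_step(["ab", "ab", "a"], probs)
```

Enumerating every segmentation is exponential in sentence length, so this form only suits short toy inputs; a practical version would likely use dynamic programming over word boundaries instead.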