Patents by Inventor Alpha Luk

Alpha Luk has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

System and method for automatically discovering a hierarchy of concepts from a corpus of documents

Patent number: 7085771

Abstract: The invention is a method, system and computer program for automatically discovering concepts from a corpus of documents and automatically generating a labeled concept hierarchy. The method involves extraction of signatures from the corpus of documents. The similarity between signatures is computed using a statistical measure. The frequency distribution of signatures is refined to alleviate any inaccuracy in the similarity measure. The signatures are also disambiguated to address the polysemy problem. The similarity measure is recomputed based on the refined frequency distribution and disambiguated signatures. The recomputed similarity measure reflects actual similarity between signatures. The recomputed similarity measure is then used for clustering related signatures. The signatures are clustered to generate concepts and concepts are arranged in a concept hierarchy. The concept hierarchy automatically generates query for a particular concept and retrieves relevant documents associated with the concept.
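The similarity and clustering steps named in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not specify the statistical measure or the clustering algorithm, so Jaccard overlap of document sets and threshold-based single-link grouping are assumed here as stand-ins, and the refinement, disambiguation, and hierarchy-building steps are omitted. The names doc_sets, jaccard, and cluster, and the sample corpus, are hypothetical.

```python
from itertools import combinations


def doc_sets(corpus, signatures):
    """Map each signature to the set of document indices that contain it."""
    return {
        s: {i for i, doc in enumerate(corpus) if s in doc.lower()}
        for s in signatures
    }


def jaccard(a, b):
    """Assumed similarity measure: overlap of two document-index sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(signatures, occurrences, threshold=0.5):
    """Single-link grouping: join signatures whose similarity meets the threshold."""
    parent = {s: s for s in signatures}

    def find(s):
        # Follow parent links to the group representative, shortening the path.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(signatures, 2):
        if jaccard(occurrences[a], occurrences[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in signatures:
        groups.setdefault(find(s), set()).add(s)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical four-document corpus and hand-picked signatures.
corpus = [
    "The bank approved the loan at a low interest rate.",
    "Interest on the loan is paid to the bank monthly.",
    "The river flooded the meadow after heavy rain.",
    "Heavy rain swelled the river overnight.",
]
signatures = ["bank", "loan", "interest", "river", "rain"]
occurrences = doc_sets(corpus, signatures)
concepts = cluster(signatures, occurrences)
print(concepts)
```

Each resulting group plays the role of a "concept" in the abstract's sense; a fuller treatment would recompute the similarity after refining frequencies and disambiguating signatures, then arrange the groups into a hierarchy.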

Type: Grant

Filed: May 17, 2002

Date of Patent: August 1, 2006

Assignee: Verity, Inc.

Inventors: Christina Yip Chung, Jinhui Liu, Alpha Luk, Jianchang Mao, Sumit Taank, Vamsi Vutukuru

Method and system for naming a cluster of words and phrases

Patent number: 7031909

Abstract: The present invention provides a method, system and computer program for naming a cluster, or a hierarchy of clusters, of words and phrases that have been extracted from a set of documents. The invention takes these clusters as the input and generates appropriate labels for the clusters using a lexical database. Naming involves first finding out all possible word senses for all the words in the cluster, using the lexical database; and then augmenting each word sense with words that are semantically similar to that word sense to form respective definition vectors. Thereafter, word sense disambiguation is done to find out the most relevant sense for each word. Definition vectors are clustered into groups. Each group represents a concept. These concepts are thereafter ranked based on their support. Finally, a pre-specified number of words and phrases from the definition vectors of the dominant concepts are selected as labels, based on their generality in the lexical database.
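As a rough illustration of the naming steps in the abstract, the sketch below disambiguates each word against the rest of the cluster and then ranks candidate label terms by support. It is a simplification under stated assumptions: LEXICON is a hypothetical toy lexical database rather than a real one, overlap counting stands in for the unspecified disambiguation procedure, and the grouping of definition vectors and the generality-based selection are replaced by plain term counting.

```python
from collections import Counter

# Hypothetical toy lexical database: word -> {sense id: definition vector}.
LEXICON = {
    "bank": {
        "bank.finance": {"institution", "money", "deposit", "loan"},
        "bank.river": {"slope", "river", "water", "shore"},
    },
    "loan": {
        "loan.finance": {"money", "debt", "institution", "interest"},
    },
    "interest": {
        "interest.finance": {"money", "debt", "percentage", "loan"},
        "interest.curiosity": {"attention", "curiosity", "hobby"},
    },
}


def disambiguate(cluster, lexicon):
    """For each word, pick the sense whose vector best overlaps the other words' senses."""
    chosen = {}
    for word in cluster:
        context = set()
        for other in cluster:
            if other != word:
                for vector in lexicon[other].values():
                    context |= vector
        # sorted() makes tie-breaking deterministic (first sense id alphabetically).
        chosen[word] = max(
            sorted(lexicon[word]),
            key=lambda sense: len(lexicon[word][sense] & context),
        )
    return chosen


def name_cluster(cluster, lexicon, k=1):
    """Return the chosen senses and the k terms with the highest support."""
    senses = disambiguate(cluster, lexicon)
    support = Counter()
    for word, sense in senses.items():
        support.update(lexicon[word][sense])
    # Rank by support (descending), then alphabetically for stable output.
    ranked = sorted(support, key=lambda term: (-support[term], term))
    return senses, ranked[:k]


senses, labels = name_cluster(["bank", "loan", "interest"], LEXICON)
print(senses)
print(labels)
```

In this toy data the financial senses reinforce one another, so the river sense of "bank" and the curiosity sense of "interest" are discarded before any label is chosen.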

Type: Grant

Filed: March 12, 2002

Date of Patent: April 18, 2006

Assignee: Verity, Inc.

Inventors: Jianchang Mao, Sumit Taank, Christina Chung, Alpha Luk

System and method for automatically discovering a hierarchy of concepts from a corpus of documents

Publication number: 20030217335

Abstract: The invention is a method, system and computer program for automatically discovering concepts from a corpus of documents and automatically generating a labeled concept hierarchy. The method involves extraction of signatures from the corpus of documents. The similarity between signatures is computed using a statistical measure. The frequency distribution of signatures is refined to alleviate any inaccuracy in the similarity measure. The signatures are also disambiguated to address the polysemy problem. The similarity measure is recomputed based on the refined frequency distribution and disambiguated signatures. The recomputed similarity measure reflects actual similarity between signatures. The recomputed similarity measure is then used for clustering related signatures. The signatures are clustered to generate concepts and concepts are arranged in a concept hierarchy. The concept hierarchy automatically generates query for a particular concept and retrieves relevant documents associated with the concept.

Type: Application

Filed: May 17, 2002

Publication date: November 20, 2003

Applicant: Verity, Inc.

Inventors: Christina Yip Chung, Jinhui Liu, Alpha Luk, Jianchang Mao, Sumit Taank, Vamsi Vutukuru

Method and system for naming a cluster of words and phrases

Publication number: 20030177000

Abstract: The present invention provides a method, system and computer program for naming a cluster, or a hierarchy of clusters, of words and phrases that have been extracted from a set of documents. The invention takes these clusters as the input and generates appropriate labels for the clusters using a lexical database. Naming involves first finding out all possible word senses for all the words in the cluster, using the lexical database; and then augmenting each word sense with words that are semantically similar to that word sense to form respective definition vectors. Thereafter, word sense disambiguation is done to find out the most relevant sense for each word. Definition vectors are clustered into groups. Each group represents a concept. These concepts are thereafter ranked based on their support. Finally, a pre-specified number of words and phrases from the definition vectors of the dominant concepts are selected as labels, based on their generality in the lexical database.

Type: Application

Filed: March 12, 2002

Publication date: September 18, 2003

Applicant: Verity, Inc.

Inventors: Jianchang Mao, Sumit Taank, Christina Chung, Alpha Luk