Patents Assigned to Verity, Inc.

Apparatus and method for parametric group processing

Patent number: 7461085

Abstract: A method of parametric group processing includes forming a parametric index from an indexed database. A first parametric group and a second parametric group corresponding to elements in the parametric index are specified. The first parametric group and the second parametric group are merged to produce a merged parametric group. A parametric result is extracted from the merged parametric group, where the parametric result specifies a set of documents.

Type: Grant

Filed: November 23, 2005

Date of Patent: December 2, 2008

Assignee: Verity, Inc.

Inventors: Neil Latarche, John Wang
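
Illustrative sketch (not taken from the patent): the abstract's flow of index, groups, merge and result can be pictured roughly as below. The function names, the (field, value) index layout and the use of set intersection as the merge rule are all assumptions made for this example.

```python
from collections import defaultdict


def build_parametric_index(docs):
    """Map each (field, value) pair to the set of document IDs that carry it."""
    index = defaultdict(set)
    for doc_id, fields in docs.items():
        for field, value in fields.items():
            index[(field, value)].add(doc_id)
    return index


def select_group(index, field, values):
    """Hypothetical 'parametric group': documents matching any listed value of a field."""
    group = set()
    for value in values:
        group |= index.get((field, value), set())
    return group


def merge_groups(first, second):
    """Merge two groups. Intersection is an assumed merge rule, not the patent's."""
    return first & second


docs = {
    1: {"color": "red", "size": "large"},
    2: {"color": "red", "size": "small"},
    3: {"color": "blue", "size": "large"},
}
index = build_parametric_index(docs)
red_group = select_group(index, "color", ["red"])
large_group = select_group(index, "size", ["large"])
# The parametric result is the set of documents left after merging.
result = merge_groups(red_group, large_group)
```
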
System and method for automatically discovering a hierarchy of concepts from a corpus of documents

Patent number: 7085771

Abstract: The invention is a method, system and computer program for automatically discovering concepts from a corpus of documents and automatically generating a labeled concept hierarchy. The method involves extraction of signatures from the corpus of documents. The similarity between signatures is computed using a statistical measure. The frequency distribution of signatures is refined to alleviate any inaccuracy in the similarity measure. The signatures are also disambiguated to address the polysemy problem. The similarity measure is recomputed based on the refined frequency distribution and disambiguated signatures. The recomputed similarity measure reflects actual similarity between signatures. The recomputed similarity measure is then used for clustering related signatures. The signatures are clustered to generate concepts and concepts are arranged in a concept hierarchy. The concept hierarchy automatically generates a query for a particular concept and retrieves relevant documents associated with the concept.

Type: Grant

Filed: May 17, 2002

Date of Patent: August 1, 2006

Assignee: Verity, Inc.

Inventors: Christina Yip Chung, Jinhui Liu, Alpha Luk, Jianchang Mao, Sumit Taank, Vamsi Vutukuru
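
Illustrative sketch (not taken from the patent): the clustering step in the abstract can be pictured as grouping signatures whose document occurrences overlap. Jaccard overlap as the statistical measure, the 0.5 threshold and single-link grouping are assumptions chosen for brevity; the refinement and disambiguation steps are not modeled.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two document-ID sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_signatures(occurrences, threshold=0.5):
    """Group signatures whose similarity meets the threshold (single-link, union-find)."""
    parent = {s: s for s in occurrences}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path compression
            s = parent[s]
        return s

    for a, b in combinations(occurrences, 2):
        if jaccard(occurrences[a], occurrences[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for s in occurrences:
        clusters.setdefault(find(s), set()).add(s)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical signatures mapped to the documents they occur in.
occurrences = {
    "java": {1, 2, 3},
    "jvm": {1, 2},
    "espresso": {4, 5},
    "latte": {4, 5, 6},
}
concepts = cluster_signatures(occurrences)
```

Each resulting cluster stands in for one concept; arranging concepts into a hierarchy would need a further step that this sketch omits.
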
Method and system for naming a cluster of words and phrases

Patent number: 7031909

Abstract: The present invention provides a method, system and computer program for naming a cluster, or a hierarchy of clusters, of words and phrases that have been extracted from a set of documents. The invention takes these clusters as the input and generates appropriate labels for the clusters using a lexical database. Naming involves first finding out all possible word senses for all the words in the cluster, using the lexical database; and then augmenting each word sense with words that are semantically similar to that word sense to form respective definition vectors. Thereafter, word sense disambiguation is done to find out the most relevant sense for each word. Definition vectors are clustered into groups. Each group represents a concept. These concepts are thereafter ranked based on their support. Finally, a pre-specified number of words and phrases from the definition vectors of the dominant concepts are selected as labels, based on their generality in the lexical database.

Type: Grant

Filed: March 12, 2002

Date of Patent: April 18, 2006

Assignee: Verity, Inc.

Inventors: Jianchang Mao, Sumit Taank, Christina Chung, Alpha Luk
Apparatus and method for parametric group processing

Patent number: 6999971

Abstract: A method of parametric group processing includes forming a parametric index from an indexed database. A first parametric group and a second parametric group corresponding to elements in the parametric index are specified. The first parametric group and the second parametric group are merged to produce a merged parametric group. A parametric result is extracted from the merged parametric group, where the parametric result specifies a set of documents.

Type: Grant

Filed: May 8, 2001

Date of Patent: February 14, 2006

Assignee: Verity, Inc.

Inventors: Neil Latarche, John Wang
Method and apparatus for determining classifier features with minimal supervision

Patent number: 6910026

Abstract: A method of identifying features for a classifier includes identifying a set of elements that share a common characteristic, and then identifying a subset of elements within that set which share a second characteristic. Features are then selected that are more commonly possessed by the elements in the subset than the elements in the set but excluding the subset, and that are more commonly possessed by the elements in the set but excluding the subset, as compared to the elements outside the set. A further method of identifying features for a classifier includes defining a list of features, selecting a first feature from that list, identifying a set of elements that possess that first feature, and then identifying a subset of elements within that set which possess any other feature.

Type: Grant

Filed: August 27, 2001

Date of Patent: June 21, 2005

Assignee: Verity, Inc.

Inventor: Alpha Kamchiu Luk
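
Illustrative sketch (not taken from the patent): the first selection rule in the abstract compares how often a feature appears in the subset, in the rest of the set, and outside the set. The data layout and the names `rate` and `select_features` are assumptions for this example.

```python
def rate(elements, ids, feature):
    """Fraction of the given elements that possess the feature."""
    ids = list(ids)
    if not ids:
        return 0.0
    return sum(feature in elements[i] for i in ids) / len(ids)


def select_features(elements, in_set, in_subset):
    """Keep features with rate(subset) > rate(set minus subset) > rate(outside set)."""
    rest = in_set - in_subset
    outside = set(elements) - in_set
    candidates = set().union(*elements.values())
    return sorted(
        f for f in candidates
        if rate(elements, in_subset, f) > rate(elements, rest, f) > rate(elements, outside, f)
    )


# Hypothetical elements (documents) and the features (words) they possess.
elements = {
    "a": {"goal", "match", "league"},
    "b": {"goal", "match"},
    "c": {"goal", "match"},
    "d": {"match"},
    "e": {"election"},
    "f": {"election", "match"},
}
in_set = {"a", "b", "c", "d"}   # share a first characteristic
in_subset = {"a", "b"}          # also share a second characteristic
selected = select_features(elements, in_set, in_subset)
```
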
System and method for ranking hyperlinked documents based on a stochastic backoff processes

Patent number: 6792419

Abstract: A system and method for ranking hyperlinked documents, such as web pages, is provided wherein a stochastic backoff process is used to rank those hyperlinked documents. In more detail, the stochastic process is derived from a random walk through the pages of the web. First, a directed graph may be generated from a crawl wherein the nodes are documents in the crawl and a directed edge from one node A to another node B indicates the presence of a hyperlink from the corresponding document docA to document docB. Using a stochastic backoff process on this graph, a weight between 0 and 1 is assigned to each document so that the documents may be ranked according to the weights.

Type: Grant

Filed: October 30, 2000

Date of Patent: September 14, 2004

Assignee: Verity, Inc.

Inventor: Prabhakar Raghavan
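
Illustrative sketch (not taken from the patent): one way to picture a random walk with backoff is a simulated surfer who either follows a hyperlink or steps back to the previously visited page. The function name, the fixed backoff probability, the restart rule for dead ends and the use of visit frequency as the weight are all assumptions.

```python
import random


def backoff_rank(graph, steps=20000, backoff_prob=0.3, seed=0):
    """Estimate a weight in [0, 1] per document from a simulated walk with backoff."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    visits = {n: 0 for n in nodes}
    current = rng.choice(nodes)
    history = []  # pages we can back off to, most recent last
    for _ in range(steps):
        visits[current] += 1
        out_links = graph[current]
        if history and (not out_links or rng.random() < backoff_prob):
            current = history.pop()           # back off to the previous page
        elif out_links:
            history.append(current)
            current = rng.choice(out_links)   # follow a hyperlink forward
        else:
            current = rng.choice(nodes)       # dead end with no history: restart (assumed)
    return {n: visits[n] / steps for n in nodes}


# Hypothetical crawl: an edge A -> B means document A links to document B.
graph = {
    "home": ["docs", "blog"],
    "docs": ["home", "pdf"],
    "blog": ["home", "docs"],
    "pdf": [],
}
weights = backoff_rank(graph)
ranking = sorted(weights, key=weights.get, reverse=True)
```
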
Method and apparatus for hierarchically decomposed bot scripts

Patent number: 6754647

Abstract: Method and apparatus are disclosed for the development and implementation of virtual robot's (bot's) directed natural language interaction with computer users. Bots employing the present invention base natural language interaction on a predefined universe of discourse that is decomposed hierarchically into domains. A data structure provides a storage area for each domain. The data structure may reflect the hierarchical decomposition. Domain topics containing program code directing the bot's interaction are placed in domain storage areas. Pattern lists associate words expected to be “heard” by the bot with particular domain topics. Domain topics are provided, as appropriate, to direct a user's attention toward the instant domain's parent, siblings, or children, with lower topics in the hierarchy getting higher preference. Domain censoring and domain tiebreakers improve usability.

Type: Grant

Filed: September 26, 2000

Date of Patent: June 22, 2004

Assignee: Verity, Inc.

Inventors: Walter Tackett, John B. Hodges, Scott Benson, D. Patrick Blair, Kate Boynton, Ray Dillinger, Martin Eggenberger, Tom Schofield
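
Illustrative sketch (not taken from the patent): the hierarchy of domains with pattern lists can be pictured as a tree in which the deepest matching domain wins, in the spirit of lower topics getting higher preference. The `Domain` class, its fields and `find_topic` are hypothetical names for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    name: str
    patterns: set                      # words the bot expects to "hear" for this domain
    children: list = field(default_factory=list)


def find_topic(domain, words, depth=0):
    """Return (depth, domain) for the deepest domain whose pattern list matches, or None."""
    best = (depth, domain) if domain.patterns & words else None
    for child in domain.children:
        candidate = find_topic(child, words, depth + 1)
        if candidate and (best is None or candidate[0] > best[0]):
            best = candidate
    return best


# Hypothetical universe of discourse, decomposed into nested domains.
pricing = Domain("pricing", {"price", "cost"})
search = Domain("search", {"search", "index"}, [pricing])
root = Domain("products", {"product", "buy"}, [search])

words = set("what does the search product cost".split())
match = find_topic(root, words)
```
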
Apparatus and method for adaptively ranking search results

Patent number: 6738764

Abstract: A method of ranking search results includes producing a relevance score for a document in view of a query. A similarity score is calculated for the query utilizing a feature vector that characterizes attributes and query words associated with the document. A rank value is assigned to the document based upon the relevance score and the similarity score.

Type: Grant

Filed: May 8, 2001

Date of Patent: May 18, 2004

Assignee: Verity, Inc.

Inventors: Jianchang Mao, Mani Abrol, Rajat Mukherjee, Michel Tourn, Prabhakar Raghavan
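
Illustrative sketch (not taken from the patent): a rank value that combines a relevance score with a query-to-document similarity score might look like the following. Cosine similarity, the linear blend and the 0.5 weight are assumptions; the abstract does not state how the two scores are combined.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_value(relevance, query, feature_vector, weight=0.5):
    """Blend the engine's relevance score with similarity to the document's feature vector."""
    similarity = cosine(Counter(query.lower().split()), feature_vector)
    return (1 - weight) * relevance + weight * similarity


# Hypothetical feature vectors built from query words associated with each document.
doc_a = {"verity": 1.0, "search": 1.0}
doc_b = {"cooking": 1.0}
score_a = rank_value(0.4, "Verity search", doc_a)
score_b = rank_value(0.9, "Verity search", doc_b)
```

In this example the document with the lower relevance score still ranks higher, because its feature vector matches the query words.
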
Method and apparatus for merging result lists from multiple search engines

Patent number: 6728704

Abstract: This invention includes the step of transmitting a query to a set of search engines. Any result lists returned from these search engines are received, and a subset of entries in each result list is selected. Each entry in this subset is assigned a scoring value according to a scoring function, and each result list is then assigned a representative value according to the scoring values assigned to its entries. A merged list of entries is produced based upon the representative value assigned to each result list.

Type: Grant

Filed: August 27, 2001

Date of Patent: April 27, 2004

Assignee: Verity, Inc.

Inventors: Jianchang Mao, Rajat Mukherjee, Prabhakar Raghavan, Panayiotis Tsaparas
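
Illustrative sketch (not taken from the patent): the steps in the abstract can be followed fairly directly. The scoring function (query terms found in a title), the mean as the representative value and the list-by-list merge order are assumptions; the abstract leaves them open.

```python
def merge_result_lists(result_lists, score, top_n=2):
    """Order whole result lists by the mean score of their top entries, then concatenate."""
    ranked_lists = []
    for engine, entries in result_lists.items():
        sample = entries[:top_n]          # subset of entries selected from this list
        if not sample:
            continue
        representative = sum(score(e) for e in sample) / len(sample)
        ranked_lists.append((representative, engine, entries))
    ranked_lists.sort(key=lambda item: item[0], reverse=True)

    merged, seen = [], set()
    for _, _, entries in ranked_lists:
        for entry in entries:
            if entry not in seen:         # drop duplicates returned by several engines
                seen.add(entry)
                merged.append(entry)
    return merged


query_terms = {"verity", "search"}


def title_score(title):
    """Hypothetical scoring function: number of query terms present in the title."""
    return len(query_terms & set(title.lower().split()))


result_lists = {
    "engine_b": ["cooking recipes", "search tips"],
    "engine_a": ["verity search engine", "verity patents", "search tips"],
}
merged = merge_result_lists(result_lists, title_score)
```
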
System and method for automatically discovering a hierarchy of concepts from a corpus of documents

Publication number: 20030217335

Abstract: The invention is a method, system and computer program for automatically discovering concepts from a corpus of documents and automatically generating a labeled concept hierarchy. The method involves extraction of signatures from the corpus of documents. The similarity between signatures is computed using a statistical measure. The frequency distribution of signatures is refined to alleviate any inaccuracy in the similarity measure. The signatures are also disambiguated to address the polysemy problem. The similarity measure is recomputed based on the refined frequency distribution and disambiguated signatures. The recomputed similarity measure reflects actual similarity between signatures. The recomputed similarity measure is then used for clustering related signatures. The signatures are clustered to generate concepts and concepts are arranged in a concept hierarchy. The concept hierarchy automatically generates a query for a particular concept and retrieves relevant documents associated with the concept.

Type: Application

Filed: May 17, 2002

Publication date: November 20, 2003

Applicant: Verity, Inc.

Inventors: Christina Yip Chung, Jinhui Liu, Alpha Luk, Jianchang Mao, Sumit Taank, Vamsi Vutukuru
Method and system for naming a cluster of words and phrases

Publication number: 20030177000

Abstract: The present invention provides a method, system and computer program for naming a cluster, or a hierarchy of clusters, of words and phrases that have been extracted from a set of documents. The invention takes these clusters as the input and generates appropriate labels for the clusters using a lexical database. Naming involves first finding out all possible word senses for all the words in the cluster, using the lexical database; and then augmenting each word sense with words that are semantically similar to that word sense to form respective definition vectors. Thereafter, word sense disambiguation is done to find out the most relevant sense for each word. Definition vectors are clustered into groups. Each group represents a concept. These concepts are thereafter ranked based on their support. Finally, a pre-specified number of words and phrases from the definition vectors of the dominant concepts are selected as labels, based on their generality in the lexical database.

Type: Application

Filed: March 12, 2002

Publication date: September 18, 2003

Applicant: Verity, Inc.

Inventors: Jianchang Mao, Sumit Taank, Christina Chung, Alpha Luk
Automatic network load balancing using self-replicating resources

Publication number: 20030167295

Abstract: The present invention provides a method, system and computer program to balance the computational and network load in networked computers using self-replicating programs, referred to as symbionts. The method presented here reduces hotspots by encapsulating a resource in a symbiont, and having a user access that symbiont through programs that host symbionts, referred to as hosts. When a host accesses a symbiont, it may replicate a copy of that symbiont resource on itself or may be redirected to some other replicate of the same symbiont. The host then offers the replicated resource on the network to alleviate the load experienced by the original symbiont's computer. If the load on a symbiont falls below a threshold, it is removed from the host on which it was hosted.

Type: Application

Filed: March 1, 2002

Publication date: September 4, 2003

Applicant: Verity, Inc.

Inventor: Kiam Choo
Graphical search results system and method

Patent number: 6567103

Abstract: A system and method of creating a graphical presentation, such as a video, based on surfing the results of a web search. The graphical presentation may be constructed from the results of a search wherein each search result represents a URL and each URL is rendered as a graphical image of a web page (a frame) and stored in a file. When the file is viewed, it is displayed in a sequence of rendered frames wherein each frame is displayed for a variable, predetermined amount of time based on the relevance of the particular search result.

Type: Grant

Filed: August 2, 2000

Date of Patent: May 20, 2003

Assignee: Verity, Inc.

Inventor: Abdul Chaudhry
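
Illustrative sketch (not taken from the patent): display time per frame based on relevance could be computed with a simple linear mapping. The function name, the 1 to 5 second range and the assumption that relevance lies in [0, 1] are all hypothetical.

```python
def frame_durations(results, min_seconds=1.0, max_seconds=5.0):
    """Map each (url, relevance) pair to a display time; more relevant frames stay longer."""
    span = max_seconds - min_seconds
    durations = []
    for url, relevance in results:
        clamped = max(0.0, min(1.0, relevance))   # keep relevance inside [0, 1]
        durations.append((url, min_seconds + span * clamped))
    return durations


# Hypothetical search results with relevance scores.
results = [
    ("http://example.com/a", 1.0),
    ("http://example.com/b", 0.5),
    ("http://example.com/c", 0.0),
]
playlist = frame_durations(results)
```
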
Application caching system and method

Patent number: 6457047

Abstract: An application caching system and method are provided wherein one or more applications may be cached throughout a distributed computer network. The system may include a central cache directory server, one or more distributed master application servers and one or more distributed application cache servers. The system may permit a service, such as a search, to be provided to the user more quickly.

Type: Grant

Filed: May 8, 2000

Date of Patent: September 24, 2002

Assignee: Verity, Inc.

Inventors: Ashok Chandra, Neil LaTarche, Jianchang Mao, Prabhakar Raghavan
Evaluation of content of a data set using multiple and/or complex queries

Patent number: 5778364

Abstract: The invention enables evaluation of the content of a set of data to determine whether the data set satisfies one or more queries. The invention enables evaluation of large numbers of data sets much more rapidly than has previously been possible, even when the number of queries is large and/or the queries are complex. The queries are evaluated using an execution plan of query terms that is constructed from one or more specified queries by translating each query term of each query into one or more evidence descriptors and one or more combination operators, and operably relating each of the combination operators to at least one of the evidence descriptors or other combination operators, such that each query is defined by one or more of the evidence descriptors and one or more of the combination operators that are operably related to each other. Preferably, none of the evidence descriptors or combination operators are duplicated in the execution plan.

Type: Grant

Filed: January 2, 1996

Date of Patent: July 7, 1998

Assignee: Verity, Inc.

Inventor: Philip C. Nelson
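
Illustrative sketch (not taken from the patent): an execution plan without duplicated nodes can be pictured as a table in which each distinct evidence descriptor or combination operator appears once and is evaluated once per data set. The tuple encoding of queries and the names `build_plan` and `evaluate` are assumptions for this example.

```python
def build_plan(queries):
    """Collect every distinct node once, children before parents."""
    plan = {}

    def add(node):
        if node[0] != "TERM":
            for child in node[1:]:
                add(child)
        plan.setdefault(node, len(plan))   # shared nodes are stored only once

    for query in queries.values():
        add(query)
    return plan


def evaluate(plan, queries, words):
    """Evaluate each plan node once against a data set, then read off every query."""
    values = {}
    for node in plan:                      # dict order guarantees children come first
        kind = node[0]
        if kind == "TERM":
            values[node] = node[1] in words
        elif kind == "AND":
            values[node] = all(values[c] for c in node[1:])
        elif kind == "OR":
            values[node] = any(values[c] for c in node[1:])
        else:
            raise ValueError(f"unknown operator: {kind}")
    return {name: values[query] for name, query in queries.items()}


# Hypothetical queries; ("TERM", "search") is shared by both.
queries = {
    "q1": ("AND", ("TERM", "verity"), ("TERM", "search")),
    "q2": ("OR", ("TERM", "search"), ("TERM", "patent")),
}
plan = build_plan(queries)
matches = evaluate(plan, queries, {"verity", "patent"})
```
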