Patents Assigned to Engenium Corporation - Justia Patents Search

Method and system for optimally searching a document database using a representative semantic space

Patent number: 6847966

Abstract: A term-by-document matrix, which records the frequency of occurrence of each term in each document, is compiled from a corpus of documents representative of a particular subject matter. A weighted term dictionary is created using a global weighting algorithm and then applied to the term-by-document matrix, forming a weighted term-by-document matrix. A term vector matrix and a singular value concept matrix are computed by singular value decomposition of the weighted term-by-document matrix. The k largest singular concept values are kept and all others are set to zero, thereby reducing the term vector matrix and the singular value concept matrix to k concept dimensions. The reduced term vector matrix, reduced singular value concept matrix, and weighted term dictionary can then be used to project pseudo-document vectors, which represent documents not appearing in the original corpus, into the representative semantic space.
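
The pipeline the abstract describes can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the patented implementation: the abstract says only "a global weighting algorithm", so log-entropy weighting is swapped in here as one common choice; the toy corpus and the names `entropy_weights`, `build_space`, `project`, and `cosine` are hypothetical; and slicing off the first k singular triplets stands in for "setting all other singular values to zero".

```python
import numpy as np

# Toy term-by-document count matrix (invented for illustration).
# Rows are terms, columns are documents in the representative corpus.
terms = ["engine", "piston", "valve", "court", "judge"]
counts = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
], dtype=float)


def entropy_weights(counts):
    """One global weight per term (log-entropy; an assumed choice)."""
    n_docs = counts.shape[1]
    totals = counts.sum(axis=1, keepdims=True)
    p = np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)
    safe = np.where(p > 0, p, 1.0)          # log(1) = 0 where p is zero
    return 1.0 + (p * np.log(safe)).sum(axis=1) / np.log(n_docs)


def build_space(counts, k):
    """Return the weighted term dictionary and the rank-k SVD factors."""
    weights = entropy_weights(counts)        # weighted term dictionary
    weighted = weights[:, None] * counts     # weighted term-by-document matrix
    u, s, _ = np.linalg.svd(weighted, full_matrices=False)
    # Keeping the first k columns/values is equivalent to zeroing the rest.
    return weights, u[:, :k], s[:k]


def project(term_counts, weights, u_k, s_k):
    """Fold a document into the reduced space: S_k^-1 U_k^T (w * q)."""
    return (u_k.T @ (weights * term_counts)) / s_k


def cosine(a, b):
    """Cosine similarity between two concept-space vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


weights, u_k, s_k = build_space(counts, k=2)

# Pseudo-documents that were not part of the original corpus.
engine_doc = project(np.array([1.0, 1.0, 1.0, 0.0, 0.0]), weights, u_k, s_k)
legal_doc = project(np.array([0.0, 0.0, 0.0, 1.0, 1.0]), weights, u_k, s_k)
corpus_doc0 = project(counts[:, 0], weights, u_k, s_k)

print("engine vs corpus doc 0:", cosine(engine_doc, corpus_doc0))
print("engine vs legal:", cosine(engine_doc, legal_doc))
```

In this toy corpus the two topics share no terms, so with k = 2 the unseen engine document should land close to the mechanical corpus document and nearly orthogonal to the legal one. A real corpus would need a larger k and a much bigger matrix.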

Type: Grant

Filed: April 24, 2002

Date of Patent: January 25, 2005

Assignee: Engenium Corporation

Inventors: Matthew S. Sommer, Kevin B. Thompson