Patents Assigned to Engenium Corporation
  • Patent number: 6847966
    Abstract: A term-by-document matrix, representing the frequency of occurrence of each term per document, is compiled from a corpus of documents representative of a particular subject matter. A weighted term dictionary is created using a global weighting algorithm and then applied to the term-by-document matrix, forming a weighted term-by-document matrix. A term vector matrix and a singular value concept matrix are computed by singular value decomposition of the weighted term-by-document matrix. The k largest singular concept values are kept and all others are set to zero, thereby reducing the term vector matrix and the singular value concept matrix to k concept dimensions. The reduced term vector matrix, reduced singular value concept matrix, and weighted term dictionary can then be used to project pseudo-document vectors, representing documents not appearing in the original document corpus, into a representative semantic space.
    Type: Grant
    Filed: April 24, 2002
    Date of Patent: January 25, 2005
    Assignee: Engenium Corporation
    Inventors: Matthew S. Sommer, Kevin B. Thompson
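
The steps named in the abstract of patent 6847966 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: it assumes NumPy, whitespace tokenization, and a smoothed inverse document frequency as the global weight (the abstract does not name a weighting algorithm). All function names are hypothetical, and the projection step uses the standard latent semantic indexing fold-in formula.

```python
import numpy as np


def build_term_doc_matrix(docs):
    """Return (vocab, A) where A[i, j] counts term i in document j."""
    vocab = sorted({t for d in docs for t in d.lower().split()})
    index = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for t in d.lower().split():
            A[index[t], j] += 1
    return vocab, A


def global_weights(A):
    """Assumed global weighting: smoothed IDF, one weight per term."""
    n_docs = A.shape[1]
    doc_freq = np.count_nonzero(A, axis=1)
    return np.log((1 + n_docs) / (1 + doc_freq)) + 1.0


def fit_lsi(docs, k):
    """Weight the matrix, run the SVD, and keep the k largest singular values."""
    vocab, A = build_term_doc_matrix(docs)
    w = global_weights(A)              # plays the role of the weighted term dictionary
    W = A * w[:, None]                 # weighted term-by-document matrix
    U, s, _ = np.linalg.svd(W, full_matrices=False)
    return vocab, w, U[:, :k], s[:k]   # reduced term vectors and singular values


def project(text, vocab, w, U_k, s_k):
    """Fold a document from outside the corpus into the k-dimensional space."""
    index = {t: i for i, t in enumerate(vocab)}
    q = np.zeros(len(vocab))
    for t in text.lower().split():
        if t in index:                 # terms outside the dictionary are ignored
            q[index[t]] += 1
    return (q * w) @ U_k / s_k         # pseudo-document vector
```

Calling `fit_lsi(corpus, k)` once and then `project(new_text, vocab, w, U_k, s_k)` for each unseen document mirrors the reuse of the reduced matrices and the term dictionary described in the abstract. A text made up entirely of unknown terms maps to the zero vector.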