Abstract: An electronic document is parsed to remove irrelevant text and to identify the significant elements of the retained text. Each element is assigned a score representing its significance to the topical content of the document. A matrix of element pairs is then constructed in which each node holds the result of one or more functions of the scores and other attributes of the paired elements. The resulting matrix is a compact representation of topical content that supports precise measurement of the relatedness of topical content in information retrieval applications.
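
The pipeline outlined in the abstract (parse, score, build a matrix of element pairs, compare) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the source: the stop-word list standing in for "irrelevant text", relative frequency as the significance score, the product of two scores as the pair function, and cosine similarity as the measure of relatedness. The function names are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations
from math import sqrt

# Assumption: a tiny stop-word list stands in for "irrelevant text".
STOPWORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "is", "are", "that", "on"}
)


def parse_elements(text):
    """Lowercase the text, split it into words, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def score_elements(elements):
    """Assumed scoring: each element's relative frequency in the document."""
    counts = Counter(elements)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {element: count / total for element, count in counts.items()}


def pair_matrix(scores):
    """Sparse matrix of element pairs, keyed by an alphabetically ordered pair.

    Assumed pair function: the product of the two element scores.
    """
    return {
        (a, b): scores[a] * scores[b]
        for a, b in combinations(sorted(scores), 2)
    }


def topical_matrix(text):
    """Run the whole sketch pipeline: parse, score, then pair."""
    return pair_matrix(score_elements(parse_elements(text)))


def relatedness(m1, m2):
    """Assumed relatedness measure: cosine similarity over shared pairs."""
    dot = sum(value * m2[pair] for pair, value in m1.items() if pair in m2)
    norm1 = sqrt(sum(value * value for value in m1.values()))
    norm2 = sqrt(sum(value * value for value in m2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)
```

In this sketch, two documents are compared by calling `relatedness(topical_matrix(doc_a), topical_matrix(doc_b))`. Storing only the pairs that occur, rather than a dense square array, is one way to keep the representation compact; other scoring and pair functions could be substituted without changing the overall shape.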