Patents Assigned to Claritech
  • Patent number: 6205443
    Abstract: The present invention is a method and apparatus for retrieving information from a database. Initially, the documents within the database are divided into mutually exclusive subdocuments that generally correspond to paragraphs of text. The present invention further creates a second set of subdocuments that overlap adjacent paragraphs of text. In particular, the location of the overlapping subdocuments depends on the size of the initial paragraphs. This second set of overlapping subdocuments is scored just as the mutually exclusive subdocuments are scored. The scores from both the mutually exclusive and overlapping subdocuments are used in ranking the relevance of documents to a query. The use of both sets of subdocument scores improves the effectiveness of the scoring algorithm.
    Type: Grant
    Filed: January 4, 1999
    Date of Patent: March 20, 2001
    Assignee: Claritech Corporation
    Inventor: David A. Evans
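
The two subdocument sets in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the blank-line paragraph split, the half-and-half overlap window, the term-count score, and the "best subdocument wins" ranking are all assumptions made for illustration, and every function name is hypothetical.

```python
def split_paragraphs(text):
    """Mutually exclusive subdocuments: one per blank-line-separated paragraph."""
    return [p.split() for p in text.split("\n\n") if p.strip()]


def overlapping_subdocuments(paragraphs):
    """Second set: each window spans the boundary between adjacent paragraphs.

    Assumption: the window takes the last half of one paragraph and the
    first half of the next, so its position depends on paragraph size.
    """
    windows = []
    for left, right in zip(paragraphs, paragraphs[1:]):
        windows.append(left[len(left) // 2:] + right[:(len(right) + 1) // 2])
    return windows


def score(subdocument, query_terms):
    """Toy relevance score: number of distinct query terms present."""
    words = {w.lower().strip(".,;:") for w in subdocument}
    return len(words & query_terms)


def document_score(text, query):
    """Score a document by its best subdocument from either set (assumed rule)."""
    query_terms = {t.lower() for t in query.split()}
    paragraphs = split_paragraphs(text)
    candidates = paragraphs + overlapping_subdocuments(paragraphs)
    return max((score(s, query_terms) for s in candidates), default=0)
```

In this sketch, a query whose terms straddle a paragraph break scores higher through the overlapping window than through either paragraph alone, which is the effect the abstract attributes to using both sets.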
  • Patent number: 6138114
    Abstract: The present invention is a method for operating a computer system to minimize the number of disk storage access operations used in creating an inverted database. This method divides a database into several smaller subdatabases. The documents of the subdatabases are decomposed into subdocuments. A postings list for each subdatabase is then created in which all the terms for the subdatabase are associated with the identity of each subdocument of the subdatabase in which the terms occur. The resulting postings lists for the subdatabases are then merged. The merge process sorts the postings of the subdatabases and merges common terms. The non-common terms are merged after the common terms. The process of sorting the postings list and then merging the common terms followed by the non-common terms minimizes the number of disk storage access operations required for creating the inverted database from a series of inverted subdatabases.
    Type: Grant
    Filed: July 9, 1999
    Date of Patent: October 24, 2000
    Assignee: Claritech Corporation
    Inventor: Michael L. Horowitz
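
As a rough in-memory sketch of the merge order described above (common terms first, then non-common terms), assuming each subdatabase is a small dict of subdocument ids to text. Disk access, which is the point of the patent, is not modeled here, and the function names are hypothetical.

```python
from collections import defaultdict


def build_postings(subdocuments):
    """Map each term to the sorted ids of the subdocuments containing it.

    `subdocuments` is a dict of {subdocument_id: text}.
    """
    postings = defaultdict(set)
    for sub_id, text in subdocuments.items():
        for term in text.lower().split():
            postings[term].add(sub_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def merge_postings(a, b):
    """Merge two postings lists: common terms first, then the rest."""
    merged = {}
    # Terms present in both subdatabases: combine their subdocument ids.
    for term in sorted(a.keys() & b.keys()):
        merged[term] = sorted(set(a[term]) | set(b[term]))
    # Terms present in only one subdatabase: copy the postings unchanged.
    for term in sorted(a.keys() ^ b.keys()):
        merged[term] = a[term] if term in a else b[term]
    return merged
```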
  • Patent number: 6134562
    Abstract: The present invention is a computer system for modifying a database which comprises a computer that modifies records stored in a database. In the process for modifying records in the database, addresses to memory locations in a disk storage unit are accessed during the commit phase by first checking the address space in a transaction log. The computer system of the present invention operates by committing transactions without locking out readers. This is possible because any changed data in the database is reflected in the transaction log and the log must be accessed prior to reading from the disk storage unit. As a result, the user sees changed data when the log is accessed, or if data has not been changed, the log merely directs the computer to the address in the original database storage where unchanged data is stored.
    Type: Grant
    Filed: September 3, 1999
    Date of Patent: October 17, 2000
    Assignee: Claritech Corporation
    Inventors: Michael L. Horowitz, Michael J. McInerny, Stewart M. Clamen
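
The "check the log before reading storage" idea can be sketched as follows. This is a toy under stated assumptions, not the patented system: a dict stands in for the disk storage unit, the log maps addresses to committed values, and publishing a commit by a single reference assignment is an illustrative choice. The class name `LoggedStore` is hypothetical.

```python
class LoggedStore:
    """Toy store where readers consult a transaction log before the base data."""

    def __init__(self, base):
        self._base = dict(base)   # stands in for the disk storage unit
        self._log = {}            # address -> committed new value

    def read(self, address):
        """Check the log first; fall back to the original storage."""
        log = self._log           # take one snapshot of the current log
        if address in log:
            return log[address]
        return self._base[address]

    def commit(self, changes):
        """Publish changes by swapping in a new log; base is not rewritten."""
        new_log = dict(self._log)
        new_log.update(changes)
        self._log = new_log       # single reference assignment
```

A reader that took its snapshot before the swap keeps seeing the old log, and one that reads afterwards sees the changed data, so no reader has to wait on the writer in this sketch.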
  • Patent number: 6115706
    Abstract: In a novel approach for retrieving information, a set of sub-documents is first established based upon a set of documents. A query is processed which operates on the set of sub-documents, causing a score to be generated for each sub-document. The score for each sub-document is indicative of the relevance of the corresponding sub-document to the query. The scores are reviewed and the best sub-document is retrieved. According to one aspect of the invention, the best sub-document has a score that indicates the highest relevance between the sub-document and the query. According to another aspect of the invention, in response to a user selection, the next best sub-document is identified and retrieved. The sub-documents are also presented to the user in an order based upon the scores. According to another aspect of the invention, the document containing the sub-document having the best score is displayed and automatically scrolled to the location of the sub-document having the best score.
    Type: Grant
    Filed: September 3, 1999
    Date of Patent: September 5, 2000
    Assignee: Claritech Corporation
    Inventor: David A. Evans
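
A small sketch of "best, then next best" sub-document retrieval, assuming paragraphs serve as sub-documents and a simple term-overlap count serves as the score (neither is specified by the abstract). The returned paragraph index is what a viewer could scroll to. `rank_subdocuments` is a hypothetical name.

```python
def rank_subdocuments(documents, query):
    """Return (score, doc_id, paragraph_index) tuples, best first.

    `documents` maps doc_id to a list of paragraph strings.
    """
    terms = set(query.lower().split())
    ranked = []
    for doc_id, paragraphs in documents.items():
        for index, paragraph in enumerate(paragraphs):
            score = len(terms & set(paragraph.lower().split()))
            ranked.append((score, doc_id, index))
    # Highest score first; ties broken by doc_id and position for stability.
    ranked.sort(key=lambda item: (-item[0], item[1], item[2]))
    return ranked
```

Wrapping the result in `iter(...)` and calling `next()` once per user selection gives the best sub-document first and the next best on each later request.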
  • Patent number: 6112204
    Abstract: The present invention provides a method and apparatus for generating a database search result. The creation of the search result is achieved by representing the subdocument lists of an inverted database with encoded bit strings. The encoded bit strings are space efficient methods of storing the correspondence between terms in the database and their occurrence in subdocuments. Logical combinations of these bit strings are then obtained by identifying the intersection, union, and/or inversion of a plurality of the bit strings. Since keywords for a database search can be identified by selecting the terms of the inverted database, the logical combinations of bit strings represent search results over the database. This technique for generating a search result is computationally efficient because computers combine bit strings very efficiently. Also, the search elements of the present invention are not just limited to keywords. The search elements also include types of fields (e.g.
    Type: Grant
    Filed: December 2, 1998
    Date of Patent: August 29, 2000
    Assignee: Claritech Corporation
    Inventor: Michael L. Horowitz
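
The intersection, union, and inversion operations on bit strings can be sketched with Python integers used as bitmaps. This assumes plain, unencoded bitmaps; the abstract's "encoded" (space-efficient) representation is not reproduced, and the helper names are hypothetical.

```python
def build_bitmaps(subdocuments):
    """Map each term to an int whose bit i is set if subdocument i has the term."""
    bitmaps = {}
    for i, text in enumerate(subdocuments):
        for term in set(text.lower().split()):
            bitmaps[term] = bitmaps.get(term, 0) | (1 << i)
    return bitmaps


def ids(bitmap):
    """Decode a bitmap back into a list of subdocument indices."""
    return [i for i in range(bitmap.bit_length()) if bitmap >> i & 1]


subdocs = ["inverted index", "bit string index", "bit string search"]
bitmaps = build_bitmaps(subdocs)
everything = (1 << len(subdocs)) - 1              # mask of all subdocuments

both = bitmaps["bit"] & bitmaps["index"]          # intersection (AND)
either = bitmaps["inverted"] | bitmaps["search"]  # union (OR)
not_index = everything & ~bitmaps["index"]        # inversion (NOT)
```

Here `ids(both)` is `[1]`, `ids(either)` is `[0, 2]`, and `ids(not_index)` is `[2]`. Each Boolean query reduces to a few integer operations regardless of how many subdocuments a term appears in.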
  • Patent number: 6055528
    Abstract: The present invention provides a method and apparatus for retrieving documents that are stored in a language other than the language that is used to formulate a search query. This invention decomposes the query into terms and then translates each of the terms into terms of the language of the database. Once the database language terms have been listed, a series of subqueries is formed by creating all the possible combinations of the listed terms. Each subquery is then scored on each of the documents in the target language database. Only those subqueries that return meaningful scores are relevant to the query. Thus, the semantic meaning of the query is determined against the database itself and those documents in the database language that are most relevant to that semantic meaning are returned.
    Type: Grant
    Filed: July 25, 1997
    Date of Patent: April 25, 2000
    Assignee: Claritech Corporation
    Inventor: David A. Evans
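
The subquery step above (translate each term, form every combination, keep the combinations that score) can be sketched as follows. The two-entry English-to-French dictionary, the all-terms-must-occur score, and the function names are assumptions for illustration only.

```python
from itertools import product


def subqueries(query, dictionary):
    """All combinations of candidate translations, one per query term."""
    candidates = [dictionary.get(term, [term]) for term in query.lower().split()]
    return [tuple(combo) for combo in product(*candidates)]


def score(subquery, document):
    """Toy score: the subquery length, counted only when all its terms occur."""
    words = set(document.lower().split())
    return len(subquery) if all(term in words for term in subquery) else 0


def best_matches(query, dictionary, documents):
    """Return (score, subquery, document) for subqueries with a nonzero score."""
    hits = [
        (score(sq, doc), sq, doc)
        for sq in subqueries(query, dictionary)
        for doc in documents
    ]
    return sorted((h for h in hits if h[0] > 0), key=lambda h: -h[0])


# Hypothetical toy data: each source term has two candidate translations.
dictionary = {"bank": ["banque", "rive"], "account": ["compte", "récit"]}
documents = ["ouvrir un compte en banque", "la rive du fleuve"]
matches = best_matches("bank account", dictionary, documents)
```

Of the four subqueries, only `("banque", "compte")` scores against the toy database, so the database itself selects the financial sense of "bank" without a separate disambiguation step.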
  • Patent number: 6029167
    Abstract: A method and apparatus for retrieving similar or identical textual passages among different documents is disclosed. Normal discourse structures along with textual content attributes are used to encode a known passage with "marker sequences" that give a characterizing "signature" to the passage. The encoded known passage is then evaluated against similarly encoded passages appearing in a database of documents. If it is determined that there is a possible match between the encoded known passage and an encoded passage in a database document, a sequential string search is performed to determine whether the two passages are likely to be similar or identical. If the sequential string search records a probable match between the known passage and the database passage, the database passage is displayed for further review.
    Type: Grant
    Filed: July 25, 1997
    Date of Patent: February 22, 2000
    Assignee: Claritech Corporation
    Inventor: David A. Evans
  • Patent number: 5999925
    Abstract: In a novel approach for retrieving information, a set of sub-documents is first established based upon a set of documents. A query is processed which operates on the set of sub-documents, causing a score to be generated for each sub-document. The score for each sub-document is indicative of the relevance of the corresponding sub-document to the query. The scores are reviewed and the best sub-document is retrieved. According to one aspect of the invention, the best sub-document has a score that indicates the highest relevance between the sub-document and the query. According to another aspect of the invention, in response to a user selection, the next best sub-document is identified and retrieved. The sub-documents are also presented to the user in an order based upon the scores. According to another aspect of the invention, the document containing the sub-document having the best score is displayed and automatically scrolled to the location of the sub-document having the best score.
    Type: Grant
    Filed: July 25, 1997
    Date of Patent: December 7, 1999
    Assignee: Claritech Corporation
    Inventor: David A. Evans
  • Patent number: 5995962
    Abstract: The present invention is a method for operating a computer system to minimize the number of disk storage access operations used in creating an inverted database. This method divides a database into several smaller subdatabases. The documents of the subdatabases are decomposed into subdocuments. A postings list for each subdatabase is then created in which all the terms for the subdatabase are associated with the identity of each subdocument of the subdatabase in which the terms occur. The resulting postings lists for the subdatabases are then merged. The merge process sorts the postings of the subdatabases and merges common terms. The non-common terms are merged after the common terms. The process of sorting the postings list and then merging the common terms followed by the non-common terms minimizes the number of disk storage access operations required for creating the inverted database from a series of inverted subdatabases.
    Type: Grant
    Filed: July 25, 1997
    Date of Patent: November 30, 1999
    Assignee: Claritech Corporation
    Inventor: Michael L. Horowitz
  • Patent number: 5987448
    Abstract: Document texts are produced by recognizing characters in document images by an Optical Character Recognition (OCR) process. When such a document text matches one or more search terms of a query, the corresponding document image is displayed. Regions of the document image, corresponding to words of the document text that match the search terms, are displayed in a visually distinctive manner. The display of the document image may be augmented by displaying a region corresponding to a reference text within the document text in another visually distinctive manner.
    Type: Grant
    Filed: July 25, 1997
    Date of Patent: November 16, 1999
    Assignee: Claritech Corporation
    Inventors: David A. Evans, Michael J. McInerny
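
A minimal sketch of mapping matched search terms back to image regions, assuming the OCR step yields one bounding box per recognized word. The `OcrWord` type, the box layout, and the punctuation stripping are hypothetical and do not correspond to any particular OCR library.

```python
from dataclasses import dataclass


@dataclass
class OcrWord:
    text: str
    box: tuple  # (left, top, right, bottom) in image pixels


def regions_to_highlight(ocr_words, query):
    """Return the boxes of OCR words that match any search term."""
    terms = {t.lower() for t in query.split()}
    return [w.box for w in ocr_words if w.text.lower().strip(".,;:") in terms]
```

A viewer would then draw each returned box over the document image in a distinctive style.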
  • Patent number: 5970483
    Abstract: A document image that is the source of Optical Character Recognition (OCR) output is displayed so that a user can select a region of the displayed document image. When the region is selected, text of the OCR output corresponding to the selected region is submitted as an input to a search engine.
    Type: Grant
    Filed: July 25, 1997
    Date of Patent: October 19, 1999
    Assignee: Claritech Corporation
    Inventor: David A. Evans
  • Patent number: 5953728
    Abstract: The present invention is a computer system for modifying a database which comprises a computer that modifies records stored in a database. In the process for modifying records in the database, addresses to memory locations in a disk storage unit are accessed during the commit phase by first checking the address space in a transaction log. The computer system of the present invention operates by committing transactions without locking out readers. This is possible because any changed data in the database is reflected in the transaction log and the log must be accessed prior to reading from the disk storage unit. As a result, the user sees changed data when the log is accessed, or if data has not been changed, the log merely directs the computer to the address in the original database storage where unchanged data is stored.
    Type: Grant
    Filed: July 25, 1997
    Date of Patent: September 14, 1999
    Assignee: Claritech Corporation
    Inventors: Michael L. Horowitz, Michael J. McInerny, Stewart M. Clamen
  • Patent number: 5926808
    Abstract: The system of the present invention provides for a method and apparatus of displaying portions of text from multiple documents over multiple databases related to a search query. The initial step in this method is to identify a search query. Based on this identification, a search against multiple databases is initiated. In particular, the computer system identifies auxiliary databases either within a network or between networks that are likely to contain documents relating to terms in the search query. Upon identification of these databases, the databases are then searched to identify those documents relating to the identified query. The various sets of identified documents from multiple databases are then returned and processed to create an ordered ranking for the returned documents. Text portions from the highest ranking documents across the multiple databases are then automatically displayed to the user.
    Type: Grant
    Filed: July 25, 1997
    Date of Patent: July 20, 1999
    Assignee: Claritech Corporation
    Inventors: David A. Evans, Michael J. McInerny
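
The final steps above (search several databases, merge the results into one ranking, show text from the top documents) can be sketched as below. The step of identifying which auxiliary databases to search is omitted, and the term-overlap score, the 40-character snippet, and the name `search_all` are assumptions.

```python
import heapq


def search_all(databases, query, top_k=3):
    """Search every database and return the top_k snippets overall.

    `databases` maps a database name to a list of document strings.
    """
    terms = set(query.lower().split())
    scored = []
    for name, docs in databases.items():
        for doc in docs:
            s = len(terms & set(doc.lower().split()))
            if s:
                # Keep the score, the source database, and a short text portion.
                scored.append((s, name, doc[:40]))
    return heapq.nlargest(top_k, scored)
```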
  • Patent number: 5907840
    Abstract: The present invention is a method and apparatus for retrieving information from a database. Initially, the documents within the database are divided into mutually exclusive subdocuments that generally correspond to paragraphs of text. The present invention further creates a second set of subdocuments that overlap adjacent paragraphs of text. In particular, the location of the overlapping subdocuments depends on the size of the initial paragraphs. This second set of overlapping subdocuments is scored just as the mutually exclusive subdocuments are scored. The scores from both the mutually exclusive and overlapping subdocuments are used in ranking the relevance of documents to a query. The use of both sets of subdocument scores improves the effectiveness of the scoring algorithm.
    Type: Grant
    Filed: July 25, 1997
    Date of Patent: May 25, 1999
    Assignee: Claritech Corporation
    Inventor: David A. Evans
  • Patent number: 5893094
    Abstract: The present invention provides a method and apparatus for generating a database search result. The creation of the search result is achieved by representing the subdocument lists of an inverted database with encoded bit strings. The encoded bit strings are space efficient methods of storing the correspondence between terms in the database and their occurrence in subdocuments. Logical combinations of these bit strings are then obtained by identifying the intersection, union, and/or inversion of a plurality of the bit strings. Since keywords for a database search can be identified by selecting the terms of the inverted database, the logical combinations of bit strings represent search results over the database. This technique for generating a search result is computationally efficient because computers combine bit strings very efficiently. Also, the search elements of the present invention are not just limited to keywords. The search elements also include types of fields (e.g.
    Type: Grant
    Filed: July 25, 1997
    Date of Patent: April 6, 1999
    Assignee: Claritech Corporation
    Inventor: Michael L. Horowitz