Patents Assigned to MSC Intellectual Properties B.V.
  • Patent number: 8504578
    Abstract: A system, method and computer program product for identifying near and exact-duplicate documents in a document collection, including, for each document in the collection: reading textual content from the document; filtering the textual content based on user settings; determining the N most frequent words from the filtered textual content of the document; performing a quorum search of the N most frequent words in the document with a threshold M; and sorting results from the quorum search based on relevancy. Based on the values of N and M, near and exact-duplicate documents are identified in the document collection.
    Type: Grant
    Filed: August 16, 2012
    Date of Patent: August 6, 2013
    Assignee: MSC Intellectual Properties B.V.
    Inventors: Johannes C. Scholtes, Siebe Bloembergen
  • Patent number: 8250079
    Abstract: A system, method and computer program product for identifying near and exact-duplicate documents in a document collection, including, for each document in the collection: reading textual content from the document; filtering the textual content based on user settings; determining the N most frequent words from the filtered textual content of the document; performing a quorum search of the N most frequent words in the document with a threshold M; and sorting results from the quorum search based on relevancy. Based on the values of N and M, near and exact-duplicate documents are identified in the document collection.
    Type: Grant
    Filed: March 30, 2011
    Date of Patent: August 21, 2012
    Assignee: MSC Intellectual Properties B.V.
    Inventors: Johannes C. Scholtes, Siebe Bloembergen
  • Patent number: 7930306
    Abstract: A system, method and computer program product for identifying near and exact-duplicate documents in a document collection, including, for each document in the collection: reading textual content from the document; filtering the textual content based on user settings; determining the N most frequent words from the filtered textual content of the document; performing a quorum search of the N most frequent words in the document with a threshold M; and sorting results from the quorum search based on relevancy. Based on the values of N and M, near and exact-duplicate documents are identified in the document collection.
    Type: Grant
    Filed: April 30, 2008
    Date of Patent: April 19, 2011
    Assignee: MSC Intellectual Properties B.V.
    Inventors: Johannes C. Scholtes, Siebe Bloembergen
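
Illustrative sketch (not taken from the patents): the steps named in the abstracts above can be approximated in a few lines of Python. Everything here is an assumption made for illustration: the function names (top_words, quorum_search, find_duplicates), the stopword list standing in for "user settings", the brute-force scan in place of a real search index, and the use of the raw hit count as the relevancy score. The abstracts do not specify any of these details.

```python
import re
from collections import Counter

# Hypothetical stand-in for the "user settings" filter: a tiny stopword list.
STOPWORDS = frozenset({"the", "a", "an", "and", "of", "to", "in", "is"})


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_words(text, n, stopwords=STOPWORDS):
    """Return the n most frequent non-stopword tokens, ties broken alphabetically."""
    counts = Counter(t for t in tokenize(text) if t not in stopwords)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]


def quorum_search(query_words, collection, m):
    """Return (doc_id, hits) for every document containing at least m query words.

    Results are sorted by hit count (assumed relevancy measure), highest first.
    """
    results = []
    for doc_id, text in collection.items():
        vocabulary = set(tokenize(text))
        hits = sum(1 for word in query_words if word in vocabulary)
        if hits >= m:
            results.append((doc_id, hits))
    results.sort(key=lambda r: (-r[1], r[0]))
    return results


def find_duplicates(collection, n=5, m=4):
    """Map each document id to the ids of its candidate near or exact duplicates."""
    duplicates = {}
    for doc_id, text in collection.items():
        query = top_words(text, n)
        matches = quorum_search(query, collection, m)
        duplicates[doc_id] = [d for d, _ in matches if d != doc_id]
    return duplicates
```

In this sketch, setting M equal to N requires every one of the N frequent words to be present, which narrows the results toward exact duplicates; lowering M relative to N admits documents that differ in some of their frequent words.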