Patents by Inventor Dong Yun SIM

Dong Yun SIM has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9448999
    Abstract: A method for detecting similar documents includes extracting an entity from each of a first web document and a second web document; determining an importance contribution element corresponding to each of the web documents; calculating, using a processor, weights for each entity based on the determined importance contribution elements; and determining whether the web documents are similar documents based on the calculated weights. A device to detect similar documents includes a storage device; an entity extractor stored on the storage device and configured to extract an entity from a first web document and a second web document and to determine an importance contribution element from each of the web documents; a weight calculator configured to calculate weights of each entity based on the determined importance contribution elements; and a similar document detection unit configured to determine whether the web documents are similar documents based on the calculated weights.
    Type: Grant
    Filed: May 2, 2012
    Date of Patent: September 20, 2016
    Assignee: NHN Corporation
    Inventors: Chae Hyun Lee, Dong Yun Sim
  • Patent number: 9141697
    Abstract: The present disclosure relates to a method, system and software executable by a processor associated with a non-transitory computer-readable storage medium for detecting a trap of web-based calendar pages and building a retrieval database. According to an aspect of the disclosure, detecting a trap of web-based calendar pages includes clustering, by a clustering module, URLs corresponding to web pages stored in a database according to a predetermined standard, generating a regular expression by analyzing a date pattern included in a clustering result, and detecting a cluster suspected of being a trap of web-based perpetual calendar pages using the generated regular expression.
    Type: Grant
    Filed: June 2, 2011
    Date of Patent: September 22, 2015
    Assignee: NHN Corporation
    Inventors: Dong Yun Sim, Chaehyun Lee
  • Publication number: 20120284270
    Abstract: A method for detecting similar documents includes extracting an entity from each of a first web document and a second web document; determining an importance contribution element corresponding to each of the web documents; calculating, using a processor, weights for each entity based on the determined importance contribution elements; and determining whether the web documents are similar documents based on the calculated weights. A device to detect similar documents includes a storage device; an entity extractor stored on the storage device and configured to extract an entity from a first web document and a second web document and to determine an importance contribution element from each of the web documents; a weight calculator configured to calculate weights of each entity based on the determined importance contribution elements; and a similar document detection unit configured to determine whether the web documents are similar documents based on the calculated weights.
    Type: Application
    Filed: May 2, 2012
    Publication date: November 8, 2012
    Applicant: NHN CORPORATION
    Inventors: Chae Hyun LEE, Dong Yun SIM
  • Publication number: 20110320414
    Abstract: The present disclosure relates to a method, system and software executable by a processor associated with a non-transitory computer-readable storage medium for detecting a trap of web-based calendar pages and building a retrieval database. According to an aspect of the disclosure, detecting a trap of web-based calendar pages includes clustering, by a clustering module, URLs corresponding to web pages stored in a database according to a predetermined standard, generating a regular expression by analyzing a date pattern included in a clustering result, and detecting a cluster suspected of being a trap of web-based perpetual calendar pages using the generated regular expression.
    Type: Application
    Filed: June 2, 2011
    Publication date: December 29, 2011
    Applicant: NHN CORPORATION
    Inventors: Dong Yun SIM, Chaehyun LEE
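
The two methods summarized in the abstracts above can be illustrated with a rough Python sketch. This is not the patented implementation: the abstracts do not say how weights are combined or what the "predetermined standard" for clustering is, so the choices below are assumptions made purely for illustration. In particular, the importance contribution element is assumed to be a plain number per entity occurrence, similarity is measured with a weighted Jaccard score (a swapped-in technique), URLs are clustered by host plus path with date-like tokens masked, and every function name, the `DATE_TOKEN` pattern, and both thresholds are hypothetical.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit


# --- Sketch 1: similar-document detection from weighted entities ---

def entity_weights(entities):
    """Sum an assumed numeric importance contribution per entity.

    `entities` is a list of (entity, contribution) pairs for one document.
    """
    weights = defaultdict(float)
    for entity, contribution in entities:
        weights[entity] += contribution
    return dict(weights)


def weighted_jaccard(w1, w2):
    """Weighted Jaccard score (an assumption, not taken from the abstract)."""
    keys = set(w1) | set(w2)
    shared = sum(min(w1.get(k, 0.0), w2.get(k, 0.0)) for k in keys)
    total = sum(max(w1.get(k, 0.0), w2.get(k, 0.0)) for k in keys)
    return shared / total if total else 0.0


def are_similar(doc1_entities, doc2_entities, threshold=0.5):
    """Decide similarity by comparing the score to a hypothetical threshold."""
    score = weighted_jaccard(entity_weights(doc1_entities),
                             entity_weights(doc2_entities))
    return score >= threshold


# --- Sketch 2: calendar-trap detection from clustered URLs ---

# Hypothetical date pattern: YYYY-MM or YYYY/MM, with an optional day part.
DATE_TOKEN = re.compile(r"\d{4}[-/]\d{1,2}(?:[-/]\d{1,2})?")
MASK = "<DATE>"


def _host_and_target(url):
    """Split a URL into its host and its path plus query string."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    return parts.netloc, target


def cluster_urls(urls):
    """Group URLs whose host and date-masked path are identical."""
    clusters = defaultdict(list)
    for url in urls:
        host, target = _host_and_target(url)
        clusters[host + DATE_TOKEN.sub(MASK, target)].append(url)
    return clusters


def cluster_regex(key):
    """Build a regular expression for a cluster from its masked key."""
    escaped = re.escape(key).replace(re.escape(MASK), DATE_TOKEN.pattern)
    return re.compile(escaped)


def suspected_traps(urls, min_dates=3):
    """Return cluster keys whose URLs differ only by many distinct dates."""
    suspects = []
    for key, members in cluster_urls(urls).items():
        if MASK not in key:
            continue  # no date pattern in this cluster
        pattern = cluster_regex(key)
        dates = set()
        for url in members:
            host, target = _host_and_target(url)
            if pattern.fullmatch(host + target):
                dates.update(DATE_TOKEN.findall(target))
        if len(dates) >= min_dates:
            suspects.append(key)
    return suspects
```

For example, calling `suspected_traps` on a list of URLs that differ only in a year-month path segment would flag their shared cluster, while a page with no date in its URL is ignored. A production crawler would need a far richer date grammar and clustering standard than this sketch assumes.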