Patents by Inventor Yanhong Zhai

Yanhong Zhai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7765236
    Abstract: Systems and methods for extracting data content items from a web page are provided. A template is created by labeling data content items of interest associated with a web page and generating a template Document Object Model (DOM) tree based on the labeled web page. DOM trees are also generated for additional web pages that contain data content items for which extraction may be desired. These DOM trees are compared to the template DOM tree to determine alignment there between. The aligned data content items may then be extracted from the additional web pages and indexed, as desired. Labeling the data content items of interest prior to generating a template DOM tree allows for the desired data content items to be specified and more accurately extracted from related and/or similarly structured web pages.
    Type: Grant
    Filed: August 31, 2007
    Date of Patent: July 27, 2010
    Assignee: Microsoft Corporation
    Inventors: Yanhong Zhai, Yi Li, Richard Qian, Hong Gao, Lei Tan
  • Publication number: 20090063500
    Abstract: Systems and methods for extracting data content items from a web page are provided. A template is created by labeling data content items of interest associated with a web page and generating a template Document Object Model (DOM) tree based on the labeled web page. DOM trees are also generated for additional web pages that contain data content items for which extraction may be desired. These DOM trees are compared to the template DOM tree to determine alignment there between. The aligned data content items may then be extracted from the additional web pages and indexed, as desired. Labeling the data content items of interest prior to generating a template DOM tree allows for the desired data content items to be specified and more accurately extracted from related and/or similarly structured web pages.
    Type: Application
    Filed: August 31, 2007
    Publication date: March 5, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Yanhong Zhai, Yi Li, Richard Qian, Hong Gao, Lei Tan
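
The extraction approach described in the abstracts above (label items on one page, build a template DOM tree, align other pages against it, extract the aligned items) can be illustrated with a minimal sketch. This is not the patented method: it substitutes exact tag-path matching for the tree alignment the abstract mentions, and the `data-label` attribute, the `Node` class, and all function names are hypothetical choices made for illustration. It also assumes reasonably well-formed HTML in which every opened element is closed.

```python
from html.parser import HTMLParser


class Node:
    """A bare-bones DOM node: tag, attributes, direct text, children."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a simple tree of Node objects from HTML markup."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.cur = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.cur)
        self.cur.children.append(node)
        self.cur = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag, if there is one.
        node = self.cur
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.cur = node.parent

    def handle_data(self, data):
        self.cur.text += data.strip()


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def labeled_paths(node, path=()):
    """Map each label in the template tree to its path from the root.

    A path is a tuple of (tag, index among same-tag siblings) steps.
    The 'data-label' attribute is a hypothetical labeling convention.
    """
    found = {}
    counts = {}
    for child in node.children:
        idx = counts.get(child.tag, 0)
        counts[child.tag] = idx + 1
        child_path = path + ((child.tag, idx),)
        label = child.attrs.get("data-label")
        if label:
            found[label] = child_path
        found.update(labeled_paths(child, child_path))
    return found


def follow(node, path):
    """Walk a path in another tree; return None if the tree diverges."""
    for tag, idx in path:
        same_tag = [c for c in node.children if c.tag == tag]
        if idx >= len(same_tag):
            return None
        node = same_tag[idx]
    return node


def extract(template_html, page_html):
    """Extract from page_html the items labeled in template_html."""
    page = parse(page_html)
    result = {}
    for label, path in labeled_paths(parse(template_html)).items():
        node = follow(page, path)
        result[label] = node.text if node is not None else None
    return result


TEMPLATE = (
    '<html><body><h1 data-label="title">Widget A</h1>'
    '<div><span>Price:</span><span data-label="price">$10</span></div>'
    "</body></html>"
)
PAGE = (
    "<html><body><h1>Widget B</h1>"
    "<div><span>Price:</span><span>$25</span></div></body></html>"
)

print(extract(TEMPLATE, PAGE))
```

Exact path matching only works when the pages share the template's structure closely; a page with an extra or missing element ahead of a labeled item would need a more tolerant alignment step, which this sketch does not attempt.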