Document Structures and Storage, e.g., HTML Extensions, etc. (EPO) Patents (Class 707/E17.118)
  • Patent number: 7769579
    Abstract: A method and system of learning, or bootstrapping, facts from semi-structured text is described. Starting with a set of seed facts associated with an object, documents associated with the object are identified. The identified documents are checked to determine if each has at least a first predefined number of seed facts. If a document does have at least a first predefined number of seed facts, a contextual pattern associated with the seed facts is identified and other instances of content in the document matching the contextual pattern are identified. If the document includes at least a second predefined number of the other instances of content matching the contextual pattern, then facts may be extracted from the other instances.
    Type: Grant
    Filed: May 31, 2005
    Date of Patent: August 3, 2010
    Assignee: Google Inc.
    Inventors: Shubin Zhao, Jonathan T. Betz
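The bootstrapping loop described in the abstract of patent 7769579 can be illustrated roughly as follows. This is a minimal sketch, not the patented method: the function name `extract_facts`, the thresholds `min_seeds` and `min_other`, and the choice to model a "contextual pattern" as the literal text separating an attribute from its value are all assumptions made for illustration.

```python
import re
from typing import Dict


def extract_facts(
    document: str,
    seeds: Dict[str, str],
    min_seeds: int = 2,   # hypothetical "first predefined number"
    min_other: int = 1,   # hypothetical "second predefined number"
) -> Dict[str, str]:
    """Learn one separator pattern from seed facts and apply it to the document."""
    separators = set()
    seeds_found = 0
    for attribute, value in seeds.items():
        # Look for the seed attribute followed shortly by its value.
        match = re.search(
            re.escape(attribute) + r"(.{1,40}?)" + re.escape(value),
            document,
            re.DOTALL,
        )
        if match:
            seeds_found += 1
            separators.add(match.group(1))

    # Require enough seed facts, all sharing the same surrounding context.
    if seeds_found < min_seeds or len(separators) != 1:
        return {}

    separator = separators.pop()
    # Assumption: attribute and value text never contain angle brackets.
    pattern = re.compile(r"([^<>]+)" + re.escape(separator) + r"([^<>]+)")

    others: Dict[str, str] = {}
    for match in pattern.finditer(document):
        attribute, value = match.group(1).strip(), match.group(2).strip()
        if attribute not in seeds:
            others[attribute] = value

    # Only trust the pattern if it matches enough non-seed content.
    return others if len(others) >= min_other else {}
```

Given a page fragment such as `<li><b>Born</b>: 1950</li><li><b>Height</b>: 180 cm</li><li><b>Spouse</b>: Jane</li>` and seeds for `Born` and `Height`, the sketch learns the separator `</b>: ` and proposes `Spouse` as a new fact.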
  • Publication number: 20100169286
    Abstract: Methods, systems, and products are disclosed for dynamically updating web content using W3C standards. One such method sends a request to a web server for a web page. A web browser receives and renders a static HTML web page. The web browser periodically sends a query to the web server and, in response, receives a latest date and time stamp indicating the latest update to the web page. The web browser compares the latest date and time stamp to a previously stored date and time stamp representing a previous update. If the latest date and time stamp matches the previously stored date and time stamp, then no update has occurred and no update is required. If, however, the date and time stamps do not match, then the web page has changed since the previous update and the web browser retrieves the latest update to the web page.
    Type: Application
    Filed: March 12, 2010
    Publication date: July 1, 2010
    Inventor: Keith Hackworth
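The timestamp comparison described in the abstract of publication 20100169286 amounts to a small polling step. The sketch below is illustrative only: `poll_once`, `fetch_timestamp`, and `fetch_page` are hypothetical names, and the two callables stand in for whatever requests a browser would actually send to the web server.

```python
from typing import Callable, Optional, Tuple


def poll_once(
    fetch_timestamp: Callable[[], str],
    fetch_page: Callable[[], str],
    stored_timestamp: Optional[str],
) -> Tuple[Optional[str], Optional[str]]:
    """Run one polling cycle.

    Returns (new_page, timestamp_to_store). new_page is None when the
    latest stamp matches the stored one, meaning no update is required.
    """
    latest = fetch_timestamp()
    if latest == stored_timestamp:
        # Stamps match: the page has not changed since the previous update.
        return None, stored_timestamp
    # Stamps differ: retrieve the latest version and remember its stamp.
    return fetch_page(), latest
```

A caller would invoke `poll_once` on a timer, re-render only when the first element of the result is not `None`, and pass the returned timestamp into the next cycle.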
  • Publication number: 20100161605
    Abstract: A computer-implemented method is disclosed for determining a type of landing page to which to transfer web searchers that enter a particular query, the method comprising: classifying a landing page as one of a plurality of landing page classes with a trained classifier of a computer based on textual content of the landing page; determining, by the computer, characteristics of one or more queries to be associated with the landing page; and choosing, with the computer, whether to retain or to change classification of the landing page to be associated with the one or more queries based on relative average conversion rates of advertisements on a plurality of manually-classified landing pages when associated with the characteristics of the one or more queries.
    Type: Application
    Filed: December 23, 2008
    Publication date: June 24, 2010
    Applicant: Yahoo! Inc.
    Inventors: Evgeniy Gabrilovich, Andrei Broder, Bo Pang, Vanja Josifovski, Hila Becker
  • Publication number: 20100146381
    Abstract: The present invention provides a method of establishing a plain text document from an HTML document. The method includes the steps of (A) acquiring an HTML document defined by HTML elements, each composed of tags and content between the tags; (B) pre-processing the HTML document by omitting some of the tags (including the content between those tags), whereby the rest of the HTML document comprises at least one target tag (including content between the target tags); (C) using a data structure to store the remaining tags of the pre-processed HTML document; (D) grouping the remaining tags (including the content between the remaining tags) stored in the data structure of the pre-processed HTML document into at least one target group according to the target tag(s); and (E) identifying the target group(s) most related to a title of the HTML document by comparing correlation(s) between the target group(s) and the title, and establishing a plain text document having the content of the identified target group.
    Type: Application
    Filed: December 1, 2009
    Publication date: June 10, 2010
    Applicant: ESOBI INC.
    Inventors: HONG-YANG TSAI, CHI-HAU HUNG
  • Publication number: 20100100522
    Abstract: Term negotiation can utilize centralized systems accessed via web interfaces for purposes such as mediation of communications between buyers and sellers, maintenance of a history of negotiations, and notification of parties regarding changes suggested during negotiation. Changes to terms proposed by parties using centralized systems can be stored in a data warehouse, potentially along with timestamp and identification information.
    Type: Application
    Filed: October 16, 2009
    Publication date: April 22, 2010
    Inventors: Gregory Austin Allison, Matthew Allan Vorst
  • Publication number: 20090157657
    Abstract: Signature schema documents, pre-defined in a query language, provide one or more instructions for application by an engine to transcode web pages of respective web sites. The instructions identify a web page family for the web page and extract a subset of data from the web page using one or more signatures previously identified within web pages of the same web page family (e.g. in accordance with a shared template for each family) of the web site. The instructions may include one or more directional references relative to the signatures to locate and extract the subset of data within the web page. Signatures may comprise text strings within the code of the web page and the directional references indicate positions of respective data relative to the location of the text strings. Transcoding may facilitate use of e-commerce web sites by wireless mobile devices.
    Type: Application
    Filed: May 12, 2008
    Publication date: June 18, 2009
    Inventors: Sang-Heun Kim, Charles Laurence Stinson
  • Publication number: 20090106296
    Abstract: The present invention relates to the field of computer software. More specifically, the present invention relates to methods of assisting aggregation of form-enabled web services. Systems and methods for handling the submission of user data into a plurality of form-enabled web sites are disclosed. The improved system allows for the presentation of a unified user interface, pre-filling of forms in order to increase user efficiency, and a fully automatic interface to the aggregated form-enabled web services.
    Type: Application
    Filed: October 19, 2007
    Publication date: April 23, 2009
    Inventors: David Jonathan Sickmiller, Jonathan Leighton Brown
  • Publication number: 20090099919
    Abstract: An exemplary embodiment of the present invention sets forth a system, method and/or computer program product which may include a graphical user interface (GUI) application embodied on a computer readable medium, which when executed on a processor performs a method. The method may include receiving a playlist that may include a plurality of content of a plurality of different formats; and enabling a presenter to seamlessly deliver a presentation of the plurality of content to an audience.
    Type: Application
    Filed: July 18, 2008
    Publication date: April 16, 2009
    Applicant: Freepath, Inc.
    Inventors: John C. Schultheiss, Louis C. Douros, Adrian R. Pell, Kathryn M. Manley, John D. Stone, Jacob W. Jorgensen
  • Publication number: 20090049062
    Abstract: Techniques are described for organizing structurally similar web pages for a website. Fingerprints are made of the structure of the web pages using shingling by placing the web page's HTML tags and attributes in sequence and encoding the tags and attributes using a standard encoding technique. Fixed-size portions of the encoded sequence are taken and a set of values extracted using independent hash functions to compute the shingles. Alternatively, a DOM tree representation of the HTML of the web page is generated and each path of the DOM tree encoded and values extracted using independent hash functions to compute the shingles. A specified number of shingles are retained as the fingerprint. The pages are then clustered based upon the URL and the similarity of the shingles. The clustered hierarchical organization of pages is further pruned by various criteria including similarity of shingles or support of the cluster node in the hierarchy.
    Type: Application
    Filed: August 14, 2007
    Publication date: February 19, 2009
    Inventors: Krishna Prasad Chitrapura, Krishna Leela Poola
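The structural fingerprinting outlined in the abstract of publication 20090049062 can be sketched along the following lines. This is an assumption-laden illustration rather than the published technique: the names `TagCollector`, `fingerprint`, and `similarity`, the window size, the use of seeded SHA-1 digests as the "independent hash functions", and the min-hash style selection of retained values are all choices made here for the example.

```python
import hashlib
from html.parser import HTMLParser
from typing import List


class TagCollector(HTMLParser):
    """Collects start tags and attribute names in document order, ignoring text."""

    def __init__(self) -> None:
        super().__init__()
        self.tokens: List[str] = []

    def handle_starttag(self, tag, attrs):
        self.tokens.append(tag)
        self.tokens.extend(f"@{name}" for name, _ in attrs)


def structure_tokens(html: str) -> List[str]:
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tokens


def _seeded_hash(seed: int, text: str) -> int:
    # Assumption: a seed-prefixed SHA-1 digest stands in for one of a
    # family of independent hash functions.
    digest = hashlib.sha1(f"{seed}:{text}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def fingerprint(html: str, window: int = 4, num_hashes: int = 8) -> List[int]:
    """Return num_hashes values: the minimum hash of any window, per seed."""
    tokens = structure_tokens(html)
    if len(tokens) < window:
        windows = [" ".join(tokens)]
    else:
        # Fixed-size, overlapping portions of the tag/attribute sequence.
        windows = [
            " ".join(tokens[i:i + window])
            for i in range(len(tokens) - window + 1)
        ]
    return [min(_seeded_hash(seed, w) for w in windows) for seed in range(num_hashes)]


def similarity(fp_a: List[int], fp_b: List[int]) -> float:
    """Fraction of fingerprint positions on which two pages agree."""
    matches = sum(1 for a, b in zip(fp_a, fp_b) if a == b)
    return matches / len(fp_a)
```

Because only tags and attribute names feed the fingerprint, two pages generated from the same template but carrying different text receive identical fingerprints, which is the property a structural clustering step would rely on.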
  • Publication number: 20080275868
    Abstract: A web application and a method for creating complex query strings for conducting searches through at least one database comprising structured documents that are structured in content-fields. The application comprises a GUI with an interactive table enabling users to insert search words, where the user can define the relations between at least some of the search words by the words in the interactive table. The searches through the structured documents' database may be conducted according to the search words and the relations between them. Additionally, the application may allow the user to associate content-fields with at least some of the search words and conduct the searches in the content-fields defined for each search word.
    Type: Application
    Filed: May 1, 2008
    Publication date: November 6, 2008
    Inventor: Yoram Zer
  • Publication number: 20080040337
    Abstract: Method for ordering nodes within hierarchical data. The concept of isolated ordered regions to maintain coordinates of nodes is used by associating each node with coordinates relative to a containing region. Modifications to nodes within a region only affect the nodes in that region, and not nodes in other regions. Traversals that retrieve information from the nodes can rebase the coordinates from their containing region and return with a total order.
    Type: Application
    Filed: August 13, 2007
    Publication date: February 14, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Edison Ting, James Kleewein
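The idea of isolated ordered regions in the abstract of publication 20080040337 can be pictured with a small data structure. The sketch below is a simplified assumption of how such regions might behave, not the published design: `Region`, `local_coordinates`, and `total_order` are hypothetical names, and regions are assumed to be kept in a flat, already ordered list.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Region:
    """An isolated ordered region holding node ids in local order."""
    name: str
    nodes: List[str] = field(default_factory=list)

    def insert(self, index: int, node_id: str) -> None:
        # Only this region's local coordinates shift after an insert.
        self.nodes.insert(index, node_id)

    def local_coordinates(self) -> Dict[str, int]:
        # Coordinates are relative to the containing region.
        return {node_id: i for i, node_id in enumerate(self.nodes)}


def total_order(regions: List[Region]) -> List[Tuple[str, int]]:
    """Rebase each region's local coordinates into one global ordering."""
    ordered: List[Tuple[str, int]] = []
    base = 0
    for region in regions:
        for local, node_id in enumerate(region.nodes):
            ordered.append((node_id, base + local))
        base += len(region.nodes)
    return ordered
```

Inserting a node into one region leaves the stored local coordinates of every other region untouched; the global position is computed only when a traversal asks for it.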
  • Publication number: 20070266050
    Abstract: The present invention provides a method of rendering document data compliant with an XML-based mark-up language, comprising the steps of: fetching the document data; parsing the document data into a document object model (DOM) representation so as to provide a tree structure, comprising nodes representative of the document data elements including tags and/or attributes; reconstructing the document object model (DOM) representation by replacing the nodes of pre-specified elements of said document data elements by one or more nodes comprising standard XML compliant elements having standard tags and attributes; rendering the document data with the reconstructed document object model (DOM) representation.
    Type: Application
    Filed: November 8, 2004
    Publication date: November 15, 2007
    Inventor: Gerardus Kaandorp