Abstract: Automatic extraction of objects in a presentation-oriented document comprises receiving the presentation-oriented document (POD) in which content elements are spatially arranged in a given layout organization for presenting content to human users; receiving a set of descriptors that semantically define the objects to extract from the POD based on attributes comprising the objects; using the set of descriptors to identify content elements in the POD that match the attributes in the set of descriptors defining the objects, and assigning semantic annotations to the identified content elements based on the descriptors; creating a semantic and spatial document model (SSDM) containing spatial structures of the identified content elements in the POD and the semantic annotations assigned to the identified content elements; extracting the identified content elements from the POD based on the set of descriptors and the SSDM to create a set of object instances; and performing at least one of: i) using the object instances to
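
The steps recited above (receive the POD, receive descriptors, annotate matching content elements, build the SSDM, extract object instances) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the source: the names `ContentElement`, `Descriptor`, `SSDM`, `build_ssdm`, and `extract_instances` are hypothetical, attributes are recognised with regular expressions, and elements are grouped into object instances by lying on the same visual row.

```python
import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ContentElement:
    """A piece of text and its position in the layout (hypothetical model)."""
    text: str
    x: float
    y: float


@dataclass
class Descriptor:
    """Defines an object by its attributes. Assumption for this sketch:
    each attribute is recognised by a regular expression."""
    object_name: str
    attributes: dict  # attribute name -> regex pattern


@dataclass
class SSDM:
    """Semantic and spatial document model: each entry pairs a content
    element (which carries its position) with its semantic annotation."""
    annotated: list = field(default_factory=list)


def build_ssdm(elements, descriptor):
    """Annotate every element whose full text matches an attribute pattern."""
    model = SSDM()
    for element in elements:
        for attribute, pattern in descriptor.attributes.items():
            if re.fullmatch(pattern, element.text):
                model.annotated.append((element, attribute))
                break  # the first matching attribute wins
    return model


def extract_instances(model, descriptor, row_tolerance=2.0):
    """Group annotated elements on the same visual row into object instances.
    Row-based grouping is an assumption, not the method of the source."""
    rows = []  # each entry: (row_y, {attribute: text})
    ordered = sorted(model.annotated, key=lambda pair: (pair[0].y, pair[0].x))
    for element, attribute in ordered:
        if rows and abs(rows[-1][0] - element.y) <= row_tolerance:
            rows[-1][1][attribute] = element.text
        else:
            rows.append((element.y, {attribute: element.text}))
    # Keep only rows that supply every attribute named in the descriptor.
    required = set(descriptor.attributes)
    return [values for _, values in rows if required <= set(values)]


# Usage: a toy price list with one footer element that matches no attribute.
elements = [
    ContentElement("Widget", 10, 100),
    ContentElement("$9.99", 200, 100),
    ContentElement("Gadget", 10, 120),
    ContentElement("$24.50", 200, 121),
    ContentElement("Page 1", 10, 300),
]
product = Descriptor("product", {"name": r"[A-Z][a-z]+", "price": r"\$\d+\.\d{2}"})
model = build_ssdm(elements, product)
instances = extract_instances(model, product)
```

In this toy input the footer "Page 1" matches no attribute pattern and is left out of the model, while the two product rows each yield one instance with a `name` and a `price`. A real system would likely need richer spatial structures than a single row tolerance; the sketch only shows how descriptors, annotations, and layout positions can combine.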