Abstract: A method of creating a new XML document having at least a root element and a declaration. The method comprises retrieving from storage a new fragment XML document comprising at least one XML template for a new XML file that itself has a root element. Then, at least one XML template is selected and the selected XML template is used to create an XML document. User and programmer interfaces, as well as device and system structures that can implement the method, also are provided.
Abstract: A method of propagating changes from a DOM tree. The method comprises watching a subset of the DOM tree, the subset matching a template. A change is made in the watched subset of the DOM tree. A representation of the DOM tree is automatically updated to reflect the change in the watched subset.
Abstract: A source DOM tree represents at least a part of a document. A destination DOM tree corresponds to the source DOM tree. At least one stealth node is provided in the destination DOM tree. The stealth node has no corresponding node in the source DOM tree. The stealth node is provided at a location corresponding to a location in the source DOM tree where an insertion of a new node is anticipated.
Abstract: Systems and methods providing computer implemented depiction encoding production constructed from one or more depictions, where, for each of one or more depictions, an encoding collection encoding a narrative account is chosen from the depiction, and where, for each chosen encoding collection, an encoding collection is established from the chosen encoding collection, where one or more expression styles from the chosen encoding collection may be replaced with different corresponding expression styles, and where a depiction encoding is assembled from the established encoding collections, such that the narrative account encoded in the assembled depiction encoding is comprised of the narrative accounts of the chosen encoding collections.
Abstract: Systems and methods providing automated implementation of production characteristics for a narrative event, where a presentation criterion encoding the production characteristics is interpreted and selects a set of algorithms to use to implement those production characteristics, and where those algorithms implement the production characteristics for the event content representing the narrative event as a presentation criterion production collection and a presentation criterion integration specification, such that for an event depiction from the event content integrated with the presentation criterion production collection according to the presentation criterion integration specification, that event depiction reflects the production characteristics.
Abstract: A method and system for identifying the language of a textual passage are disclosed. The method and system include parsing the textual passage into n-grams, assigning an initial weight to each n-gram, and adjusting the weight initially assigned to a word or n-gram parsed from the textual passage. The initially assigned weight is adjusted in proportion to the inverse of the number of languages within which such words or n-grams appear. Reducing the weight assigned to such words or n-grams diminishes, without completely eliminating, their importance in comparison to other words or n-grams parsed from the same textual passage when determining the language of a passage. The method and system of the present invention appropriately weight the short words or n-grams common to multiple languages without affecting the short words or n-grams that are uncommon to several languages.
Type:
Grant
Filed:
January 14, 2004
Date of Patent:
April 15, 2008
Assignee:
Clairvoyance Corporation
Inventors:
Xiang Tong, Gregory T. Grefenstette, David A. Evans
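The inverse-language-count weighting described in the language identification abstract above can be illustrated with a minimal sketch. The toy `PROFILES` table, the trigram size, and the function names are assumptions made for illustration only; the abstract does not specify how profiles are stored or how the final scores are combined.

```python
from collections import Counter

# Hypothetical toy profiles: the n-grams assumed to be known for each language.
PROFILES = {
    "en": {"the", "he ", " th", "de "},
    "fr": {"le ", "de ", " de", "es "},
    "es": {"el ", "de ", " de", "es "},
}


def ngrams(text, n=3):
    """Split text into overlapping character n-grams (lowercased)."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def adjusted_weight(gram, initial=1.0):
    """Scale the initial weight by 1 / (number of languages containing gram).

    An n-gram shared by many languages keeps a reduced, non-zero weight;
    an n-gram unique to one language keeps its full initial weight.
    """
    language_count = sum(1 for grams in PROFILES.values() if gram in grams)
    return initial / language_count if language_count else 0.0


def score_languages(text):
    """Sum the adjusted weights of the passage's n-grams per language."""
    scores = Counter()
    for gram in ngrams(text):
        weight = adjusted_weight(gram)
        for lang, grams in PROFILES.items():
            if gram in grams:
                scores[lang] += weight
    return scores
```

In this sketch the trigram "de ", which appears in all three toy profiles, contributes one third of the weight of "the", which appears only in the English profile.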
Abstract: An information need can be modeled by a binary classifier such as a support vector machine (SVM). SVMs can exhibit very conservative, precision-oriented behavior when modeling information needs. This conservative behavior can be overcome by adjusting the position of the hyperplane, the geometric representation of an SVM. The present invention describes a couple of automatic techniques for adjusting the position of an SVM model based upon a beta-gamma thresholding procedure, cross-fold validation and retrofitting. This adjustment technique can also be applied to other types of learning strategies.
Type:
Grant
Filed:
April 12, 2004
Date of Patent:
April 8, 2008
Assignee:
Clairvoyance Corporation
Inventors:
James G. Shanahan, Norbert Roma, David A. Evans
Abstract: A method and system for generating a workflow graph from empirical data of a process are described. A processing system obtains data corresponding to multiple instances of a process, the process including a set of tasks, the data including information about order of occurrences of the tasks. The processing system analyzes the occurrences of the tasks to identify order constraints. The processing system partitions nodes representing tasks into subsets based upon the order constraints, wherein the subsets are sequence ordered with respect to each other such that all nodes associated with a given subset either precede or follow all nodes associated with another subset. The processing system partitions nodes representing tasks into subgroups, wherein each subgroup includes one or more nodes that occur without order constraints relative to nodes associated with other subgroups. A workflow graph representative of the process is constructed wherein nodes are connected by edges.
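The order-constraint step in the workflow abstract above can be sketched as follows. This is only an illustrative reading, assuming each process instance is recorded as a list of task names in order of occurrence and that each task occurs at most once per instance; the function name `order_constraints` is hypothetical, and the partitioning and graph construction steps are not shown.

```python
from itertools import permutations


def order_constraints(traces):
    """Return pairs (a, b) where a precedes b in every trace containing both.

    Assumes each task appears at most once per trace.
    """
    tasks = {task for trace in traces for task in trace}
    constraints = set()
    for a, b in permutations(tasks, 2):
        together = [trace for trace in traces if a in trace and b in trace]
        if together and all(t.index(a) < t.index(b) for t in together):
            constraints.add((a, b))
    return constraints


# Two observed instances of the same process: B and C occur in either order.
traces = [["A", "B", "C", "D"], ["A", "C", "B", "D"]]
constraints = order_constraints(traces)
```

Here A precedes every other task and D follows every other task, while B and C carry no constraint relative to each other, so they would be candidates for separate, unordered subgroups.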
Abstract: A computer-readable medium comprises a data structure for providing information about levels of similarity between pairs of N documents. The data structure comprises a plurality of entries of similarity values representing levels of similarity for a plurality of pairs of the documents. Each of the similarity values represents a level of similarity of one document of a given pair relative to the other document of the given pair. The similarity value of each entry is greater than a threshold similarity value that is greater than zero. The plurality of similarity-value entries are fewer than $N^2 - N$ in number if the similarity values are asymmetric with regard to document pairing, and the plurality of similarity-value entries are fewer than $(N^2 - N)/2$ in number if the similarity values are symmetric with regard to document pairing. A method and apparatus for generating the data structure are described.
Type:
Application
Filed:
December 12, 2005
Publication date:
June 14, 2007
Applicant:
Clairvoyance Corporation
Inventors:
James Shanahan, Norbert Roma, David Evans
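The thresholded similarity structure in the abstract above can be sketched for the symmetric case as a dictionary keyed by document-index pairs. The Jaccard word-overlap measure and the names `jaccard` and `build_sparse_similarity` are assumptions chosen for illustration; the abstract does not name a similarity measure or a storage layout.

```python
def jaccard(a, b):
    """Symmetric word-overlap similarity between two whitespace-tokenized texts."""
    set_a, set_b = set(a.split()), set(b.split())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def build_sparse_similarity(docs, similarity, threshold):
    """Keep only pairs whose similarity exceeds a positive threshold.

    For a symmetric measure each unordered pair (i, j) with i < j is
    considered once, so at most (N^2 - N) / 2 entries can be stored.
    """
    if threshold <= 0:
        raise ValueError("threshold must be greater than zero")
    entries = {}
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            value = similarity(docs[i], docs[j])
            if value > threshold:
                entries[(i, j)] = value
    return entries


docs = ["a b c", "a b d", "x y z"]
entries = build_sparse_similarity(docs, jaccard, threshold=0.2)
```

With three documents the symmetric bound is (9 - 3) / 2 = 3 pairs; only the pair (0, 1) exceeds the threshold here, so a single entry is stored.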
Abstract: A method for identifying clusters of similar documents from among a set of documents is described. A particular document is selected based on rank from among a ranked set of documents, wherein the ranked set of documents are included among available documents of the set of documents. A probe is generated based on the particular document. The probe comprises one or more features. Documents that satisfy a similarity condition are found from among the available documents using a search based upon the probe. Some or all documents found are associated with a particular cluster of documents. The process can be repeated to generate further clusters. The method can be implemented with a computer, and associated programming instructions can be contained within a computer readable carrier.
Type:
Application
Filed:
November 15, 2005
Publication date:
May 17, 2007
Applicant:
Clairvoyance Corporation
Inventors:
David Evans, Victor Sheftel, Jeffrey Bennett, David Hull
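The select, probe, search, and repeat loop described in the clustering abstract above might look roughly like the sketch below. Several choices are assumptions for illustration: documents are taken to be listed in rank order already, the probe is simply the set of distinct words in the selected document, the similarity condition is a minimum count of shared probe terms, and every matching document (rather than only some) joins the cluster.

```python
def make_probe(doc):
    """Hypothetical probe: the distinct words of the selected document."""
    return set(doc.split())


def cluster_by_probe(docs, min_overlap=2):
    """Repeatedly seed a cluster from the top-ranked available document."""
    available = list(range(len(docs)))  # assumed to be in rank order
    clusters = []
    while available:
        seed = available[0]
        probe = make_probe(docs[seed])
        # Search the available documents for those sharing enough probe terms.
        members = [
            i for i in available
            if len(probe & set(docs[i].split())) >= min_overlap
        ]
        if seed not in members:
            members.insert(0, seed)  # the seed always belongs to its own cluster
        clusters.append(members)
        available = [i for i in available if i not in members]
    return clusters


docs = [
    "apple banana cherry",
    "apple banana date",
    "x y z",
    "x y w",
]
clusters = cluster_by_probe(docs)
```

Removing clustered documents from the available pool before the next pass is what lets the process repeat and generate further clusters.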
Abstract: A method for identifying clusters of similar documents from among a set of documents is described. A particular document is selected from among available documents of the set of documents, and a probe is generated based on the particular document. The probe comprises one or more features. Documents are found that satisfy a similarity condition using the probe from among the available documents. Some or all of the documents that satisfy the similarity condition are associated with a particular cluster of documents. The process can be repeated to generate further clusters. The method can be implemented with a computer, and associated programming instructions can be contained within a computer readable carrier.
Type:
Application
Filed:
November 15, 2005
Publication date:
May 17, 2007
Applicant:
Clairvoyance Corporation
Inventors:
David Evans, Victor Sheftel, Jeffrey Bennett
Abstract: A novel approach for filtering documents involves the use of a delivery ratio threshold setting technique to set an initial profile score threshold and the use of beta-gamma regulation for dynamic threshold updating. A group of documents is scored pursuant to a user profile. The score for each document is indicative of the relevance of the corresponding document to the user profile. The score can be compared with a profile score threshold to decide if the document should be accepted or rejected. According to one aspect of the invention, the initial threshold is set to a score threshold that approximates an expected ratio of acceptable documents calibrated with respect to a set of reference documents. According to another aspect of the invention, the score threshold can be updated based on the accumulated example documents, the user's relevance judgments, and the user's utility function. The accumulated example documents are first scored against a profile and a ranked list of scored documents is obtained.
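The initial delivery ratio threshold in the filtering abstract above can be sketched as picking the score at the rank that corresponds to the expected ratio within a scored reference set. The function names and the rounding rule are assumptions for illustration; the beta-gamma updating step is not shown because the abstract does not give its formula.

```python
import math


def delivery_ratio_threshold(reference_scores, ratio):
    """Return the score such that roughly `ratio` of reference docs reach it.

    reference_scores: profile scores of the reference documents.
    ratio: expected fraction of documents to deliver, with 0 < ratio <= 1.
    """
    if not reference_scores or not 0 < ratio <= 1:
        raise ValueError("need reference scores and a ratio in (0, 1]")
    ranked = sorted(reference_scores, reverse=True)
    k = max(1, math.ceil(ratio * len(ranked)))  # number of documents to deliver
    return ranked[k - 1]


def accept(score, threshold):
    """Accept a document whose profile score reaches the threshold."""
    return score >= threshold


threshold = delivery_ratio_threshold([0.9, 0.2, 0.6, 0.4], ratio=0.5)
```

With four reference scores and a ratio of 0.5, the threshold lands on the second-highest score, so two of the four reference documents would be accepted.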
Abstract: A document image that is the source of Optical Character Recognition (OCR) output is displayed so that a user can select a region of the displayed document image. When the region is selected, text of the OCR output corresponding to the selected region is submitted as an input to a search engine.