Parsing Data Structures And Data Objects Patents (Class 707/755)
  • Publication number: 20150066963
    Abstract: A method of structured event log data entry. A structured event log data entry algorithm is provided including a text parser which utilizes stored information including a vocabulary for a human operated process defining classes representing structured information including structured events and structured devices, linguistic patterns for plain text analysis, and relationships between the classes. A pattern proposer utilizes stored linking relations establishing links between the classes and the linguistic patterns. An operator in the system records a plain text event log entry describing an event that occurred in the system into a memory accessible by a processor implementing the algorithm. The text parser automatically evaluates the log entry for a match with any portion of any of the linguistic patterns. The pattern proposer presents a proposed structured event log entry including a structured event and/or structured device for the operator to review whenever the matching is successful.
    Type: Application
    Filed: August 29, 2013
    Publication date: March 5, 2015
    Applicant: Honeywell International Inc.
    Inventors: KAREL MACEK, VIT LIBAL, JAMES SCHREDER
  • Publication number: 20150066964
    Abstract: Provided is a knowledge extracting apparatus for extracting knowledge information related to a knowledge-extraction target from an electronic document distributed continually in a state where the electronic document is not associated with the knowledge-information extraction target. A knowledge extracting apparatus according to one embodiment is a knowledge extracting apparatus including: an information receiving section for receiving an electronic document; a knowledge extracting section for extracting a concept from the electronic document based on a target word to extract knowledge information and a clue word to extract knowledge information and forming knowledge information in which the concept thus extracted and the target word are associated with each other; a storage section for storing the knowledge information thus extracted; and an information analysis section for, after the knowledge information is stored, analyzing the electronic document based on the knowledge information in the storage section.
    Type: Application
    Filed: November 17, 2014
    Publication date: March 5, 2015
    Applicants: Kabushiki Kaisha Toshiba, Toshiba Solutions Corporation
    Inventors: Kyoko MAKINO, Shigeaki SAKURAI, Shigeru MATSUMOTO, Shozo ISOBE, Kazuyoshi NISHI, Yoshimi SAITO, Hiroyuki SUZUKI, Yoshinori MASAOKA
  • Patent number: 8972424
    Abstract: A system and related method for the electronic processing of text onto a two-dimensional coordinate system to analyze the attitudinal mindset associated with the text. The system and related method may also be employed to generate text based on a desired attitudinal mindset to impart. The system includes a computer system embodying functions that enable a user to analyze the text. The system includes one or more functions to parse attitudinal words and objective words and associate two-dimensional coordinates with the subjective words. The system further includes one or more functions for mapping the associated two-dimensional coordinates to show the geographic locations of each attitudinal word of the text in relation to each other attitudinal word of the text. The system decomposes attitudinal words into attitudinal equivalence and reference category and enables the generation of a report of the mindset associated with the analyzed text.
    Type: Grant
    Filed: November 9, 2012
    Date of Patent: March 3, 2015
    Inventor: Peter Snell
  • Patent number: 8972423
    Abstract: A system, method, and computer program for parsing a schema across a system to support interoperable machine-to-machine interaction over a network, comprising the steps of communicating a plurality of data in a data defining mark-up language file by a transport protocol stack; parsing said data defining mark-up language to determine at least one opaque schema element; and translating said at least one opaque schema element to a mark-up language string element and appropriate means and computer-readable instructions.
    Type: Grant
    Filed: September 26, 2006
    Date of Patent: March 3, 2015
    Assignee: Siemens Product Lifecycle Management Software Inc.
    Inventors: Puneet Vardhan, Ronald Marchi
  • Patent number: 8972413
    Abstract: Methods and comment association systems for associating one or more comments with one or more primary electronic documents are described. In one aspect, the method comprises: identifying, at a comment association system, one or more key terms from at least a portion of the one or more primary electronic documents; identifying, at the comment association system, one or more comments associated with the identified key terms; determining, at the comment association system, whether an identified comment is sufficiently related to the one or more primary electronic documents by calculating one or more relation scores for that identified comment and comparing the relation scores to one or more thresholds; and if the identified comment is sufficiently related to the one or more primary electronic documents, then associating the identified comment with the one or more primary electronic documents at the comment association system.
    Type: Grant
    Filed: October 5, 2011
    Date of Patent: March 3, 2015
    Assignee: Rogers Communications Inc.
    Inventors: Hyun Chul Lee, Liqin Xu, Ke Zeng
  • Patent number: 8972372
    Abstract: Systems and methods are disclosed for receiving a first specification that identifies program code behavior associated with a plurality of documents. The specification includes an input-output pair with a first data entity and a second data entity. The systems and methods further include identifying one or more documents, within the plurality of documents, that are configured to (i) use at least a portion of the first data entity as an input to program code associated with particular ones of the documents, and (ii) provide at least a portion of the second data entity as output associated with the program code, wherein the particular ones of the documents correspond to a positive matching between one or more constraints associated with each document and one or more constraints associated with the specification, and generating search results comprising the identified one or more documents.
    Type: Grant
    Filed: April 17, 2013
    Date of Patent: March 3, 2015
    Assignee: NUtech Ventures
    Inventors: Sebastian Elbaum, Kathryn Stolee
  • Patent number: 8972425
    Abstract: A method is provided for parsing a document having a plurality of lines on which items are listed spanning one or more lines. It includes: obtaining a plurality of candidates, representing hypothetical items within the document, each candidate spanning one or more lines and having a local cost representing a confidence in a quality of the candidate compared to a model; determining labeling costs for intervals of the document defined between pairs of lines, each interval containing candidates therein, each labeling cost reflecting a configuration of the candidates within the interval; identifying a best labeling for each interval based on the labeling costs determined for that interval, the best labeling corresponding to one of the configurations of the candidates within the interval; defining a global objective function; and selecting a subset of the candidates such that the global objective function is optimized, based on the identified best labelings.
    Type: Grant
    Filed: March 13, 2013
    Date of Patent: March 3, 2015
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Christina Pavlopoulou, Evgeniy Bart, Eric Saund
  • Publication number: 20150058348
    Abstract: A first set of contextual dimensions is generated from one or more textual descriptions associated with a given event, which includes one or more examples. A second set of contextual dimensions is generated from one or more visual features associated with the given event, which includes one or more visual example recordings. A similarity structure is constructed from the first set of contextual dimensions and the second set of contextual dimensions. One or more of the textual descriptions is matched with one or more of the visual features based on the similarity structure.
    Type: Application
    Filed: August 26, 2013
    Publication date: February 26, 2015
    Inventors: Liangliang Cao, Yuan-Chi Chang, Quoc-Bao Nguyen
  • Publication number: 20150052160
    Abstract: A computer-implemented method includes receiving a request to search for other users who are associated with at least a threshold level of similarity to the requesting user; accessing information indicative of a patient profile of the requesting user; determining one or more attributes of the requesting user; searching a data repository for information indicative of a user associated with one or more attributes corresponding to at least one of the one or more attributes of the requesting user; identifying, based on searching, a user associated with one or more attributes corresponding to at least one of the one or more attributes of the requesting user; determining that the one or more corresponding attributes of the identified user satisfy the threshold level of similarity; and transmitting information indicative of the identified user, with the transmitted information specifying the identified user as being a peer of the requesting user.
    Type: Application
    Filed: August 15, 2013
    Publication date: February 19, 2015
    Applicant: Universal Research Solutions, LLC
    Inventor: Ali Adel Hussam
  • Patent number: 8959096
    Abstract: A data structure for storing items of information having a time of validity includes a validity interval for each item of information, and methods for making and using the same. The items of information are organized in a data structure having nodes and edges connecting the nodes. This data structure is preferably a directed acyclic graph structure. The data structure includes parent nodes and child nodes. The validity interval specified for any child node generally is contained within the validity interval specified for that child node's parent node, such that the data structure includes no child nodes with a validity interval that falls outside of the validity interval of its parent node.
    Type: Grant
    Filed: February 18, 2014
    Date of Patent: February 17, 2015
    Assignee: Barr Rosenberg
    Inventor: Barr Rosenberg
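The containment rule in the abstract above (a child's validity interval must fall within its parent's) can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Node` class, its field names, and the use of calendar dates are assumptions made here, and the sketch uses a single-parent tree rather than the directed acyclic graph the abstract prefers.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Node:
    """Hypothetical node: one item of information plus its validity interval."""
    name: str
    valid_from: date
    valid_to: date
    children: list["Node"] = field(default_factory=list)

    def add_child(self, child: "Node") -> None:
        # Enforce containment: reject a child whose interval starts before
        # or ends after the parent's interval.
        if child.valid_from < self.valid_from or child.valid_to > self.valid_to:
            raise ValueError(
                f"{child.name}: validity interval falls outside parent {self.name}"
            )
        self.children.append(child)


year = Node("fiscal-2014", date(2014, 1, 1), date(2014, 12, 31))
year.add_child(Node("q1-price", date(2014, 1, 1), date(2014, 3, 31)))
```

In a graph with several parents per node, the same check would have to hold against every parent before the edge is added.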
  • Patent number: 8959100
    Abstract: A system and method for Context Enhanced Mapping. A request is received from a user over a network for a map comprising an identification of a physical location, and at least one criterion. The physical location is mapped. Spatial, temporal, topical, and social data available to the network relating to the physical location and the at least one criterion is retrieved using a global index of data available to the network and prioritized for inclusion based upon the user and context of the request. The map of the physical location and at least some of the retrieved spatial, temporal, topical, and social data is displayed on a display medium.
    Type: Grant
    Filed: November 11, 2013
    Date of Patent: February 17, 2015
    Assignee: Yahoo! Inc.
    Inventors: Christopher William Higgins, Marc Eliot Davis, Ronald Martinez, Joseph James O'Sullivan, Christopher T. Paretti, Chris Kalaboukis, Athellina Athsani
  • Publication number: 20150046481
    Abstract: Computer-implemented systems and methods are disclosed for constructing a parser that parses complex data. In some embodiments, a method is provided for receiving a parser definition as an input to a parser generator and generating a parser at least in part from the parser definition. In some embodiments, the generated parser comprises two or more handlers forming a processing pipeline. In some embodiments, the parser receives as input a first string into the processing pipeline. In some embodiments, the parser generates a second string by a first handler and inputs the second string regeneratively into the parsing pipeline, if the first string matches an expression specified for the first handler in the parser definition.
    Type: Application
    Filed: October 28, 2014
    Publication date: February 12, 2015
    Inventor: Mark ELLIOT
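The regenerative pipeline described in the abstract above can be sketched roughly as below. The `Handler` type, the `run_pipeline` function, the `max_passes` guard, and the two example handlers are all hypothetical names chosen for illustration; the abstract does not specify how handlers or the parser definition are represented.

```python
import re
from typing import Callable, NamedTuple


class Handler(NamedTuple):
    """Hypothetical handler: an expression plus a function producing a new string."""
    pattern: re.Pattern
    rewrite: Callable[[re.Match], str]


def run_pipeline(handlers: list[Handler], text: str, max_passes: int = 10) -> str:
    """Feed text through the handlers; on a match, re-enter the pipeline."""
    for _ in range(max_passes):  # guard against handlers that never converge
        for handler in handlers:
            match = handler.pattern.fullmatch(text)
            if match:
                # Regenerative step: the handler's output becomes the new
                # input and processing restarts from the first handler.
                text = handler.rewrite(match)
                break
        else:
            return text  # no handler matched, so parsing is finished
    return text


handlers = [
    Handler(re.compile(r'"(.*)"'), lambda m: m.group(1)),   # strip outer quotes
    Handler(re.compile(r"raw:(.*)"), lambda m: m.group(1)),  # strip a "raw:" prefix
]
result = run_pipeline(handlers, '"raw:hello"')
```

Here the first handler unwraps the quotes, the resulting string is fed back in, and the second handler then removes the prefix, leaving `hello`.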
  • Patent number: 8954454
    Abstract: Methods and apparatus are presented for aggregating data from disparate sources into an efficiently accessible format. For example, an aggregation tool may receive attribute-based data from one source and metrics-based data from another source. Given this data, the aggregation tool may store attribute data from the attribute-based data into a data object, where the data object includes multiple time slots corresponding to defined time ranges. The aggregation tool may then determine from the metrics-based data, respective metrics data for each of the multiple time slots of the data object, where each time slot is associated with the attribute data. The aggregation tool may store the respective metrics data into each of the multiple time slots of the data object. In this way, the data object may serve to efficiently provide an answer to a query requiring data from multiple data sources.
    Type: Grant
    Filed: October 12, 2012
    Date of Patent: February 10, 2015
    Assignee: Adobe Systems Incorporated
    Inventors: Nicholas J. Brown, David L. Cardon, Jason A. Carter
  • Patent number: 8954458
    Abstract: Systems and methods are provided for identifying unsolicited or unwanted electronic communications, such as spam. The disclosed embodiments also encompass systems and methods for selecting content items from a content item database. Consistent with certain embodiments, computer-implemented systems and methods may use a clustering based statistical content matching anti-spam algorithm to identify and filter spam. Such an anti-spam algorithm may be implemented to determine a degree of similarity between an incoming e-mail and a collection of one or more spam e-mails stored in a database. If the degree of similarity exceeds a predetermined threshold, the incoming e-mail may be classified as spam. Further, in accordance with other embodiments, systems and methods may be provided to determine a degree of similarity between a query or search string from a user and content items stored in a database.
    Type: Grant
    Filed: July 11, 2011
    Date of Patent: February 10, 2015
    Assignee: AOL Inc.
    Inventors: Santhosh Baramasagara Chandrasekharappa, Sivakumar Ekambaram, Saurabh Sohoney, Rakesh Nigam
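The threshold test in the abstract above (classify as spam when similarity to stored spam exceeds a predetermined value) can be illustrated with a simple stand-in measure. Note that this sketch uses Jaccard similarity over word sets, which is not the clustering based statistical algorithm the abstract names; the function names and the threshold of 0.6 are assumptions for illustration only.

```python
def tokens(text: str) -> set[str]:
    """Lower-cased set of whitespace-separated words."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Share of distinct words the two texts have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def is_spam(message: str, known_spam: list[str], threshold: float = 0.6) -> bool:
    # Classify as spam if similarity to any stored spam message
    # exceeds the (assumed) predetermined threshold.
    return any(jaccard(message, spam) > threshold for spam in known_spam)


known_spam = ["win a free prize now"]
flagged = is_spam("win a free prize today", known_spam)
passed = is_spam("meeting moved to friday", known_spam)
```

The first message shares four of six distinct words with the stored spam and is flagged; the second shares none and is not.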
  • Patent number: 8954455
    Abstract: A user saves a structured query defining connections between two or more objects maintained by a social networking system. The social networking system finds objects matching the structured query, either by periodically performing searches for new objects or by analyzing objects as they are added or modified. The user creating the saved query can subsequently view the matching objects.
    Type: Grant
    Filed: December 28, 2012
    Date of Patent: February 10, 2015
    Assignee: Facebook, Inc.
    Inventors: Ken Deeter, Thomas Stocky, Robyn David Morris
  • Patent number: 8954092
    Abstract: A computing system extracts, based on one or more electronic messages sent or received by a user of a mobile computing device, travel plan information associated with the user of the mobile computing device. The travel plan information may indicate a destination to which the user is planning to travel. In response to extracting the travel plan information, the computing system may send an instruction to the mobile computing device to cache, in advance of the user arriving at the destination, information associated with the destination. In this way, the mobile computing device may access the information associated with the destination while at the destination, even if the mobile computing device is unable to access the information via a wireless communication channel.
    Type: Grant
    Filed: October 22, 2012
    Date of Patent: February 10, 2015
    Assignee: Google Inc.
    Inventors: Andrew Kirmse, Dale Hawkins, Ronghui Zhu
  • Patent number: 8954453
    Abstract: In accordance with embodiments, there are provided mechanisms and methods for determining whether a developed application associated with an on-demand database service will operate properly with at least one other application. These mechanisms and methods for providing such determination can enable embodiments to ensure that new versions of developed applications will operate in the same application environment of a previous version. The ability of embodiments to make such determination may lead to an improved application migration development/runtime framework, etc.
    Type: Grant
    Filed: February 16, 2012
    Date of Patent: February 10, 2015
    Assignee: salesforce.com, inc.
    Inventor: Craig Weissman
  • Patent number: 8954434
    Abstract: The present technology is related to identifying, from within a corpus of documents, a subject (e.g., person, location, date, etc.) that is relevant to a topic and that is usable to enhance a topic-describing document. Documents within the corpus of documents share a link structure, such that some documents include hyperlinks that enable navigation to the topic-describing document, and the topic-describing document includes hyperlinks that enable navigation to other documents. Text of documents within the corpus is parsed to identify the subject, and a context of the subject suggests a degree of relevance of the subject to the topic. An enhancement type of the subject is determined, and a version of the topic-describing document is enhanced to include a presentation of the subject.
    Type: Grant
    Filed: January 8, 2010
    Date of Patent: February 10, 2015
    Assignee: Microsoft Corporation
    Inventors: David Dongjah Ahn, Michael Paul Bieniosek, Franco Salvetti, Giovanni Lorenzo Thione, Ian Robert Collins, Toby Takeo Sterrett
  • Publication number: 20150039636
    Abstract: Modeling genealogical trees that span multiple pages can include the creation and use of navigable links between related nodes. When it is determined that a display layout of a genealogical tree will span a plurality of viewable pages by a document viewer, a descendent node is identified that genealogically links directly to a related ancestor node on another page. A selectable ancestor page link is then created and displayed proximate the descendent relative node which, when selected, causes the viewer to render the page containing the ancestor relative node. A selectable descendent page link is also created and displayed proximate the ancestor relative node which, when selected, causes the viewer to render the particular page containing the descendent relative node. Intelligent formatting can also be used to identify and remove or refrain from displaying duplicate branches of the genealogical tree.
    Type: Application
    Filed: February 7, 2014
    Publication date: February 5, 2015
    Applicant: BRIGHAM YOUNG UNIVERSITY
    Inventors: Thomas W. Sederberg, William A. Barrett
  • Patent number: 8949255
    Abstract: Techniques are provided for storing files in a parallel computing system using sub-files with semantically meaningful boundaries. A method is provided for storing at least one file generated by a distributed application in a parallel computing system. The file comprises one or more of a complete file and a plurality of sub-files. The method comprises the steps of obtaining a user specification of semantic information related to the file; providing the semantic information as a data structure description to a data formatting library write function; and storing the semantic information related to the file with one or more of the sub-files in one or more storage nodes of the parallel computing system. The semantic information provides a description of data in the file. The sub-files can be replicated based on semantically meaningful boundaries.
    Type: Grant
    Filed: June 28, 2012
    Date of Patent: February 3, 2015
    Assignees: EMC Corporation, Los Alamos National Security, LLC
    Inventors: Sorin Faibish, John M. Bent, Percy Tzelnic, Gary Grider, Aaron Torres
  • Patent number: 8949256
    Abstract: One or more embodiments of the disclosure include systems and methods for obtaining information from electronic documents (e.g., web pages). Example embodiments include retrieving an electronic document, parsing the electronic document to identify multiple portions of the electronic document, and comparing the portions to identify information about the electronic document, such as the owner of the electronic document. Further, the identified information can be associated with the electronic document within a database.
    Type: Grant
    Filed: September 12, 2012
    Date of Patent: February 3, 2015
    Assignee: Facebook, Inc.
    Inventor: Ajaipal Singh Virdy
  • Publication number: 20150032764
    Abstract: A parallel tree labeling apparatus and method for processing an eXtensible Markup Language (XML) document. The parallel tree labeling apparatus for processing an XML document includes a data distributor configured to divide the XML document into a plurality of data blocks; and a labeling component configured to receive elements of each of the plurality of data blocks, perform a labeling procedure on the plurality of data blocks in parallel, and generate a final label by combining partial labels.
    Type: Application
    Filed: July 28, 2014
    Publication date: January 29, 2015
    Inventors: Kyong-Ha LEE, Hye-Bong CHOI, Won-Joo PARK, Kee-Seong CHO, Won RYU
  • Publication number: 20150032765
    Abstract: A method according to one embodiment includes determining the presence of pre-existing metadata associated with at least one local media content file. The method of this embodiment may also include determining at least one data field contained within the pre-existing metadata and generating a homogeneous metadata file for the at least one local media content file by mapping data contained within the at least one data field of the pre-existing metadata into at least one defined data field of the homogeneous metadata file.
    Type: Application
    Filed: September 10, 2014
    Publication date: January 29, 2015
    Inventor: Eric N. Klein, Jr.
  • Patent number: 8943075
    Abstract: An object class is disclosed. The object class comprises a mixed-tenanted object class. An instance of the mixed-tenanted object class is indicated as tenanted or is indicated as non-tenanted.
    Type: Grant
    Filed: October 31, 2008
    Date of Patent: January 27, 2015
    Assignee: Workday, Inc.
    Inventors: Salvador Maiorano, Kashif Qayyum, Jon Ruggiero
  • Patent number: 8943076
    Abstract: Profiles associated with two applications are received. Each profile identifies a set of data fields identified by a corresponding full path name. Associations between data fields of the profiles are identified based on mapping pairs included in a full path mapping database, mapping pairs included in a shortest unique path mapping database, and mapping pairs included in a leaf mapping database. A prioritized list of mapping suggestions is provided based on the identified associations. A mapping suggestion can include a data manipulation operation according to information associated with a corresponding mapping pair.
    Type: Grant
    Filed: February 6, 2012
    Date of Patent: January 27, 2015
    Assignee: Dell Products, LP
    Inventors: Mitchell J. Stewart, James T. Ahlborn
  • Patent number: 8943027
    Abstract: Methods, systems, and computer readable media for content item purging are provided. A content item purger, such as may be incorporated within a local client application of a content management system running on a user device, may leverage knowledge as to which items have been uploaded to the content management system, and how long such content items have been stored on the user device, to propose items for deletion from the user device so as to reclaim storage space. A content item purger may run on one or more user devices, and may activate upon various triggering events, based on various conditions and parameters, with or without user interaction, thus maintaining available memory capacity at all times.
    Type: Grant
    Filed: November 20, 2013
    Date of Patent: January 27, 2015
    Assignee: Dropbox, Inc.
    Inventors: Michael Dwan, Anthony Grue, Daniel Kluesing
  • Patent number: 8943062
    Abstract: A first server is configured to receive one or more summarized data groups from a second server. Each summarized data group may include: information regarding a quantity of a group of records, where the group of records includes records associated with a record type and a time interval; information regarding a quantity of records associated with an indicator within the group of records; and information regarding a failure rate associated with the group of records based on the quantity of records associated with the group of records and the quantity of records associated with the indicator within the group of records. The first server is further configured to determine a threshold based on the summarized data groups and based on the failure rates associated with the summarized data groups and send an indication to the client device based on determining that the failure rate does not satisfy the threshold.
    Type: Grant
    Filed: June 15, 2012
    Date of Patent: January 27, 2015
    Assignee: Cellco Partnership
    Inventors: Jeffrey L. Baumgartner, Eric W. Baumgartner, Michael W. Monsey
  • Publication number: 20150026200
    Abstract: A computer-implemented method of extracting data from a document in an electronic format. The method includes the steps of accessing a file in an electronic format from a memory module; extracting data from the file corresponding to a plurality of keys contained within a mapping structure stored in the memory module; organizing the extracted data into values, wherein each value maps to one of the plurality of keys to form a hash map; storing the hash map in a database; and providing a user access to the database via an output device. The output device allows the user to view a customizable document whose content is derived from the values and keys stored in the database.
    Type: Application
    Filed: July 25, 2014
    Publication date: January 22, 2015
    Inventor: Stephen A. Lobo
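The key-driven extraction step in the abstract above can be sketched as below. The abstract does not say what the mapping structure contains, so this sketch assumes each key maps to a regular expression with one capture group; `MAPPING`, `extract`, and the sample document text are all hypothetical. Storing the result in a database and rendering the customizable document are omitted.

```python
import re

# Assumed mapping structure: key -> regular expression with one capture group.
MAPPING = {
    "invoice_number": r"Invoice No\.\s*(\S+)",
    "total": r"Total:\s*([\d.]+)",
}


def extract(text: str, mapping: dict[str, str]) -> dict[str, str]:
    """Build a hash map of key -> extracted value for every key that matches."""
    values = {}
    for key, pattern in mapping.items():
        match = re.search(pattern, text)
        if match:
            values[key] = match.group(1)
    return values


document_text = "Invoice No. A-17\nTotal: 42.50"
record = extract(document_text, MAPPING)
```

Keys whose pattern finds nothing are simply left out of the hash map, which keeps missing fields distinguishable from empty ones.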
  • Publication number: 20150026178
    Abstract: A method for subject-matter analysis of tabular data is provided in the illustrative embodiments. A first document including the tabular data is received. A library of functional signatures for a first subject-matter domain is selected. A determination is made whether a threshold number of functional signatures from the selected library are applicable to the tabular data, wherein a functional signature is applicable to the tabular data when values in the tabular data correspond to an operation and a table structure specified in the functional signature. Responsive to the threshold number of functional signatures from the selected library being applicable to the tabular data, a processor and a memory process the first document according to a process for the first subject matter domain selected from a plurality of processes for respective subject matter domains.
    Type: Application
    Filed: November 26, 2013
    Publication date: January 22, 2015
    Applicant: International Business Machines Corporation
    Inventors: Donna Karen Byron, Scott N. Gerard, Alexander Pikovsky, Matthew B. Sanchez
  • Publication number: 20150026199
    Abstract: In an exemplary embodiment of this disclosure, a computer-implemented method includes receiving, at a hardware accelerator, a first instruction to project a first plurality of database rows, where each of the first plurality of database rows has one or more variable-length columns. The first plurality of database rows are projected, by a computer processor, to produce a first plurality of projected rows. This projection is performed at streaming rate.
    Type: Application
    Filed: August 20, 2013
    Publication date: January 22, 2015
    Applicant: International Business Machines Corporation
    Inventors: Sameh W. Assad, Hong Min, Bharat Sukhwani, Mathew S. Thoennes
  • Publication number: 20150026198
    Abstract: In an exemplary embodiment of this disclosure, a computer-implemented method includes receiving, at a hardware accelerator, a first instruction to project a first plurality of database rows, where each of the first plurality of database rows has one or more variable-length columns. The first plurality of database rows are projected, by a computer processor, to produce a first plurality of projected rows. This projection is performed at streaming rate.
    Type: Application
    Filed: July 19, 2013
    Publication date: January 22, 2015
    Applicant: International Business Machines Corporation
    Inventors: Sameh W. Assad, Hong Min, Bharat Sukhwani, Mathew S. Thoennes
  • Publication number: 20150026197
    Abstract: In an exemplary embodiment of this disclosure, a computer-implemented method includes determining that a database query warrants a first projection operation to project a plurality of input rows to a plurality of projected rows, where each of the plurality of input rows has one or more variable-length columns. A first projection control block is constructed, by a computer processor, to describe the first projection operation. The first projection operation is offloaded to a hardware accelerator. The first projection control block is provided to the hardware accelerator, and the first projection control block enables the hardware accelerator to perform the first projection operation at streaming rate.
    Type: Application
    Filed: July 19, 2013
    Publication date: January 22, 2015
    Inventors: Sameh W. Assad, Parijat Dube, Hong Min, Bharat Sukhwani, Mathew S. Thoennes
  • Patent number: 8938450
    Abstract: A system and a method for microcontent natural language processing are presented. The method comprises the steps of receiving a microcontent message from a social networking server, tokenizing the microcontent message into one or more text tokens, detecting the language of the microcontent message and selecting the property dictionary for part-of-speech tagging, part-of-speech tagging the microcontent message to identify related pronouns and nouns based on the selected dictionary, and extracting topics from the microcontent message and assigning confidence values to the topics.
    Type: Grant
    Filed: August 30, 2013
    Date of Patent: January 20, 2015
    Assignee: Bottlenose, Inc.
    Inventors: Nova Spivack, Dominiek ter Heide
  • Publication number: 20150019576
    Abstract: Generating a data parser for parsing an input stream of data objects includes: receiving information representative of a hierarchical data format defining a plurality of objects organized in a hierarchy, the objects including one or more schema objects representing data objects, and one or more container objects each associated with one or more schema objects; and processing the received information to form the data parser.
    Type: Application
    Filed: November 22, 2013
    Publication date: January 15, 2015
    Inventors: Mark E. Seneski, Alexander Shulman
  • Patent number: 8935265
    Abstract: A method, device and system for acquiring information related to annotations and the content of a document. Annotations are isolated from document content and are associated with portions of the content of the document. Annotations and content are used as a basis for a semantic search of a corpus of other documents. From the corpus, related information is extracted and presented or made available alongside the original content and annotations of the document. Each version of a document is stored and made accessible. Any of the versions of a document, with or without a current set of annotations, may be distributed to others for further review and annotation. Annotations are protected and associated with a level of privilege or rights. Annotations are trackable over time and location and are associated with a particular annotator.
    Type: Grant
    Filed: August 30, 2011
    Date of Patent: January 13, 2015
    Assignee: ABBYY Development LLC
    Inventor: Ding-Yuan Tang
  • Patent number: 8935266
    Abstract: An identity search algorithm for identifying a plurality of identity data, such as personal names and entity names, that might exist in a table or file containing a large number of identity data in an investigative environment. The algorithm is intended to identify persons and entities in the shortest time possible with overly inclusive results. When the core algorithm is used in an environment with a growing number of names in the table and is implemented with a web-based user interface, it can dramatically improve identity-searching efficiency and increase the chance of generating useful leads in typical discovery and investigation.
    Type: Grant
    Filed: February 4, 2013
    Date of Patent: January 13, 2015
    Inventor: Jianqing Wu
  • Publication number: 20150012550
    Abstract: Systems and methods of analyzing message data. An embodiment is a method of analyzing message data including a plurality of messages associated with one or more users. The method is performed using a computing system comprising a computer storage medium and a computer processor. The system parses each message of the plurality of messages to identify a plurality of message segments. The system assigns the message segments to the one or more users. The assignment is based at least in part on a determination of whether each message of the plurality of messages is a reply message. The segments of the message are assigned to a reply user if the message is determined to be a reply message. The system applies a statistical model to the assigned message segments, to determine predicted locations for the users. The system outputs the predicted locations for the users.
    Type: Application
    Filed: July 8, 2013
    Publication date: January 8, 2015
    Inventors: Veerasundaravel Thirugnanasundaram, Tong Sun, David R. Vandervort, Arun Bakthavachalu
  • Publication number: 20150012551
    Abstract: Methods, devices, and systems may be used for semantics publishing and discovery. In an embodiment, a method for publishing semantics related resource identifiers may include adding a key word to an identifier of a semantics related resource and publishing the identifier to at least one of a sibling node and a child node. In another embodiment, a method may include using a Bloom filter to publish a semantics related resource. In another embodiment, a method may include publishing, by a semantics node, an identifier of a semantics related resource to a sibling node, while publishing a digest of the semantics node to a child node.
    Type: Application
    Filed: July 2, 2014
    Publication date: January 8, 2015
    Inventors: Lijun Dong, Catalina M. Mladin, Dale N. Seed, Guang Lu
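    Illustrative sketch (not taken from the patent): the abstract mentions using a Bloom filter to publish a semantics related resource and publishing a digest of a semantics node to a child node. The Python sketch below shows only the general Bloom filter technique. The filter size, the number of hash functions, the SHA-256-based hashing, and the resource identifiers are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, some false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int serves as the bit array

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


# Hypothetical digest a node might publish; the identifiers are made up.
node_digest = BloomFilter()
for resource_id in ("semantics/class/temperature", "semantics/relation/hasUnit"):
    node_digest.add(resource_id)
```

    A receiver of such a digest can test `"semantics/class/temperature" in node_digest` without holding the full identifier list. A positive answer may be a false positive and would still need confirmation from the publishing node.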
  • Patent number: 8930379
    Abstract: A technique for managing distributed email content uses metadata to identify the location of email maintained on one or more data storage resources. When processing email content for display, the metadata is consulted and a display output is generated that presents a unified email folder view in which the email is represented. Email may be represented in the unified email folder view regardless of whether the email's data storage resource is accessible. Email content that is not currently accessible may be identified as being inaccessible. The email may be identified in the unified email folder view by specifying the data storage resource on which it is stored. The unified email folder view may also include at least one email folder representation that merges email content from more than one data storage resource.
    Type: Grant
    Filed: November 28, 2006
    Date of Patent: January 6, 2015
    Assignee: International Business Machines Corporation
    Inventors: Brent J. Baude, Jaroslaw Miszczyk, Kathryn Sintal, Gottfried Schimunek
  • Patent number: 8930380
    Abstract: Automatically generating a parser is disclosed. Raw data is received from a first remote device. A determination that the raw data does not, within a predefined confidence measure, conform to any rules included in a set of rules is made. A clustering function is performed on the raw data. At least one parser rule is generated based on the clustering.
    Type: Grant
    Filed: June 30, 2011
    Date of Patent: January 6, 2015
    Assignee: Sumo Logic
    Inventors: Kumar Saurabh, Christian Friedrich Beedgen, Bruno Kurtic
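    Illustrative sketch (not taken from the patent): the abstract describes clustering raw data that matches no existing rule and generating a parser rule from the clusters. The abstract does not specify the clustering function or the confidence measure, so the sketch below substitutes a simple stand-in: lines are grouped by a token template in which digit-bearing tokens become wildcards, and each sufficiently large group yields one regular expression. The function names, the `min_cluster_size` parameter, and the sample log lines are hypothetical.

```python
import re
from collections import defaultdict

WILDCARD = "<*>"


def mask(line: str) -> tuple:
    # Tokens containing a digit are treated as variable fields.
    return tuple(
        WILDCARD if any(ch.isdigit() for ch in tok) else tok
        for tok in line.split()
    )


def generate_rules(lines, known_rules, min_cluster_size=2):
    """Return compiled regexes for clusters of lines no known rule matches."""
    unmatched = [ln for ln in lines if not any(r.fullmatch(ln) for r in known_rules)]
    clusters = defaultdict(list)
    for ln in unmatched:
        clusters[mask(ln)].append(ln)

    rules = []
    for template, members in clusters.items():
        if len(members) < min_cluster_size:
            continue  # too few examples to justify a rule
        pattern = r"\s+".join(
            r"(\S+)" if tok == WILDCARD else re.escape(tok) for tok in template
        )
        rules.append(re.compile(pattern))
    return rules


sample_lines = [
    "disk sda1 usage 91%",
    "disk sdb2 usage 47%",
    "user alice logged in",
    "GET /index.html 200",
]
existing = [re.compile(r"GET \S+ \d{3}")]
new_rules = generate_rules(sample_lines, existing)
```

    With these sample lines, the two `disk ... usage ...` entries form one cluster and produce a single rule whose capture groups are the variable fields. The lone `user alice logged in` line is left without a rule because its cluster is below the size threshold.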
  • Publication number: 20150006555
    Abstract: A message publishing and subscribing method and apparatus, which relate to the information processing field and provide higher information transmission efficiency and better flexibility mainly by providing a corresponding dummy topic in a message broker for a publisher and a subscriber or by performing semantic recognition in the message broker for the publisher and the subscriber.
    Type: Application
    Filed: September 18, 2014
    Publication date: January 1, 2015
    Inventors: Yuan Fang, Guanjun Tang, Yunpeng Wang
  • Publication number: 20150006554
    Abstract: An approach is provided for an information handling system that includes a processor and a memory to analyze documents. In the approach, an electronic document is received with the document including content, such as text, and revision metadata that is associated with the content. The revision metadata is analyzed and the approach identifies a confidence level based on the analysis. The confidence level is associated with the electronic document content. The confidence level can then be utilized by a Question and Answer (QA) system.
    Type: Application
    Filed: June 27, 2013
    Publication date: January 1, 2015
    Inventors: Paul R. Bastide, Matthew E. Broomhall, Robert E. Loredo, Fang Lu
  • Publication number: 20150006468
    Abstract: A method and apparatus for parallelization of data processing. The method including: parsing a data processing flow to split a write table sequence for the data processing flow; generating a plurality of instances of the data processing flow based at least in part on the split write table sequence; and scheduling the plurality of instances for parallelization of data processing.
    Type: Application
    Filed: June 9, 2014
    Publication date: January 1, 2015
    Inventors: Ning Duan, Wei Huang, Peng Ji, Yi Qi, Qi Zhang, Jun Zhu
  • Publication number: 20140372460
    Abstract: A method of extracting unclassified data from a collection of data including both classified data and unclassified data includes: providing a plain text format file including a plurality of attributes; using the attributes to identify unclassified data within a collection of data that includes a combination of unclassified and classified data; and extracting the identified unclassified data from the collection of data. An apparatus that implements the method is also provided.
    Type: Application
    Filed: June 12, 2014
    Publication date: December 18, 2014
    Inventors: William Joy, Armen Djougarian
  • Patent number: 8914413
    Abstract: A processor-implemented method, system, and/or computer program product defines multiple context-based data gravity wells on a context-based data gravity wells membrane. Non-contextual data objects are associated with context objects to define synthetic context-based objects. The synthetic context-based objects are parsed into an n-tuple that includes a pointer to one of the non-contextual data objects, a probability that a non-contextual data object has been associated with a correct context object, and a weighting factor of importance of the synthetic context-based object. A virtual mass of each parsed synthetic context-based object is calculated, in order to define a shape of multiple context-based data gravity wells that are created when synthetic context-based objects are pulled into each of the context-based data gravity well frameworks on a context-based data gravity wells membrane.
    Type: Grant
    Filed: January 2, 2013
    Date of Patent: December 16, 2014
    Assignee: International Business Machines Corporation
    Inventors: Samuel S. Adams, Robert R. Friedlander, James R. Kraemer, Jeb R. Linton
  • Patent number: 8914388
    Abstract: An embodiment of the invention includes a method for centralized URL commenting, wherein user-generated comment data is extracted from web pages on a plurality of web sites. Access control parameters are also obtained from the web sites. The comment data is tagged with identifiers indicating the web sites that the comment data was extracted from, URLs indicating the web pages that the comment data is on, and authors of the comment data. The comment data is stored in a repository. Keywords are extracted from the comment data, and the keywords are normalized. The normalizing of the keywords includes creating a single normalized keyword for multiple keywords related to the same topic, and tagging comment data that includes at least one of the multiple keywords with the normalized keyword. Read access and/or write access to the repository is controlled based on the access control parameters.
    Type: Grant
    Filed: February 18, 2011
    Date of Patent: December 16, 2014
    Assignee: International Business Machines Corporation
    Inventors: Thomas J. Burris, Brian D. Goodman
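    Illustrative sketch (not taken from the patent): the keyword normalization step in the abstract maps multiple keywords on the same topic to one normalized keyword and tags matching comments with it. The sketch below assumes a hand-written synonym table and simple word matching. The table contents and the `normalized_tags` function are hypothetical, and how the patent derives related keywords is not stated in the abstract.

```python
import re

# Hypothetical mapping from raw keyword to its single normalized keyword.
SYNONYMS = {
    "nyc": "new york city",
    "gotham": "new york city",
    "py": "python",
    "python3": "python",
}


def normalized_tags(comment_text: str, synonyms=SYNONYMS) -> set:
    """Return the normalized keywords whose raw variants occur in the comment."""
    words = set(re.findall(r"[a-z0-9']+", comment_text.lower()))
    return {norm for raw, norm in synonyms.items() if raw in words}


tags = normalized_tags("Flying to NYC tomorrow, packing my py scripts")
```

    Here `tags` contains both `"new york city"` and `"python"`, so comments that use different surface keywords for the same topic end up with the same tag.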
  • Patent number: 8909657
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for transferring electronic data. In general, one aspect of the subject matter described in this specification can be embodied in methods that include the actions of identifying a data item to be chunked; determining the type of the data item; determining whether the type of the data item is one of a specified one or more types; if it is determined that the type of the data item is not one of the specified one or more types, performing a first chunking of the data item; and if it is determined that the type of the data item is one of the specified one or more types, performing a second chunking of the data item that is based on the particular content portions of the data item.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: December 9, 2014
    Assignee: Apple Inc.
    Inventors: James L. Mensch, Cameron Stuart Birse, Ronnie G. Misra, Eric Olaf Carlson, Dominic B. Giampaolo
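    Illustrative sketch (not taken from the patent): the abstract describes choosing between two chunking strategies depending on whether a data item's type is one of a specified set. The sketch below assumes fixed-size blocks as the first chunking and splitting at blank-line boundaries as the content-based second chunking. The type name `"text"`, the block size, and the separator are assumptions chosen for illustration.

```python
def fixed_chunks(data: bytes, block_size: int) -> list:
    # First chunking: ignore content and cut every block_size bytes.
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def content_chunks(data: bytes, sep: bytes = b"\n\n") -> list:
    # Second chunking: cut after each separator so chunks follow content portions.
    chunks, start = [], 0
    while True:
        idx = data.find(sep, start)
        if idx == -1:
            break
        end = idx + len(sep)
        chunks.append(data[start:end])
        start = end
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def chunk(data: bytes, item_type: str,
          structured_types=frozenset({"text"}), block_size: int = 4) -> list:
    """Dispatch on the item's type, as the abstract outlines."""
    if item_type not in structured_types:
        return fixed_chunks(data, block_size)
    return content_chunks(data)


binary_chunks = chunk(b"abcdefghij", "binary")
text_chunks = chunk(b"para one\n\npara two", "text")
```

    In both branches the chunks concatenate back to the original bytes. The content-based branch keeps each separator attached to the preceding chunk, so an edit inside one paragraph changes only that paragraph's chunk.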
  • Patent number: 8903836
    Abstract: A system and method are disclosed which enable network administrators and the like to quickly analyze the data produced by log-producing devices such as network firewalls and routers. Unlike systems of the prior art, the system disclosed herein automatically parses and summarizes log data before inserting it into one or more databases. This greatly reduces the volume of data stored in the database and permits database queries to be run and reports generated while many types of attempted breaches of network security are still in progress. Database maintenance may also be accomplished automatically by the system to delete or archive old log data.
    Type: Grant
    Filed: July 30, 2012
    Date of Patent: December 2, 2014
    Assignee: TIBCO Software Inc.
    Inventors: Jason Michael DeStefano, Thomas Hunt Schabo Grabowski
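    Illustrative sketch (not taken from the patent): the abstract describes parsing and summarizing log data before it is inserted into a database. The sketch below assumes a made-up firewall log format, aggregates hit counts per source address and action, and stores only the summary rows in an in-memory SQLite database. The log format, the table schema, and the function names are hypothetical.

```python
import re
import sqlite3
from collections import Counter

# Assumed log format, e.g. "2014-12-02T10:00:01 DENY src=10.0.0.5 dst=192.168.1.1"
LINE = re.compile(r"(?P<action>ALLOW|DENY) src=(?P<src>\S+) dst=(?P<dst>\S+)")


def summarize(lines) -> Counter:
    """Count log entries per (source address, action) pair."""
    counts = Counter()
    for line in lines:
        m = LINE.search(line)
        if m:  # lines that do not parse are skipped in this sketch
            counts[(m["src"], m["action"])] += 1
    return counts


def store(counts: Counter, conn: sqlite3.Connection) -> None:
    """Insert one row per summarized pair instead of one row per log line."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS summary (src TEXT, action TEXT, hits INTEGER)"
    )
    conn.executemany(
        "INSERT INTO summary VALUES (?, ?, ?)",
        [(src, action, hits) for (src, action), hits in counts.items()],
    )
    conn.commit()


raw_log = [
    "2014-12-02T10:00:01 DENY src=10.0.0.5 dst=192.168.1.1",
    "2014-12-02T10:00:02 DENY src=10.0.0.5 dst=192.168.1.2",
    "2014-12-02T10:00:03 DENY src=10.0.0.5 dst=192.168.1.3",
    "2014-12-02T10:00:04 ALLOW src=10.0.0.7 dst=192.168.1.1",
]
conn = sqlite3.connect(":memory:")
store(summarize(raw_log), conn)
```

    Four raw lines become two stored rows here, which is the volume reduction the abstract refers to. A query such as `SELECT src, hits FROM summary WHERE action = 'DENY'` then runs against the summary rather than the raw log.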
  • Publication number: 20140351275
    Abstract: An electronic document is parsed against a plurality of phrases. Each of the plurality of phrases indicates a text effect. It is determined that the electronic document includes a phrase at least similar to a first phrase of the plurality of phrases. A first contributor of the electronic document that is associated with the phrase is determined. A first text effect indicated by the phrase is determined. A mapping is created between the first contributor and the first text effect indicated by the phrase. The mapping is supplied for presenting of the electronic document.
    Type: Application
    Filed: May 21, 2013
    Publication date: November 27, 2014
    Applicant: International Business Machines Corporation
    Inventors: Bernadette A. Carter, Kathryn Lemanski Mercer, Cesar A. Wong
  • Patent number: 8898177
    Abstract: A plurality of segments in an e-mail collection is generated by parsing content of e-mails. A corresponding segment signature for each segment is created, and a signature index is populated using the generated segment signatures. After receiving a query e-mail, a plurality of query segments in the query e-mail is generated using content of the query e-mail, and a corresponding query segment signature for each query segment is generated. A query root segment is identified and a corresponding query root segment signature is generated. A set of root segment signatures of the signature index is identified and the query root segment signature is compared with each root segment signature from the signature index. A subset of the signature index is identified, using a match between the root segment signature and the query root segment signature. An e-mail thread hierarchy is built using the identified subset of the signature index.
    Type: Grant
    Filed: September 10, 2010
    Date of Patent: November 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: Danish Contractor, Manjula Golla Hosurmath, Sachindra Joshi, Kenney Ng
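    Illustrative sketch (not taken from the patent): the abstract describes segmenting e-mails, computing a signature per segment, and matching a query e-mail's root segment signature against an index. The sketch below makes several assumptions that the abstract does not state: segments are runs of lines at the same `>` quote depth, the root segment is the most deeply quoted one, and a signature is a SHA-1 hash of the whitespace- and case-normalized text. All function names and sample messages are hypothetical.

```python
import hashlib
from collections import defaultdict
from itertools import groupby


def split_quote(line: str) -> tuple:
    """Return (quote depth, text) for a line such as '>> original text'."""
    text, depth = line.lstrip(), 0
    while text.startswith(">"):
        depth += 1
        text = text[1:].lstrip()
    return depth, text.strip()


def segments(body: str) -> list:
    """Group consecutive non-blank lines that share a quote depth."""
    parsed = [split_quote(ln) for ln in body.splitlines()]
    parsed = [(d, t) for d, t in parsed if t]
    return [
        (d, " ".join(t for _, t in grp))
        for d, grp in groupby(parsed, key=lambda p: p[0])
    ]


def signature(text: str) -> str:
    normalized = " ".join(text.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def root_signature(body: str) -> str:
    # Assumption: the deepest quoted segment is the thread's original message.
    # A body with no non-blank lines is not handled in this sketch.
    _, text = max(segments(body), key=lambda s: s[0])
    return signature(text)


def build_index(emails: dict) -> dict:
    index = defaultdict(set)
    for email_id, body in emails.items():
        index[root_signature(body)].add(email_id)
    return index


emails = {
    "m1": "Where is the report?",
    "m2": "It is attached.\n> Where is the report?",
    "m3": "Thanks!\n> It is attached.\n>> Where is the report?",
    "m4": "Lunch today?",
}
index = build_index(emails)
query = "Any update?\n> Where is the report?"
thread_members = index.get(root_signature(query), set())
```

    The lookup narrows the collection to the e-mails sharing the query's root segment. Building the thread hierarchy from that subset, for example by comparing the remaining segment signatures, is left out of this sketch.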