Patents Assigned to Ab Initio Technology LLC
  • Patent number: 9576028
    Abstract: In one aspect, in general, a method of generating a dataflow graph representing a database query includes receiving a query plan from a plan generator, the query plan representing operations for executing a database query on at least one input representing a source of data, producing a dataflow graph from the query plan, wherein the dataflow graph includes at least one node that represents at least one operation represented by the query plan, and includes at least one link that represents at least one dataflow associated with the query plan, and altering one or more components of the dataflow graph based on at least one characteristic of the at least one input representing the source of data.
    Type: Grant
    Filed: February 23, 2015
    Date of Patent: February 21, 2017
    Assignee: Ab Initio Technology LLC
    Inventors: Ian Schechter, Glenn John Allin
  • Patent number: 9569189
    Abstract: An approach to automatically specifying, or assisting with the specification of, a parallel computation graph involves determining data processing characteristics of the linking elements that couple data processing elements of the graph. The characteristics of the linking elements are determined according to the characteristics of the upstream and/or downstream data processing elements associated with the linking element, for example, to enable computation by the parallel computation graph that is equivalent to computation of an associated serial graph.
    Type: Grant
    Filed: November 14, 2011
    Date of Patent: February 14, 2017
    Assignee: Ab Initio Technology LLC
    Inventor: Craig W. Stanfill
  • Patent number: 9569528
    Abstract: Among other aspects disclosed are a method and system for detecting confidential information. The method includes reading stored data and identifying strings within the stored data, where each string includes a sequence of consecutive bytes which all have values that are in a predetermined subset of possible values. For each of at least some of the strings, determining if the string includes bytes representing one or more format matches, wherein a format match includes a set of values that match a predetermined format associated with confidential information. For each format match, testing the values that match the predetermined format with a set of rules associated with the confidential information to determine whether the format match is an invalid format match that includes one or more invalid values and calculating a score for the stored data, based at least in part upon the ratio of a count of invalid format matches to a count of other format matches.
    Type: Grant
    Filed: October 3, 2008
    Date of Patent: February 14, 2017
    Assignee: Ab Initio Technology LLC
    Inventor: David Fournier
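
The scoring idea in the abstract for patent 9569528 can be illustrated with a minimal sketch. It is not the patented implementation. The Social Security number format, the validity rules, the printable-ASCII byte subset, and the final scoring formula are all assumptions chosen for illustration, and the names `score_data` and `is_valid_ssn` are hypothetical.

```python
import re

# Assumed format associated with confidential data: ddd-dd-dddd.
SSN_FORMAT = re.compile(rb"(?<!\d)(\d{3})-(\d{2})-(\d{4})(?!\d)")
# Assumed "predetermined subset" of byte values: printable ASCII, runs of 4+.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def is_valid_ssn(area: bytes, group: bytes, serial: bytes) -> bool:
    """Illustrative rules only: reject all-zero parts and areas 666 or 9xx."""
    if area == b"000" or group == b"00" or serial == b"0000":
        return False
    if area == b"666" or area.startswith(b"9"):
        return False
    return True


def score_data(data: bytes) -> float:
    """Return a score in [0, 1]; higher suggests confidential data is present."""
    valid = invalid = 0
    for run in PRINTABLE_RUN.finditer(data):
        for match in SSN_FORMAT.finditer(run.group()):
            if is_valid_ssn(*match.groups()):
                valid += 1
            else:
                invalid += 1
    if valid == 0:
        return 0.0
    # Assumed scoring: a higher ratio of invalid to other matches lowers the score.
    ratio = invalid / valid
    return 1.0 / (1.0 + ratio)
```

Many invalid matches relative to valid ones suggest that the format matches are coincidental (for example, random digit runs), which is why the ratio pulls the score down in this sketch.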
  • Patent number: 9563411
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for flow analysis. In one aspect, a method includes modifying a dataflow graph, the dataflow graph including a plurality of paths connecting at least one entry point and at least one exit point, including adding components to the dataflow graph that add flow units to data records and remove flow units from data records, each flow unit identifying a segment of a path traversed by the data record. The method also includes identifying execution paths based on flow units obtained by processing a plurality of data records using the modified dataflow graph. The method also includes determining a subset of the plurality of data records, wherein a selected set of execution paths are represented by the subset.
    Type: Grant
    Filed: January 5, 2012
    Date of Patent: February 7, 2017
    Assignee: Ab Initio Technology LLC
    Inventor: Andrew F. Roberts
  • Patent number: 9563721
    Abstract: In one aspect, in general, a method is described for managing an archive for determining approximate matches associated with strings occurring in records. The method includes: processing records to determine a set of string representations that correspond to strings occurring in the records; generating, for each of at least some of the string representations in the set, a plurality of close representations that are each generated from at least some of the same characters in the string; and storing entries in the archive that each represent a potential approximate match between at least two strings based on their respective close representations.
    Type: Grant
    Filed: July 7, 2014
    Date of Patent: February 7, 2017
    Assignee: Ab Initio Technology LLC
    Inventor: Arlen Anderson
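
A minimal sketch of the archive described in the abstract for patent 9563721 follows. It assumes that a "close representation" is the string itself or the string with one character deleted. The abstract does not specify this, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def close_representations(s: str) -> set:
    """Assumed scheme: the string itself plus every single-deletion variant."""
    return {s} | {s[:i] + s[i + 1:] for i in range(len(s))}


def build_archive(strings):
    """Map each pair of distinct strings that share a close representation
    to the set of representations they share."""
    by_variant = defaultdict(set)
    for s in set(strings):
        for variant in close_representations(s):
            by_variant[variant].add(s)

    archive = defaultdict(set)
    for variant, members in by_variant.items():
        # Any two strings reaching the same variant are a potential match.
        for a, b in combinations(sorted(members), 2):
            archive[(a, b)].add(variant)
    return dict(archive)
```

Under this assumption, "smith" and "smyth" are paired because both reduce to "smth", so a later lookup can find candidate matches without comparing every pair of strings directly.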
  • Patent number: 9547638
    Abstract: At least one rule specification is received for a graph-based computation having data processing components connected by linking elements representing data flows. The rule specification defines rules that are each associated with one or more rule cases that specify criteria for determining one or more output values that depend on input data. A transform is generated for at least one data processing component in the graph-based computation based on the received rule specification, including providing an interface for configuring characteristics of a log associated with the generated transform. At least one data flow is transformed using the generated transform, including: tracing execution of the data processing components in the graph-based computation at run time, generating log information based on the traced execution according to the configured log characteristics, and storing or outputting the generated log information.
    Type: Grant
    Filed: June 30, 2009
    Date of Patent: January 17, 2017
    Assignee: Ab Initio Technology LLC
    Inventors: Scott Studer, Joel Gould, David Phillimore
  • Patent number: 9507682
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for dynamic graph performance monitoring. One of the methods includes receiving multiple units of work that each include one or more work elements. The method includes determining a characteristic of the first unit of work. The method includes identifying, by a component of the first dataflow graph, a second dataflow graph from multiple available dataflow graphs based on the determined characteristic, the multiple available dataflow graphs being stored in a data storage system. The method includes processing the first unit of work using the second dataflow graph. The method includes determining one or more performance metrics associated with the processing.
    Type: Grant
    Filed: November 16, 2012
    Date of Patent: November 29, 2016
    Assignee: Ab Initio Technology LLC
    Inventors: Mark Buxbaum, Michael G. Mulligan, Tim Wakeling, Matthew Darcy Atterbury
  • Patent number: 9477786
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for metadata management. One of the methods includes receiving user input selecting a first node. The method includes receiving a first data lineage of a first object, the first object having a type, the first data lineage describing relationships between the first object and one or more datasets or transforms. The method includes receiving user input selecting a second node. The method includes receiving a second data lineage of a second object, the second object having the same type as the first object. The method includes performing a comparison of the first node and the first data lineage to the second node and the second data lineage. The method includes generating a report based on the comparison.
    Type: Grant
    Filed: March 13, 2014
    Date of Patent: October 25, 2016
    Assignee: Ab Initio Technology LLC
    Inventors: Gregg Yost, Dusan Radivojevic
  • Patent number: 9449057
    Abstract: A data storage system stores at least one dataset including a plurality of records. A data processing system, coupled to the data storage system, processes the plurality of records to produce codes representing data patterns in the records, the processing including: for each of multiple records in the plurality of records, associating with the record a code encoding one or more elements, wherein each element represents a state or property of a corresponding field or combination of fields as one of a set of element values, and, for at least one element of at least a first code, the number of element values in the set is smaller than the total number of data values that occur in the corresponding field or combination of fields over all of the plurality of records in the dataset.
    Type: Grant
    Filed: January 27, 2012
    Date of Patent: September 20, 2016
    Assignee: Ab Initio Technology LLC
    Inventor: Arlen Anderson
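
One way to picture the codes described in the abstract for patent 9449057 is sketched below. It assumes each element records one of three field states (null, blank, populated) and that the elements are packed into a base-3 integer. Both choices, and the names `field_state` and `pattern_code`, are illustrative assumptions rather than details from the patent.

```python
def field_state(value) -> int:
    """Assumed element values: 0 = null, 1 = blank string, 2 = populated."""
    if value is None:
        return 0
    if isinstance(value, str) and value.strip() == "":
        return 1
    return 2


def pattern_code(record: dict, fields: list) -> int:
    """Pack one element per field into a single base-3 integer code."""
    code = 0
    for name in fields:
        # A missing key is treated the same as a null value.
        code = code * 3 + field_state(record.get(name))
    return code
```

Each element takes only three values however many distinct data values the field holds, so records can be grouped by code to show which population patterns occur in the dataset.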
  • Publication number: 20160248643
    Abstract: A method for supporting communication between a client and a server includes receiving a first message from a client. The method also includes creating an object in response to the first message. The method also includes sending a response to the first message to the client. The method also includes receiving changes to the object from a server. The method also includes storing the changes to the object. The method also includes receiving a second message from the client. The method also includes sending the stored changes to the client with a response to the second message.
    Type: Application
    Filed: March 11, 2016
    Publication date: August 25, 2016
    Applicant: Ab Initio Technology LLC
    Inventors: Jennifer M. Farver, Joshua Goldshlag, David W. Parmenter, Ian Robert Schechter, Tim Wakeling
  • Patent number: 9418095
    Abstract: Managing changes to a collection of records includes storing a first set of records in a data storage system, the first set of records representing a first version of the collection of records, and validating a proposed change to the collection of records specified by an input received over a user interface. The data storage system is queried based on validation criteria associated with the proposed change, and a first result is received in response to the querying. A second set of records is processed representing changes not yet applied to the collection of records to generate a second result. The first result is updated based on the second result to generate a third result. The third result is processed to determine whether the proposed change is valid according to the validation criteria.
    Type: Grant
    Filed: January 13, 2012
    Date of Patent: August 16, 2016
    Assignee: Ab Initio Technology LLC
    Inventors: Joel Gould, Timothy Perkins, Adam Weiss
  • Patent number: 9413542
    Abstract: Managing data units broadcast from a data feed, without requiring re-transmission by a source of the data feed, includes: at a first node in a network, receiving at least a portion of a data feed including a plurality of data units; at a second node in the network, receiving at least a portion of the data feed; identifying an interruption in receiving the data feed at the first node; determining an extent of a data lacuna extending between a last data unit received by the first node prior to the interruption and a first data unit received by the first node after the interruption; and sending a request from the first node for results saved by the second node, the results saved by the second node corresponding to the data lacuna.
    Type: Grant
    Filed: August 7, 2014
    Date of Patent: August 9, 2016
    Assignee: Ab Initio Technology LLC
    Inventor: Craig W. Stanfill
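
The recovery flow in the abstract for patent 9413542 can be sketched as follows. The sketch assumes that data units carry consecutive integer sequence numbers, which the abstract does not state. `SecondNode`, `find_lacuna`, and the upper-casing used as a stand-in for processing are hypothetical.

```python
def find_lacuna(last_before: int, first_after: int) -> range:
    """Sequence numbers missed between the last unit received before an
    interruption and the first unit received after it (empty if none)."""
    if first_after - last_before <= 1:
        return range(0)
    return range(last_before + 1, first_after)


class SecondNode:
    """A peer that receives the same feed and saves its results."""

    def __init__(self):
        self._results = {}

    def receive(self, seq: int, payload: str) -> None:
        # Stand-in for real processing of the data unit.
        self._results[seq] = payload.upper()

    def results_for(self, seqs) -> dict:
        """Return saved results for the requested sequence numbers."""
        return {s: self._results[s] for s in seqs if s in self._results}
```

In this sketch, a first node that saw unit 1 and then unit 4 would request `results_for(find_lacuna(1, 4))` from the peer, so the feed source is not asked to retransmit.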
  • Patent number: 9411712
    Abstract: Generating test data includes: reading values occurring in at least one field of multiple records from a data source; storing profile information including statistics characterizing the values; generating a model of a probability distribution for the field based on the statistics; generating multiple test data values using the generated model such that a frequency at which a given value occurs in the test data values corresponds to a probability assigned to that given value by the model; and storing a collection of test data including the test data values in a data storage system.
    Type: Grant
    Filed: June 9, 2010
    Date of Patent: August 9, 2016
    Assignee: Ab Initio Technology LLC
    Inventor: Carl Richard Feynman
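
The steps in the abstract for patent 9411712 (profile, model, generate) can be sketched in a few lines. The sketch assumes the simplest possible model, an empirical distribution over observed values. The patent may cover richer models, and the function names are hypothetical.

```python
import random
from collections import Counter


def build_profile(records, field):
    """Profile information: how often each value occurs in the field."""
    return Counter(record[field] for record in records)


def build_model(profile):
    """Assumed model: each value's probability is its relative frequency."""
    total = sum(profile.values())
    return {value: count / total for value, count in profile.items()}


def generate_test_values(model, n, seed=None):
    """Draw n values so that frequencies track the model's probabilities."""
    rng = random.Random(seed)
    values = list(model)
    weights = [model[v] for v in values]
    return rng.choices(values, weights=weights, k=n)
```

Because values are sampled from the model and not copied record by record, the generated collection keeps the statistical shape of the source field without reproducing the source records.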
  • Patent number: 9411531
    Abstract: Processing a plurality of data units to generate result information, includes: performing a data operation for each data unit of a first subset of data units from the plurality of data units, and storing information associated with a result of the data operation in a first set of one or more data structures stored in working memory space of a memory device; after an overflow condition on the working memory space is satisfied, storing information in overflow storage space of a storage device; and repeating an overflow processing procedure multiple times during the processing of the plurality of data units, the overflow processing procedure including: updating a new set of one or more data structures stored in the working memory space using at least some information stored in the overflow storage space.
    Type: Grant
    Filed: December 9, 2015
    Date of Patent: August 9, 2016
    Assignee: Ab Initio Technology LLC
    Inventors: Muhammad Arshad Khan, Stephen G. Rybicki, Joel Gould
  • Publication number: 20160188442
    Abstract: Among other things, a method includes, at a computer system on which one or more computer programs are executing, receiving a specification defining types of state information, receiving an indication that an event associated with at least one of the computer programs has occurred, the event associated with execution of a function of the computer program, collecting state information describing the state of the execution of the computer program when the event occurred, generating an entry corresponding to the event, the entry including elements of the collected state information, the elements of state information formatted according to the specification, and storing the entry. The log can be parsed to generate a visualization of computer program execution.
    Type: Application
    Filed: March 9, 2016
    Publication date: June 30, 2016
    Applicant: Ab Initio Technology LLC
    Inventors: Joseph Stuart Wood, Robert Freundlich
  • Patent number: 9361355
    Abstract: Received data records, each including one or more values in one or more fields, are processed to identify a matched data cluster. The processing includes: for selected data records, generating a query from one or more values; identifying one or more candidate data records from the received data records using the query; determining whether or not the selected data record satisfies a cluster membership criterion for at least one candidate data cluster of one or more existing data clusters containing the candidate records; and selecting the matched data cluster from among one or more candidate data clusters based at least in part on a growth criterion for the candidate data clusters, or initializing the matched data cluster with the selected data record if the selected data record does not satisfy a cluster membership criterion for any of the existing data clusters or based on a result of the growth criterion.
    Type: Grant
    Filed: November 15, 2012
    Date of Patent: June 7, 2016
    Assignee: Ab Initio Technology LLC
    Inventors: Arlen Anderson, Kamil Trojan
  • Patent number: 9354981
    Abstract: A memory module stores working data that includes data units. A storage system stores recovery data that includes sets of one or more data units. Transferring data units between the memory module and the storage system includes: maintaining an order among the data units included in the working data, the order defining a first contiguous portion and a second contiguous portion; and, for each of multiple time intervals, identifying any data units accessed from the working data during the time interval, and adding to the recovery data a set of two or more data units including: one or more data units from the first contiguous portion including any accessed data units, and one or more data units from the second contiguous portion including at least one data unit that has been previously added to the recovery data.
    Type: Grant
    Filed: September 26, 2014
    Date of Patent: May 31, 2016
    Assignee: Ab Initio Technology LLC
    Inventor: Joseph Skeffington Wholey, III
  • Publication number: 20160125197
    Abstract: A method includes automatically determining a component of a security label for each first record in a first table of a database having multiple tables, including: identifying a second record related to the first record according to a foreign key relationship; identifying a component of the security label for the second record; and assigning a value for the component of the security label for the first record based on the identified component of the security label for the second record. The method includes storing the determined security label in the record.
    Type: Application
    Filed: November 5, 2015
    Publication date: May 5, 2016
    Applicant: Ab Initio Technology LLC
    Inventor: Christopher J. Winters
  • Patent number: 9323824
    Abstract: An interface is provided on a computing device for interacting with data stored in a data repository. Input is received including information identifying two or more attributes, and information indicating an order for the identified attributes. A hierarchical data structure is stored, with an order of hierarchy levels corresponding to the indicated order. Multiple attribute values for the attributes are determined. The method includes assigning to each node of a first level at least one of the attribute values of a first attribute, and assigning to each node of a second level at least one of the attribute values of a second attribute, each of the nodes of the second level also being assigned respective ones of the attribute values assigned to one or more nodes of preceding levels. The interface is displayed including displaying interface elements associated with each of the nodes.
    Type: Grant
    Filed: September 12, 2011
    Date of Patent: April 26, 2016
    Assignee: Ab Initio Technology LLC
    Inventor: Joyce L. Vigneau
  • Patent number: 9323749
    Abstract: Profiling data includes processing an accessed collection of records, including: generating, for a first set of distinct values appearing in a first set of one or more fields, corresponding location information; generating, for the first set of fields, a corresponding list of entries identifying a distinct value from the first set of distinct values and the location information for the distinct value; generating, for a second set of one or more fields, a corresponding list of entries, with each entry identifying a distinct value from a second set of distinct values appearing in the second set of fields; and generating result information, based at least in part on: locating at least one record of the collection using the location information for at least one value appearing in the first set of fields, and determining at least one value appearing in the second set of fields of the located record.
    Type: Grant
    Filed: October 22, 2013
    Date of Patent: April 26, 2016
    Assignee: Ab Initio Technology LLC
    Inventor: Arlen Anderson
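
The role of location information in the abstract for patent 9323749 can be illustrated with a small sketch. It assumes that location information is a list of record indices for each distinct value. The function names and the zip/city fields are hypothetical.

```python
from collections import defaultdict


def census_with_locations(records, field):
    """Map each distinct value of `field` to the indices of records holding it."""
    locations = defaultdict(list)
    for index, record in enumerate(records):
        locations[record[field]].append(index)
    return dict(locations)


def co_occurring_values(records, locations, value, other_field):
    """Use the stored locations to collect the values of `other_field`
    that appear in records where the first field equals `value`."""
    return {records[i][other_field] for i in locations.get(value, [])}
```

With records keyed by a zip field, for example, `co_occurring_values` shows at a glance whether one zip value maps to several city spellings, without rescanning the whole collection.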