Patents Assigned to Ab Initio Technology LLC
  • Patent number: 11741091
    Abstract: Among other things, we describe a method of receiving a portion of metadata from a data source, the portion of metadata describing nodes and edges; generating instances of a data structure representing the portion of metadata, at least one instance of the data structure including an identification value that identifies a corresponding node, one or more property values representing respective properties of the corresponding node, and one or more pointers to respective identification values, each pointer representing an edge associated with a node identified by the corresponding respective identification value; storing the instances of the data structure in random access memory; receiving a query that includes an identification of at least one particular element of data; and using at least one instance of the data structure to cause a display of a computer system to display a representation of lineage of the particular element of data.
    Type: Grant
    Filed: December 1, 2017
    Date of Patent: August 29, 2023
    Assignee: Ab Initio Technology LLC
    Inventors: David Clemens, Dusan Radivojevic, Neil Galarneau
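The data structure in this abstract (an identification value, property values, and pointers that stand for edges) can be pictured with a small in-memory sketch. This is only an illustration under stated assumptions, not the patented implementation: the names `Node`, `upstream`, and `lineage`, and the sample identifiers, are hypothetical, and lineage is taken here to mean every node reachable by following the pointers.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str                                   # identification value of the node
    properties: dict = field(default_factory=dict)  # property values of the node
    upstream: list = field(default_factory=list)    # pointers: ids of nodes this one derives from


def lineage(nodes, start_id):
    """Return the ids of all nodes reachable upstream of start_id, in discovery order."""
    seen, stack, order = {start_id}, [start_id], []
    while stack:
        current = stack.pop()
        for parent_id in nodes[current].upstream:
            if parent_id not in seen:
                seen.add(parent_id)
                order.append(parent_id)
                stack.append(parent_id)
    return order


# Hypothetical metadata, held entirely in memory as a dict keyed by id.
nodes = {
    "report.total": Node("report.total", {"type": "field"}, ["orders.amount", "fx.rate"]),
    "orders.amount": Node("orders.amount", {"type": "field"}, ["raw.orders"]),
    "fx.rate": Node("fx.rate", {"type": "field"}),
    "raw.orders": Node("raw.orders", {"type": "dataset"}),
}

result = lineage(nodes, "report.total")
print(result)
```

Keying the instances by identification value makes each pointer a single dictionary lookup, which is one plausible reason to keep such a structure in random access memory.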
  • Patent number: 11734264
    Abstract: A method includes accessing a schema that specifies relationships among datasets, computations on the datasets, or transformations of the datasets, selecting a dataset from among the datasets, and identifying, from the schema, other datasets that are related to the selected dataset. Attributes of the datasets are identified, and logical data representing the identified attributes and relationships among the attributes is generated. The logical data is provided to a development environment, which provides access to portions of the logical data representing the identified attributes. A specification that specifies at least one of the identified attributes in performing an operation is received from the development environment.
    Type: Grant
    Filed: December 21, 2021
    Date of Patent: August 22, 2023
    Assignee: Ab Initio Technology LLC
    Inventors: Jonah Egenolf, Marshall A. Isman, Ian Schechter
  • Patent number: 11720583
    Abstract: In a first aspect, a method includes, at a node of a Hadoop cluster, the node storing a first portion of data in HDFS data storage, executing a first instance of a data processing engine capable of receiving data from a data source external to the Hadoop cluster, receiving a computer-executable program by the data processing engine, executing at least part of the program by the first instance of the data processing engine, receiving, by the data processing engine, a second portion of data from the external data source, storing the second portion of data other than in HDFS storage, and performing, by the data processing engine, a data processing operation identified by the program using at least the first portion of data and the second portion of data.
    Type: Grant
    Filed: August 1, 2022
    Date of Patent: August 8, 2023
    Assignee: Ab Initio Technology LLC
    Inventors: Ian Schechter, Tim Wakeling, Ann M. Wollrath
  • Patent number: 11704494
    Abstract: A data processing system for discovering a semantic meaning of a field included in one or more data sets is configured to identify a field included in one or more data sets, with the field having an identifier. For that field, the system profiles data values of the field to generate a data profile, accesses a plurality of label proposal tests, and generates a set of label proposals by applying the plurality of label proposal tests to the data profile. The system determines a similarity among the label proposals and selects a classification. The system identifies one of the label proposals as identifying the semantic meaning. The system stores the identifier of the field with the identified one of the label proposals that identifies the semantic meaning.
    Type: Grant
    Filed: February 19, 2020
    Date of Patent: July 18, 2023
    Assignee: Ab Initio Technology LLC
    Inventors: Christopher Thurston Butler, Timothy Spencer Bush
  • Patent number: 11669343
    Abstract: A method is described for processing keyed data items that are each associated with a value of a key, the keyed data items being from a plurality of distinct data streams, the processing including collecting the keyed data items, determining, based on contents of at least one of the keyed data items, satisfaction of one or more specified conditions for execution of one or more actions and causing execution of at least one of the one or more actions responsive to the determining.
    Type: Grant
    Filed: September 17, 2021
    Date of Patent: June 6, 2023
    Assignee: Ab Initio Technology LLC
    Inventors: Oded Ravid, Trevor Murphy
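As a rough sketch of the idea in this abstract, keyed items from several streams can be collected per key and checked against a condition that triggers an action. Everything named here (`process`, `condition`, `action`, the event fields) is a hypothetical illustration, and firing at most once per key is an assumption the abstract does not state.

```python
from collections import defaultdict


def process(streams, condition, action):
    """Collect keyed items from all streams; run action once per key when condition holds."""
    collected = defaultdict(list)  # key value -> items seen so far for that key
    done = set()                   # keys whose action has already run (assumption)
    results = []
    for stream in streams:
        for key, item in stream:
            collected[key].append(item)
            if key not in done and condition(collected[key]):
                done.add(key)
                results.append(action(key, collected[key]))
    return results


# Two hypothetical streams of (key, item) pairs.
logins = [("a", {"event": "login"}), ("b", {"event": "login"})]
purchases = [("a", {"event": "purchase"})]


def saw_login_and_purchase(items):
    # Condition based on the contents of the collected items.
    return {"login", "purchase"} <= {item["event"] for item in items}


fired = process(
    [logins, purchases],
    saw_login_and_purchase,
    lambda key, items: f"notify {key}",
)
print(fired)
```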
  • Publication number: 20230093911
    Abstract: Techniques for determining processing layouts to nodes of a dataflow graph.
    Type: Application
    Filed: September 30, 2022
    Publication date: March 30, 2023
    Applicant: Ab Initio Technology LLC
    Inventor: Garth Allen Dickie
  • Patent number: 11615093
    Abstract: A method for clustering data elements stored in a data storage system includes reading data elements from the data storage system. Clusters of data elements are formed with each data element being a member of at least one cluster. At least one data element is associated with two or more clusters. Membership of the data element belonging to respective ones of the two or more clusters is represented by a measure of ambiguity. Information is stored in the data storage system to represent the formed clusters.
    Type: Grant
    Filed: February 16, 2017
    Date of Patent: March 28, 2023
    Assignee: Ab Initio Technology LLC
    Inventor: Arlen Anderson
  • Patent number: 11599337
    Abstract: A method for configuring a first computer executable program includes, through a user interface, receiving information indicative of a source of data and a data target; and receiving a characterization of a process, including a type of the process and values for characteristics associated with the process. The method includes, based on the received information, automatically assigning values to respective parameters of the first computer executable program to cause the first computer executable program to, when executed, receive data from the source of data and output data to the data target. The method includes automatically configuring the first computer executable program to reference a second computer executable program, including identifying the second computer executable program based on the type of the process; and assigning values to respective parameters of the second computer executable program based on the values for the respective characteristics.
    Type: Grant
    Filed: October 25, 2021
    Date of Patent: March 7, 2023
    Assignee: Ab Initio Technology LLC
    Inventors: Richard A. Epstein, Mike Palmer
  • Patent number: 11599509
    Abstract: An approach to parallel access of data from a distributed filesystem provides parallel access to one or more named units (e.g., files) in the filesystem by creating multiple parallel data streams such that all the data of the desired units is partitioned over the multiple streams. In some examples, the multiple streams form multiple inputs to a parallel implementation of a computation system, such as a graph-based computation system, dataflow-based system, and/or a (e.g., relational) database system.
    Type: Grant
    Filed: August 31, 2020
    Date of Patent: March 7, 2023
    Assignee: Ab Initio Technology LLC
    Inventors: Ann M. Johnson, Bryan Phil Douros, Marshall Alan Isman, Timothy Wakeling
  • Patent number: 11593380
    Abstract: Techniques for generating a dataflow graph include generating a first dataflow graph with a plurality of first nodes representing first computer operations in processing data, with at least one of the first computer operations being a declarative operation that specifies one or more characteristics of one or more results of processing of data, and transforming the first dataflow graph into a second dataflow graph for processing data in accordance with the first computer operations, the second dataflow graph including a plurality of second nodes representing second computer operations, with at least one of the second nodes representing one or more imperative operations that implement the logic specified by the declarative operation, where the one or more imperative operations are unrepresented by the first nodes in the first dataflow graph.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: February 28, 2023
    Assignee: Ab Initio Technology LLC
    Inventors: Ian Schechter, Garth Dickie
  • Patent number: 11593369
    Abstract: One method includes receiving a database query, receiving information about a database table in data storage populated with data elements, producing a structural representation of the database table that includes a formatted data organization reflective of the database table and is absent the data elements of the database table, and providing the structural representation and the database query to a plan generator capable of producing a query plan representing operations for executing the database query on the database table. Another method includes receiving a query plan from a plan generator, the plan representing operations for executing a database query on a database table, and producing a dataflow graph from the query plan, wherein the dataflow graph includes at least one node that represents at least one operation represented by the query plan, and includes at least one link that represents at least one dataflow associated with the query plan.
    Type: Grant
    Filed: April 25, 2017
    Date of Patent: February 28, 2023
    Assignee: Ab Initio Technology LLC
    Inventors: Ian Schechter, Glenn John Allin, J. Skeffington Wholey
  • Patent number: 11561993
    Abstract: A data processing system for producing a subset of data from a plurality of data sources, including: memory storing a plurality of data sources to be represented in an editor interface; a data structure modification module that selects a plurality of data sources to be represented in an editor interface and generates a subset of data included in the plurality of data sources; memory that stores the selected data structures included in the subset, with at least one of the stored data structures including the one or more modified attributes of the one or more respective fields; a rendering module that displays, in the editor interface, representations of the stored data structures; and a segmentation module that segments a plurality of received data records.
    Type: Grant
    Filed: October 18, 2018
    Date of Patent: January 24, 2023
    Assignee: Ab Initio Technology LLC
    Inventors: Trevor Murphy, Oded Ravid
  • Patent number: 11531775
    Abstract: A method includes automatically determining a component of a security label for each first record in a first table of a database having multiple tables, including: identifying a second record related to the first record according to a foreign key relationship; identifying a component of the security label for the second record; and assigning a value for the component of the security label for the first record based on the identified component of the security label for the second record. The method includes storing the determined security label in the record.
    Type: Grant
    Filed: November 5, 2015
    Date of Patent: December 20, 2022
    Assignee: Ab Initio Technology LLC
    Inventor: Christopher J. Winters
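The foreign-key step in this abstract can be sketched as follows. This is a minimal illustration, assuming that a record's label component is simply copied from the related record; the abstract only says the value is "based on" it. The function name, the `id` and `security_label` fields, and the sample tables are all hypothetical.

```python
def propagate_label(child_rows, parent_rows, fk, component):
    """Set one security-label component on each child row from its related parent row."""
    parents = {row["id"]: row for row in parent_rows}  # index parent rows by primary key
    for row in child_rows:
        parent = parents.get(row[fk])                  # follow the foreign key relationship
        if parent is not None:
            # Assumption: the child inherits the parent's value unchanged,
            # and the label is stored in the record itself.
            label = row.setdefault("security_label", {})
            label[component] = parent["security_label"][component]
    return child_rows


# Hypothetical tables: each order references a customer through customer_id.
customers = [
    {"id": 1, "security_label": {"region": "EU"}},
    {"id": 2, "security_label": {"region": "US"}},
]
orders = [
    {"id": 10, "customer_id": 1},
    {"id": 11, "customer_id": 2},
    {"id": 12, "customer_id": 99},  # no matching customer, so no label is assigned
]

labelled = propagate_label(orders, customers, "customer_id", "region")
print(labelled)
```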
  • Publication number: 20220374413
    Abstract: A data processing system configured to perform: obtaining a first data lineage representing relationships among physical data elements, the first data lineage being generated at least in part by performing at least one of: (a) analyzing source code of at least one computer program configured to access the physical data elements; and (b) analyzing information obtained during runtime of the at least one computer program; obtaining, based on user input, a second data lineage representing relationships among business data elements; obtaining an association between at least some of the physical data elements of the first data lineage and at least some of the business data elements of the second data lineage; and generating, based on the association between the physical data elements and the business data elements, an indication of agreement or discrepancy between the first data lineage and the second data lineage.
    Type: Application
    Filed: January 14, 2022
    Publication date: November 24, 2022
    Applicant: Ab Initio Technology LLC
    Inventors: Joel Gould, Dusan Radivojevic
  • Patent number: 11487732
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for database key identification. One of the methods includes receiving an identification of a first field in a first data set, the first data set including records. The method includes identifying a set of values, the set including, for each record, a value associated with the field. The method includes generating a filter mask based on the set of values, where application of the filter mask is capable of determining that a given value is not in the set of values. The method includes receiving a second data set including a second field, the second data set including records. The method includes determining a count of a number of records in the second data set having a value associated with the second field that passes the filter mask. The method also includes storing the count in a profile.
    Type: Grant
    Filed: January 16, 2014
    Date of Patent: November 1, 2022
    Assignee: Ab Initio Technology LLC
    Inventor: Timothy Spencer Bush
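A filter mask that can prove a value is not in a set behaves like a Bloom filter, so the counting step can be sketched with one. This is an assumption made for illustration: the abstract does not name a Bloom filter, and `FilterMask`, its sizes, and the sample data are hypothetical.

```python
import hashlib


class FilterMask:
    """Bloom-filter-style mask: a miss proves absence, a hit only suggests presence."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # the mask, stored as one arbitrary-precision integer

    def _positions(self, value):
        # Derive num_hashes bit positions from salted SHA-256 digests of the value.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


# First data set: values of the candidate key field, one per record.
customer_ids = [1, 2, 3]
mask = FilterMask()
for value in customer_ids:
    mask.add(value)

# Second data set: count the records whose field value passes the mask.
order_customer_ids = [1, 1, 2, 9, 3]
count = sum(1 for value in order_customer_ids if mask.might_contain(value))
profile = {"orders.customer_id passes customers.id mask": count}
print(profile)
```

Values that really are in the first set always pass, so the count is an upper bound on the number of matching records; a high count relative to the size of the second data set hints at a possible key relationship.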
  • Patent number: 11487529
    Abstract: A computer-implemented method for integrating client portals of underlying data processing applications through a shared log record, including: storing one or more log records that are each shared by the process management application and the version control application; receiving instructions through a user interface that integrates, through the shared one or more log records, the process management client portal with the version control client portal; in response to the receiving of the instructions, executing the received instructions, the executing of the received instructions including: selecting, by the version control application, a particular version of the rule from the multiple versions of the rule stored in the system storage; and transitioning, by the process management application, the particular version of the rule from the first state of the plurality of states to the second, different state of the plurality of states.
    Type: Grant
    Filed: June 25, 2020
    Date of Patent: November 1, 2022
    Assignee: Ab Initio Technology LLC
    Inventors: Scott Studer, Joel Gould, Amit Weisman
  • Patent number: 11487534
    Abstract: A method for analyzing a computer program ecosystem includes performing a static analysis, including identifying static dependencies among elements of the ecosystem based on values of parameters in one or more parameter sets associated with the ecosystem, the elements of the ecosystem including the computer programs of the ecosystem and data resources associated with the computer programs. The method includes performing a runtime analysis, including identifying elements of the ecosystem that were utilized during execution of the ecosystem to process data records. The method includes performing a schedule analysis, including identifying a computer program of the ecosystem that has a schedule dependency from another computer program of the ecosystem. The method includes identifying a subset of the elements of the ecosystem as an ecosystem unit based on the results of the static, runtime, and schedule analyses. The method includes migrating the ecosystem unit, testing the ecosystem unit, or both.
    Type: Grant
    Filed: May 3, 2021
    Date of Patent: November 1, 2022
    Assignee: Ab Initio Technology LLC
    Inventors: John Joyce, Marshall A. Isman, Sam Kendall
  • Patent number: 11475023
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for impact analysis. One of the methods includes receiving information about at least two logical datasets, the information identifying, for each logical dataset, a field in that logical dataset and format information about that field. The method includes receiving information about a transformation identifying a first logical dataset from which the transformation is to receive data and a second logical dataset to which the transformed data is provided. The method includes receiving one or more proposed changes to at least one of the fields. The method includes analyzing the proposed changes based on information about the transformation and information about the first logical dataset and the second logical dataset. The method includes calculating metrics of the proposed change based on the analysis. The method also includes storing information about the metrics.
    Type: Grant
    Filed: November 26, 2018
    Date of Patent: October 18, 2022
    Assignee: Ab Initio Technology LLC
    Inventors: Joel Gould, Scott Studer
  • Patent number: 11455229
    Abstract: A method for displaying differences between a first executable dataflow graph and a second executable dataflow graph includes comparing a specification of the first executable dataflow graph and a specification of the second executable dataflow graph, including at least one of identifying a particular node or link of the first dataflow graph that does not correspond to any node or link of the second dataflow graph; and identifying a first node or link of the first dataflow graph that corresponds to a second node or link of the second dataflow graph, and identifying a difference between the first node or link and the second node or link. The method includes formulating and displaying a graphical representation of at least some of the nodes or links of the first dataflow graph or the second dataflow graph, the graphical representation including a graphical indicator of at least one of the identified particular node or link or the identified difference between the first node or link and the second node or link.
    Type: Grant
    Filed: October 9, 2020
    Date of Patent: September 27, 2022
    Assignee: Ab Initio Technology LLC
    Inventors: Ilya Rozenberg, Adam Weiss
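The comparison step in this abstract can be sketched for nodes alone. The sketch assumes a graph specification is a mapping from node name to a dictionary of settings and that nodes correspond when their names match; both are hypothetical simplifications, as are the names `diff_graphs`, `graph_v1`, and `graph_v2`. Links and the graphical display are left out.

```python
def diff_graphs(first, second):
    """Compare two graph specifications given as {node_name: settings_dict}."""
    # Nodes of the first graph with no corresponding node in the second.
    only_in_first = sorted(set(first) - set(second))
    # Corresponding nodes whose settings differ: setting -> (first value, second value).
    changed = {}
    for name in sorted(set(first) & set(second)):
        keys = sorted(set(first[name]) | set(second[name]))
        delta = {
            key: (first[name].get(key), second[name].get(key))
            for key in keys
            if first[name].get(key) != second[name].get(key)
        }
        if delta:
            changed[name] = delta
    return only_in_first, changed


# Hypothetical specifications of two versions of a dataflow graph.
graph_v1 = {
    "read": {"source": "orders.dat"},
    "filter": {"expr": "amount > 0"},
    "dedup": {"key": "order_id"},
}
graph_v2 = {
    "read": {"source": "orders.dat"},
    "filter": {"expr": "amount > 100"},
}

removed, changed = diff_graphs(graph_v1, graph_v2)
print(removed, changed)
```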
  • Patent number: 11423083
    Abstract: A method performed by a computer system including: accessing a specification that specifies a plurality of modules to be implemented by the computer program for processing the one or more values of the one or more fields in the structured data item; transforming the specification into the computer program that implements the plurality of modules, wherein the transforming includes: for each of one or more first modules of the plurality of modules: identifying one or more second modules of the plurality of modules that each receive input that is at least partly based on an output of the first module; and formatting an output data format of the first module such that the first module outputs only one or more values of one or more fields of the structured data item.
    Type: Grant
    Filed: October 27, 2017
    Date of Patent: August 23, 2022
    Assignee: Ab Initio Technology LLC
    Inventors: Jonah Egenolf, Marshall A. Isman, Frederic Wild