Patents Assigned to Ab Initio Technology LLC
-
Patent number: 11741091
Abstract: Among other things, we describe a method of receiving a portion of metadata from a data source, the portion of metadata describing nodes and edges; generating instances of a data structure representing the portion of metadata, at least one instance of the data structure including an identification value that identifies a corresponding node, one or more property values representing respective properties of the corresponding node, and one or more pointers to respective identification values, each pointer representing an edge associated with a node identified by the corresponding respective identification value; storing the instances of the data structure in random access memory; receiving a query that includes an identification of at least one particular element of data; and using at least one instance of the data structure to cause a display of a computer system to display a representation of lineage of the particular element of data.
Type: Grant
Filed: December 1, 2017
Date of Patent: August 29, 2023
Assignee: Ab Initio Technology LLC
Inventors: David Clemens, Dusan Radivojevic, Neil Galarneau
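The data structure in this abstract (an identification value, property values, and pointers to other identification values) can be illustrated with a minimal sketch. The names NodeRecord, build_index, and upstream_lineage and the sample metadata are hypothetical, chosen only for illustration; the patent does not specify this layout, and a breadth-first walk is just one plausible way to answer a lineage query.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class NodeRecord:
    """One in-memory instance: an ID, property values, and pointers to other IDs."""
    node_id: str
    properties: dict = field(default_factory=dict)
    upstream_ids: list = field(default_factory=list)  # each pointer stands for one edge


def build_index(nodes, edges):
    """Turn node descriptions and (source, target) edges into records keyed by ID."""
    index = {node_id: NodeRecord(node_id, dict(props)) for node_id, props in nodes.items()}
    for source_id, target_id in edges:
        index[target_id].upstream_ids.append(source_id)
    return index


def upstream_lineage(index, start_id):
    """Breadth-first walk over the pointers; returns upstream IDs in visit order."""
    seen, order, queue = {start_id}, [], deque([start_id])
    while queue:
        for parent_id in index[queue.popleft()].upstream_ids:
            if parent_id not in seen:
                seen.add(parent_id)
                order.append(parent_id)
                queue.append(parent_id)
    return order


# Hypothetical metadata: two source tables feed a joined dataset, which feeds a report field.
nodes = {
    "raw.orders": {"kind": "table"},
    "raw.customers": {"kind": "table"},
    "stage.joined": {"kind": "dataset"},
    "report.revenue": {"kind": "field"},
}
edges = [
    ("raw.orders", "stage.joined"),
    ("raw.customers", "stage.joined"),
    ("stage.joined", "report.revenue"),
]
index = build_index(nodes, edges)
print(upstream_lineage(index, "report.revenue"))
```

Keeping the records in a dictionary keyed by identification value means each pointer is followed with a single in-memory lookup, which is the property the abstract's "random access memory" storage suggests.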
-
Patent number: 11734264
Abstract: A method includes accessing a schema that specifies relationships among datasets, computations on the datasets, or transformations of the datasets, selecting a dataset from among the datasets, and identifying, from the schema, other datasets that are related to the selected dataset. Attributes of the datasets are identified, and logical data representing the identified attributes and relationships among the attributes is generated. The logical data is provided to a development environment, which provides access to portions of the logical data representing the identified attributes. A specification that specifies at least one of the identified attributes in performing an operation is received from the development environment.
Type: Grant
Filed: December 21, 2021
Date of Patent: August 22, 2023
Assignee: Ab Initio Technology LLC
Inventors: Jonah Egenolf, Marshall A. Isman, Ian Schechter
-
Patent number: 11720583
Abstract: In a first aspect, a method includes, at a node of a Hadoop cluster, the node storing a first portion of data in HDFS data storage, executing a first instance of a data processing engine capable of receiving data from a data source external to the Hadoop cluster, receiving a computer-executable program by the data processing engine, executing at least part of the program by the first instance of the data processing engine, receiving, by the data processing engine, a second portion of data from the external data source, storing the second portion of data other than in HDFS storage, and performing, by the data processing engine, a data processing operation identified by the program using at least the first portion of data and the second portion of data.
Type: Grant
Filed: August 1, 2022
Date of Patent: August 8, 2023
Assignee: Ab Initio Technology LLC
Inventors: Ian Schechter, Tim Wakeling, Ann M. Wollrath
-
Patent number: 11704494
Abstract: A data processing system for discovering a semantic meaning of a field included in one or more data sets is configured to identify a field included in one or more data sets, with the field having an identifier. For that field, the system profiles data values of the field to generate a data profile, accesses a plurality of label proposal tests, and generates a set of label proposals by applying the plurality of label proposal tests to the data profile. The system determines a similarity among the label proposals and selects a classification. The system identifies one of the label proposals as identifying the semantic meaning. The system stores the identifier of the field with the identified one of the label proposals that identifies the semantic meaning.
Type: Grant
Filed: February 19, 2020
Date of Patent: July 18, 2023
Assignee: Ab Initio Technology LLC
Inventors: Christopher Thurston Butler, Timothy Spencer Bush
-
Patent number: 11669343
Abstract: A method is described for processing keyed data items that are each associated with a value of a key, the keyed data items being from a plurality of distinct data streams, the processing including collecting the keyed data items, determining, based on contents of at least one of the keyed data items, satisfaction of one or more specified conditions for execution of one or more actions and causing execution of at least one of the one or more actions responsive to the determining.
Type: Grant
Filed: September 17, 2021
Date of Patent: June 6, 2023
Assignee: Ab Initio Technology LLC
Inventors: Oded Ravid, Trevor Murphy
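As a rough illustration of collecting keyed items from several streams and triggering an action once a content-based condition holds, consider the sketch below. The function process, the condition is_settled, the action mark_settled, and the sample streams are all hypothetical; the abstract does not describe how conditions or actions are expressed, and the streams here are read one after another rather than concurrently.

```python
from collections import defaultdict


def process(streams, condition, action):
    """Collect items per key; run the action once per key when the condition first holds."""
    collected = defaultdict(list)
    already_fired = set()
    results = []
    for stream_name, items in streams.items():
        for key, payload in items:
            collected[key].append(payload)
            if key not in already_fired and condition(collected[key]):
                already_fired.add(key)
                results.append(action(key, collected[key]))
    return results


def is_settled(payloads):
    """Condition on item contents: payments cover the ordered amount."""
    ordered = sum(p["amount"] for p in payloads if p["type"] == "order")
    paid = sum(p["amount"] for p in payloads if p["type"] == "payment")
    return ordered > 0 and paid >= ordered


def mark_settled(key, payloads):
    return f"settled {key}"


# Two hypothetical streams keyed by customer ID.
streams = {
    "orders": [
        ("c1", {"type": "order", "amount": 10}),
        ("c2", {"type": "order", "amount": 5}),
    ],
    "payments": [
        ("c1", {"type": "payment", "amount": 4}),
        ("c1", {"type": "payment", "amount": 6}),
    ],
}
fired = process(streams, is_settled, mark_settled)
print(fired)
```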
-
Publication number: 20230093911
Abstract: Techniques for determining processing layouts to nodes of a dataflow graph.
Type: Application
Filed: September 30, 2022
Publication date: March 30, 2023
Applicant: Ab Initio Technology LLC
Inventor: Garth Allen Dickie
-
Patent number: 11615093
Abstract: A method for clustering data elements stored in a data storage system includes reading data elements from the data storage system. Clusters of data elements are formed with each data element being a member of at least one cluster. At least one data element is associated with two or more clusters. Membership of the data element belonging to respective ones of the two or more clusters is represented by a measure of ambiguity. Information is stored in the data storage system to represent the formed clusters.
Type: Grant
Filed: February 16, 2017
Date of Patent: March 28, 2023
Assignee: Ab Initio Technology LLC
Inventor: Arlen Anderson
-
Patent number: 11599337
Abstract: A method for configuring a first computer executable program includes, through a user interface, receiving information indicative of a source of data and a data target; and receiving a characterization of a process, including a type of the process and values for characteristics associated with the process. The method includes, based on the received information, automatically assigning values to respective parameters of the first computer executable program to cause the first computer executable program to, when executed, receive data from the source of data and output data to the data target. The method includes automatically configuring the first computer executable program to reference a second computer executable program, including identifying the second computer executable program based on the type of the process; and assigning values to respective parameters of the second computer executable program based on the values for the respective characteristics.
Type: Grant
Filed: October 25, 2021
Date of Patent: March 7, 2023
Assignee: Ab Initio Technology LLC
Inventors: Richard A. Epstein, Mike Palmer
-
Patent number: 11599509
Abstract: An approach to parallel access of data from a distributed filesystem provides parallel access to one or more named units (e.g., files) in the filesystem by creating multiple parallel data streams such that all the data of the desired units is partitioned over the multiple streams. In some examples, the multiple streams form multiple inputs to a parallel implementation of a computation system, such as a graph-based computation system, dataflow-based system, and/or a (e.g., relational) database system.
Type: Grant
Filed: August 31, 2020
Date of Patent: March 7, 2023
Assignee: Ab Initio Technology LLC
Inventors: Ann M. Johnson, Bryan Phil Douros, Marshall Alan Isman, Timothy Wakeling
-
Patent number: 11593380
Abstract: Techniques for generating a dataflow graph include generating a first dataflow graph with a plurality of first nodes representing first computer operations in processing data, with at least one of the first computer operations being a declarative operation that specifies one or more characteristics of one or more results of processing of data, and transforming the first dataflow graph into a second dataflow graph for processing data in accordance with the first computer operations, the second dataflow graph including a plurality of second nodes representing second computer operations, with at least one of the second nodes representing one or more imperative operations that implement the logic specified by the declarative operation, where the one or more imperative operations are unrepresented by the first nodes in the first dataflow graph.
Type: Grant
Filed: April 30, 2020
Date of Patent: February 28, 2023
Assignee: Ab Initio Technology LLC
Inventors: Ian Schechter, Garth Dickie
-
Patent number: 11593369
Abstract: One method includes receiving a database query, receiving information about a database table in data storage populated with data elements, producing a structural representation of the database table that includes a formatted data organization reflective of the database table and is absent the data elements of the database table, and providing the structural representation and the database query to a plan generator capable of producing a query plan representing operations for executing the database query on the database table. Another method includes receiving a query plan from a plan generator, the plan representing operations for executing a database query on a database table, and producing a dataflow graph from the query plan, wherein the dataflow graph includes at least one node that represents at least one operation represented by the query plan, and includes at least one link that represents at least one dataflow associated with the query plan.
Type: Grant
Filed: April 25, 2017
Date of Patent: February 28, 2023
Assignee: Ab Initio Technology LLC
Inventors: Ian Schechter, Glenn John Allin, J. Skeffington Wholey
-
Patent number: 11561993
Abstract: A data processing system for producing a subset of data from a plurality of data sources, including: memory storing a plurality of data sources to be represented in an editor interface; a data structure modification module that selects a plurality of data sources to be represented in an editor interface and generates a subset of data included in the plurality of data sources; memory that stores the selected data structures included in the subset, with at least one of the stored data structures including the one or more modified attributes of the one or more respective fields; a rendering module that displays, in the editor interface, representations of the stored data structures; and a segmentation module that segments a plurality of received data records.
Type: Grant
Filed: October 18, 2018
Date of Patent: January 24, 2023
Assignee: Ab Initio Technology LLC
Inventors: Trevor Murphy, Oded Ravid
-
Patent number: 11531775
Abstract: A method includes automatically determining a component of a security label for each first record in a first table of a database having multiple tables, including: identifying a second record related to the first record according to a foreign key relationship; identifying a component of the security label for the second record; and assigning a value for the component of the security label for the first record based on the identified component of the security label for the second record. The method includes storing the determined security label in the record.
Type: Grant
Filed: November 5, 2015
Date of Patent: December 20, 2022
Assignee: Ab Initio Technology LLC
Inventor: Christopher J. Winters
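The foreign-key step described here can be sketched in a few lines. The function propagate_label_component, the security_label dictionary layout, and the sample tables are hypothetical and only illustrate the idea; in particular, copying the parent's value unchanged is an assumption, since the abstract says only that the child's value is "based on" the parent's.

```python
def propagate_label_component(child_rows, parent_rows, foreign_key, component):
    """Give each child row the parent's value for one security-label component."""
    parents_by_id = {row["id"]: row for row in parent_rows}
    for row in child_rows:
        # Follow the foreign key to the related (second) record.
        parent = parents_by_id[row[foreign_key]]
        label = dict(row.get("security_label", {}))
        # Assumption: the child simply inherits the parent's component value.
        label[component] = parent["security_label"][component]
        # Store the determined label in the record itself.
        row["security_label"] = label
    return child_rows


# Hypothetical tables: orders reference customers through customer_id.
customers = [
    {"id": 1, "security_label": {"region": "EU", "sensitivity": "high"}},
    {"id": 2, "security_label": {"region": "US", "sensitivity": "low"}},
]
orders = [
    {"id": 100, "customer_id": 1},
    {"id": 101, "customer_id": 2},
    {"id": 102, "customer_id": 1},
]
propagate_label_component(orders, customers, "customer_id", "region")
print([order["security_label"] for order in orders])
```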
-
Publication number: 20220374413
Abstract: A data processing system configured to perform: obtaining a first data lineage representing relationships among physical data elements, the first data lineage being generated at least in part by performing at least one of: (a) analyzing source code of at least one computer program configured to access the physical data elements; and (b) analyzing information obtained during runtime of the at least one computer program; obtaining, based on user input, a second data lineage representing relationships among business data elements; obtaining an association between at least some of the physical data elements of the first data lineage and at least some of the business data elements of the second data lineage; and generating, based on the association between the physical data elements and the business data elements, an indication of agreement or discrepancy between the first data lineage and the second data lineage.
Type: Application
Filed: January 14, 2022
Publication date: November 24, 2022
Applicant: Ab Initio Technology LLC
Inventors: Joel Gould, Dusan Radivojevic
-
Patent number: 11487732
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for database key identification. One of the methods includes receiving an identification of a first field in a first data set, the first data set including records. The method includes identifying a set of values, the set including, for each record, a value associated with the field. The method includes generating a filter mask based on the set of values, where application of the filter mask is capable of determining that a given value is not in the set of values. The method includes receiving a second data set including a second field, the second data set including records. The method includes determining a count of a number of records in the second data set having a value associated with the second field that passes the filter mask. The method also includes storing the count in a profile.
Type: Grant
Filed: January 16, 2014
Date of Patent: November 1, 2022
Assignee: Ab Initio Technology LLC
Inventor: Timothy Spencer Bush
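A filter mask that can rule a value out of a set, but not confirm it, behaves like a Bloom filter. The sketch below uses a Bloom filter as a stand-in; that choice, the class name FilterMask, the bit and hash counts, and the sample data sets are all assumptions, since the abstract does not say how the mask is built.

```python
import hashlib


class FilterMask:
    """Bloom-filter-style mask: a miss proves absence, a hit only suggests presence."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, value):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, value):
        for position in self._positions(value):
            self.bits |= 1 << position

    def passes(self, value):
        return all((self.bits >> position) & 1 for position in self._positions(value))


def count_passing(mask, records, field_name):
    """Count records in the second data set whose field value passes the mask."""
    return sum(1 for record in records if mask.passes(record[field_name]))


# Hypothetical data sets: does orders.customer_id look like a key into customers.id?
customers = [{"id": "c1"}, {"id": "c2"}, {"id": "c3"}]
orders = [{"customer_id": "c1"}, {"customer_id": "c3"}, {"customer_id": "zz"}]

mask = FilterMask()
for record in customers:
    mask.add(record["id"])

profile = {"orders.customer_id -> customers.id": count_passing(mask, orders, "customer_id")}
print(profile)
```

Because false positives are possible, the stored count is an upper bound on the true number of matching records; a high count relative to the record total hints at a candidate key relationship worth checking exactly.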
-
Patent number: 11487529
Abstract: A computer-implemented method for integrating client portals of underlying data processing applications through a shared log record, including: storing one or more log records that are each shared by the process management application and the version control application; receiving instructions through a user interface that integrates, through the shared one or more log records, the process management client portal with the version control client portal; in response to the receiving of the instructions, executing the received instructions, the executing of the received instructions including: selecting, by the version control application, a particular version of the rule from the multiple versions of the rule stored in the system storage; and transitioning, by the process management application, the particular version of the rule from the first state of the plurality of states to the second, different state of the plurality of states.
Type: Grant
Filed: June 25, 2020
Date of Patent: November 1, 2022
Assignee: Ab Initio Technology LLC
Inventors: Scott Studer, Joel Gould, Amit Weisman
-
Patent number: 11487534
Abstract: A method for analyzing a computer program ecosystem includes performing a static analysis, including identifying static dependencies among elements of the ecosystem based on values of parameters in one or more parameter sets associated with the ecosystem, the elements of the ecosystem including the computer programs of the ecosystem and data resources associated with the computer programs. The method includes performing a runtime analysis, including identifying elements of the ecosystem that were utilized during execution of the ecosystem to process data records. The method includes performing a schedule analysis, including identifying a computer program of the ecosystem that has a schedule dependency from another computer program of the ecosystem. The method includes identifying a subset of the elements of the ecosystem as an ecosystem unit based on the results of the static, runtime, and schedule analyses. The method includes migrating the ecosystem unit, testing the ecosystem unit, or both.
Type: Grant
Filed: May 3, 2021
Date of Patent: November 1, 2022
Assignee: Ab Initio Technology LLC
Inventors: John Joyce, Marshall A. Isman, Sam Kendall
-
Patent number: 11475023
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for impact analysis. One of the methods includes receiving information about at least two logical datasets, the information identifying, for each logical dataset, a field in that logical dataset and format information about that field. The method includes receiving information about a transformation identifying a first logical dataset from which the transformation is to receive data and a second logical dataset to which the transformed data is provided. The method includes receiving one or more proposed changes to at least one of the fields. The method includes analyzing the proposed changes based on information about the transformation and information about the first logical dataset and the second logical dataset. The method includes calculating metrics of the proposed change based on the analysis. The method also includes storing information about the metrics.
Type: Grant
Filed: November 26, 2018
Date of Patent: October 18, 2022
Assignee: Ab Initio Technology LLC
Inventors: Joel Gould, Scott Studer
-
Patent number: 11455229
Abstract: A method for displaying differences between a first executable dataflow graph and a second executable dataflow graph includes comparing a specification of the first executable dataflow graph and a specification of the second executable dataflow graph, including at least one of identifying a particular node or link of the first dataflow graph that does not correspond to any node or link of the second dataflow graph; and identifying a first node or link of the first dataflow graph that corresponds to a second node or link of the second dataflow graph, and identifying a difference between the first node or link and the second node or link. The method includes formulating and displaying a graphical representation of at least some of the nodes or links of the first dataflow graph or the second dataflow graph, the graphical representation including a graphical indicator of at least one of the identified particular node or link or the identified difference between the first node or link and the second node or link.
Type: Grant
Filed: October 9, 2020
Date of Patent: September 27, 2022
Assignee: Ab Initio Technology LLC
Inventors: Ilya Rozenberg, Adam Weiss
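The comparison step in this abstract can be sketched as a set difference over graph specifications. Everything below is hypothetical: the specification layout (a dictionary of node settings plus a list of links), the functions diff_graphs and render, and the assumption that nodes correspond when they share a name. A text marker per node stands in for the graphical indicator.

```python
def diff_graphs(first, second):
    """Compare two graph specifications of the form {'nodes': {...}, 'links': [...]}."""
    first_nodes, second_nodes = first["nodes"], second["nodes"]
    # Nodes of the first graph with no counterpart in the second.
    unmatched_nodes = sorted(set(first_nodes) - set(second_nodes))
    # Nodes present in both graphs whose settings differ.
    changed_nodes = {
        name: (first_nodes[name], second_nodes[name])
        for name in sorted(set(first_nodes) & set(second_nodes))
        if first_nodes[name] != second_nodes[name]
    }
    # Links of the first graph with no counterpart in the second.
    unmatched_links = sorted(set(first["links"]) - set(second["links"]))
    return unmatched_nodes, changed_nodes, unmatched_links


def render(first, diff):
    """Text stand-in for the display: '-' marks unmatched nodes, '~' marks changed ones."""
    unmatched_nodes, changed_nodes, _ = diff
    lines = []
    for name in sorted(first["nodes"]):
        if name in unmatched_nodes:
            marker = "-"
        elif name in changed_nodes:
            marker = "~"
        else:
            marker = " "
        lines.append(f"{marker} {name}")
    return lines


graph_a = {
    "nodes": {
        "read": {"path": "in.dat"},
        "filter": {"expr": "amount > 0"},
        "dedup": {"key": "id"},
        "write": {"path": "out.dat"},
    },
    "links": [("read", "filter"), ("filter", "dedup"), ("dedup", "write")],
}
graph_b = {
    "nodes": {
        "read": {"path": "in.dat"},
        "filter": {"expr": "amount > 10"},
        "write": {"path": "out.dat"},
    },
    "links": [("read", "filter"), ("filter", "write")],
}
diff = diff_graphs(graph_a, graph_b)
print("\n".join(render(graph_a, diff)))
```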
-
Patent number: 11423083
Abstract: A method performed by a computer system including: accessing a specification that specifies a plurality of modules to be implemented by the computer program for processing the one or more values of the one or more fields in the structured data item; transforming the specification into the computer program that implements the plurality of modules, wherein the transforming includes: for each of one or more first modules of the plurality of modules: identifying one or more second modules of the plurality of modules that each receive input that is at least partly based on an output of the first module; and formatting an output data format of the first module such that the first module outputs only one or more values of one or more fields of the structured data item.
Type: Grant
Filed: October 27, 2017
Date of Patent: August 23, 2022
Assignee: Ab Initio Technology LLC
Inventors: Jonah Egenolf, Marshall A. Isman, Frederic Wild