Patents Assigned to Ab Initio Technology LLC
-
Patent number: 10817271
Abstract: A dependency analyzer for a data processing system comprising at least one computer hardware processor configured to generate dependency information among variables, which may appear in any of multiple programs written in different source languages. The data processing system may parse each program, regardless of the language in which the module was written. Parsed information about each program may be recorded in a first-type data structure and then may be converted to a format representing dependencies among variables. Dependency information for each of the plurality of programs may be expressed as a combination of language independent constructs, which may be processed together, to generate dependency information for the data processing system. The dependency information may be recorded in a dependency data structure and further used for operations, such as data quality checking and change control for the data processing program.
Type: Grant
Filed: July 15, 2019
Date of Patent: October 27, 2020
Assignee: Ab Initio Technology LLC
Inventors: Christophe Berg, David Clemens
-
Patent number: 10802945
Abstract: A method for displaying differences between a first executable dataflow graph and a second executable dataflow graph includes comparing a specification of the first executable dataflow graph and a specification of the second executable dataflow graph, including at least one of identifying a particular node or link of the first dataflow graph that does not correspond to any node or link of the second dataflow graph; and identifying a first node or link of the first dataflow graph that corresponds to a second node or link of the second dataflow graph, and identifying a difference between the first node or link and the second node or link. The method includes formulating and displaying a graphical representation of at least some of the nodes or links of the first dataflow graph or the second dataflow graph, the graphical representation including a graphical indicator of at least one of the identified particular node or link or the identified difference between the first node or link and the second node or link.
Type: Grant
Filed: May 5, 2017
Date of Patent: October 13, 2020
Assignee: Ab Initio Technology LLC
Inventors: Ilya Rozenberg, Adam Weiss
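The comparison step in this abstract can be pictured with a small sketch. It is only an illustration, not the patented method: it assumes each graph specification is a plain dictionary from node name to a dictionary of properties, and the function name `diff_graphs` and the sample graphs are hypothetical. Links and the graphical display are left out.

```python
def diff_graphs(first, second):
    """Compare two graph specifications (assumed: node name -> properties).

    Returns (names only in `first`, {name: (first props, second props)}
    for nodes present in both graphs whose properties differ).
    """
    # Nodes of the first graph with no counterpart in the second graph.
    only_in_first = sorted(name for name in first if name not in second)

    # Nodes that correspond by name but differ in their properties.
    changed = {
        name: (first[name], second[name])
        for name in first
        if name in second and first[name] != second[name]
    }
    return only_in_first, changed


# Hypothetical sample specifications.
graph_v1 = {
    "read": {"type": "input"},
    "filter": {"expr": "x > 0"},
    "dedup": {"key": "id"},
}
graph_v2 = {
    "read": {"type": "input"},
    "filter": {"expr": "x >= 0"},
}

removed, changed = diff_graphs(graph_v1, graph_v2)
```

A display layer could then mark the names in `removed` and `changed` with distinct indicators when drawing either graph.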
-
Patent number: 10782960
Abstract: A computer-implemented method for integrating client portals of underlying data processing applications through a shared log record, including: storing one or more log records that are each shared by the process management application and the version control application; receiving instructions through a user interface that integrates, through the shared one or more log records, the process management client portal with the version control client portal; in response to the receiving of the instructions, executing the received instructions, the executing of the received instructions including: selecting, by the version control application, a particular version of the rule from the multiple versions of the rule stored in the system storage; and transitioning, by the process management application, the particular version of the rule from the first state of the plurality of states to the second, different state of the plurality of states.
Type: Grant
Filed: March 9, 2018
Date of Patent: September 22, 2020
Assignee: Ab Initio Technology LLC
Inventors: Scott Studer, Joel Gould, Amit Weisman
-
Patent number: 10776325
Abstract: An approach to parallel access of data from a distributed filesystem provides parallel access to one or more named units (e.g., files) in the filesystem by creating multiple parallel data streams such that all the data of the desired units is partitioned over the multiple streams. In some examples, the multiple streams form multiple inputs to a parallel implementation of a computation system, such as a graph-based computation system, dataflow-based system, and/or a (e.g., relational) database system.
Type: Grant
Filed: November 26, 2013
Date of Patent: September 15, 2020
Assignee: Ab Initio Technology LLC
Inventors: Ann M. Wollrath, Bryan Phil Douros, Marshall Alan Isman, Timothy Wakeling
-
Patent number: 10769122
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for specifying logical rules, one of the methods includes defining a logical rule, the logical rule applying operations based on a term. The method includes defining a mapping between fields and terms, the mapping including a mapping between a field and the term. The method includes storing the logical rule in association with the term. The method also includes applying the logical rule to data identified by the first field where respective fields are assigned to respective terms.
Type: Grant
Filed: March 13, 2014
Date of Patent: September 8, 2020
Assignee: Ab Initio Technology LLC
Inventors: Joel Gould, Roy Procops
-
Publication number: 20200234242
Abstract: Techniques for using finite state machines (FSMs) to implement workflows in a data processing system comprising at least one data store storing data objects and a workflow management system (WMS). The WMS is configured to perform: determining a current value of an attribute of a first data object by accessing the current value in the at least one data store; identifying, using the current value and metadata specifying relationships among at least some of the data objects, an actor authorized to perform a workflow task for the first data object; generating a GUI through which the actor can provide the input that the workflow task is to be performed; and in response to receiving, from the actor and through the GUI, input specifying that the workflow task is to be performed: performing the workflow task; and updating the current workflow state of the first FSM to a second workflow state.
Type: Application
Filed: January 22, 2020
Publication date: July 23, 2020
Applicant: Ab Initio Technology LLC
Inventors: Robert Parks, Anthony Yeracaris, Dusan Radivojevic
-
Patent number: 10719511
Abstract: Profiling data includes accessing multiple collections of records to store quantitative information for each particular collection including, for at least one selected field of the records in the particular collection, a corresponding list of value count entries, each including a value appearing in the selected field and a count of the number of records in which the value appears. Processing the quantitative information of two or more collections includes: merging the value count entries of corresponding lists for at least one field from each of a first collection and a second collection to generate a combined list of value count entries, and aggregating value count entries of the combined list of value count entries to generate a list of distinct field value entries identifying a distinct value and including information quantifying a number of records in which the distinct value appears for each of the two or more collections.
Type: Grant
Filed: February 13, 2017
Date of Patent: July 21, 2020
Assignee: Ab Initio Technology LLC
Inventor: Arlen Anderson
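As a rough illustration of the merge-and-aggregate step this abstract describes (not the patented implementation), the sketch below assumes that each collection's profile for one field is a list of (value, count) pairs. The function name `merge_value_counts` and the sample data are hypothetical.

```python
from collections import defaultdict


def merge_value_counts(first, second):
    """Combine two lists of (value, count) entries for the same field.

    Returns a dict mapping each distinct value to a pair:
    (count in the first collection, count in the second collection).
    """
    # Merge: tag every entry with the index of its source collection,
    # producing one combined list of value count entries.
    combined = [(v, 0, c) for v, c in first] + [(v, 1, c) for v, c in second]

    # Aggregate: group the combined list by distinct value, keeping a
    # separate count per collection.
    distinct = defaultdict(lambda: [0, 0])
    for value, source, count in combined:
        distinct[value][source] += count
    return {value: tuple(counts) for value, counts in distinct.items()}


# Hypothetical value counts for a "state" field in two collections.
customers_2019 = [("MA", 120), ("NY", 80)]
customers_2020 = [("MA", 95), ("CA", 40)]

result = merge_value_counts(customers_2019, customers_2020)
```

A value that appears in only one collection gets a zero count for the other, which makes values that are missing from one side easy to spot.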
-
Patent number: 10705877
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for workload automation and job scheduling information. One of the methods includes obtaining job dependency information, the job dependency information specifying an order of execution of a plurality of jobs. The method also includes obtaining data lineage information that identifies dependency relationships between data stores and transformation, wherein at least one transformation accepts data from a first data store and produces data for a second data store. The method also includes creating links between the job dependency information and the data lineage information. The method also includes determining an impact of a change in a planned execution of an application of the plurality of applications based on the job dependency information, the created links, and the data lineage information.
Type: Grant
Filed: August 27, 2014
Date of Patent: July 7, 2020
Assignee: Ab Initio Technology LLC
Inventors: Harry Michael Wolfson, Joel Gould, Anthony Yeracaris, Tim Wakeling
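One way to picture the impact analysis in this abstract is a traversal that follows both the job ordering and the data lineage. The sketch below is an assumption-laden illustration, not the patented method: the three dictionaries, their contents, and the function name `impacted` are all hypothetical.

```python
from collections import deque

# Hypothetical job dependency information: job -> jobs that run after it.
job_deps = {"job_load": ["job_report"]}

# Hypothetical data lineage: transformation -> (input store, output store).
lineage = {
    "t_load": ("raw_orders", "clean_orders"),
    "t_report": ("clean_orders", "sales_report"),
    "t_audit": ("raw_orders", "audit_log"),
}

# Hypothetical links between the two: job -> transformation it runs.
job_links = {"job_load": "t_load", "job_report": "t_report", "job_audit": "t_audit"}


def impacted(job):
    """Return (downstream jobs, downstream data stores) affected by `job`."""
    seen_jobs, seen_stores = set(), set()
    queue = deque([job])
    while queue:
        current = queue.popleft()
        if current in seen_jobs:
            continue
        seen_jobs.add(current)
        # Follow the scheduler's explicit ordering.
        queue.extend(job_deps.get(current, []))
        # Follow the lineage: the store this job writes, and its readers.
        transformation = job_links.get(current)
        if transformation is None:
            continue
        output_store = lineage[transformation][1]
        seen_stores.add(output_store)
        for other_job, other_t in job_links.items():
            if lineage[other_t][0] == output_store:
                queue.append(other_job)
    seen_jobs.discard(job)  # report only downstream jobs
    return seen_jobs, seen_stores


jobs, stores = impacted("job_load")
```

Here a change to `job_load` reaches `job_report` through both paths, while `job_audit` is unaffected because it reads only the raw store.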
-
Patent number: 10705807
Abstract: A method includes analyzing, by a processor, a first version of a computer program. The analyzing includes identifying a first data processing element included in the first version of the computer program. The first data processing element references a first data source external to the first version of the computer program. The method includes generating a data source element that represents a second data source different from the first data source. The method includes generating a second version of the computer program. The second version of the computer program includes the generated data source element and a second data processing element that is based on the first data processing element. In the second version of the computer program, the second data processing element references the generated data source element.
Type: Grant
Filed: January 29, 2018
Date of Patent: July 7, 2020
Assignee: Ab Initio Technology LLC
Inventors: Marshall A. Isman, John Joyce
-
Patent number: 10685030
Abstract: Presenting a diagram indicating relationships among data items stored in a data management system includes: receiving a request that identifies a first data item stored in the data management system from a user interface; retrieving stored configuration information that includes a plurality of selection specifications for selecting data items in the data management system that are related to a given data item of a predetermined type, where each selection specification is associated with a different respective predetermined type; querying the data management system to identify a set of one or more data items according to a selection specification from the configuration information that is associated with a type of the first data item; for each of multiple returned data items in the identified set, querying the data management system to determine whether additional data items are identified according to a selection specification from the configuration information that is associated with a type of the returned d
Type: Grant
Filed: March 1, 2018
Date of Patent: June 16, 2020
Assignee: Ab Initio Technology LLC
Inventors: Jeffrey Brainerd, Alan Morse
-
Patent number: 10671576
Abstract: Managing database transactions in a distributed database system includes: maintaining, at a first node, a first plurality of records of transactions, each associated with a transaction and including a start time of the transaction and a start time of an oldest transaction that was active at the start time of the transaction; maintaining, at a second node, a second plurality of records of transactions, including records of completed transactions associated with the second node, each including a transaction start time and a transaction end time; receiving at the second node, a message from the first node including a start time of an oldest transaction that was active at the transaction start time of the oldest currently active transaction in the system; and removing, from the second plurality of records, any records of completed transactions with a transaction end time occurring before the start time of the oldest transaction.
Type: Grant
Filed: July 5, 2016
Date of Patent: June 2, 2020
Assignee: Ab Initio Technology LLC
Inventors: Bryan Phil Douros, Stephen A. Revilak
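The purge rule in this abstract can be sketched in a few lines. This is a single-process illustration under simplifying assumptions, not the patented system: times are plain integers, the message between nodes is modeled as a function return value, and the names `ActiveRecord`, `CompletedRecord`, `purge_threshold`, and `purge` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ActiveRecord:
    """Record kept at the first node for a transaction."""
    start: int            # start time of this transaction
    oldest_at_start: int  # start time of the oldest transaction active at `start`


@dataclass
class CompletedRecord:
    """Record kept at the second node for a completed transaction."""
    start: int
    end: int


def purge_threshold(active):
    """Value the first node would send: the `oldest_at_start` field of the
    oldest currently active transaction."""
    oldest_active = min(active, key=lambda record: record.start)
    return oldest_active.oldest_at_start


def purge(completed, threshold):
    """Drop completed records whose end time is before the threshold."""
    return [record for record in completed if record.end >= threshold]


# Hypothetical state at the two nodes.
active = [ActiveRecord(start=50, oldest_at_start=40),
          ActiveRecord(start=70, oldest_at_start=50)]
completed = [CompletedRecord(10, 30),
             CompletedRecord(35, 45),
             CompletedRecord(20, 39)]

threshold = purge_threshold(active)
kept = purge(completed, threshold)
```

Only the record that ended at or after the threshold survives, since some still-active transaction may have overlapped it.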
-
Patent number: 10671669
Abstract: A specification including a description of a first directed graph including a first plurality of components interconnected by a first set of one or more directed links is received. A graph interface is formed for the first plurality of components including: forming a first interface element of the graph interface, the first interface element being associated with a first port of a first component of the first number of components, and configuring one or more properties of the first interface element such that the first port of the first component is consistent with the one or more properties of the first interface element. A first implementation of the graph interface is formed including the first number of components, the forming including forming a first correspondence between the first interface element and the first port of the first component of the first number of components.
Type: Grant
Filed: December 20, 2016
Date of Patent: June 2, 2020
Assignee: Ab Initio Technology LLC
Inventors: Victor T. Abaya, Russell L. Bryan, Brond Larson, Carl Offner, Daniel J. Teven
-
Patent number: 10657134
Abstract: A computer-implemented method for executing a query on data items located at different places in a stream of near real-time data to provide near-real time intermediate results for the query, as the query is being executed, the method including: from time to time, executing, by one or more computer systems, the query on two or more of the data items located at different places in the stream, with the two or more data items being accessed in near real-time with respect to each of the two or more data items; generating information indicative of results of executing the query; and as the query continues being executed, generating intermediate results of query execution by aggregating the results with prior results of executing the query on data items that previously appeared in the stream of near real-time data; and transmitting to one or more client devices the intermediate results of query execution, prior to completion of execution of the query.
Type: Grant
Filed: August 5, 2015
Date of Patent: May 19, 2020
Assignee: Ab Initio Technology LLC
Inventors: Rajesh Gadodia, Joseph Skeffington Wholey, III
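The idea of aggregating each new result with prior results can be illustrated with a generator that emits a running total. This is a minimal sketch, not the patented method: it assumes the query is a simple count of matching items, and the function name `run_streaming_count`, the batch size, and the sample stream are hypothetical.

```python
def run_streaming_count(stream, predicate, batch_size):
    """Yield the running count of matching items after each batch.

    Each yielded value aggregates the current batch's result with the
    results from all earlier batches, so a client sees intermediate
    results before the stream is exhausted.
    """
    total = 0
    batch = []
    for item in stream:
        batch.append(item)
        if len(batch) == batch_size:
            total += sum(1 for x in batch if predicate(x))
            yield total
            batch = []
    if batch:  # final, possibly partial batch
        total += sum(1 for x in batch if predicate(x))
        yield total


# Hypothetical stream of order amounts; the query counts amounts over 10.
intermediate = list(run_streaming_count([5, 12, 7, 30, 18], lambda x: x > 10, 2))
```

In a real deployment each yielded value would be transmitted to client devices as it is produced rather than collected into a list.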
-
Patent number: 10642850
Abstract: In a first aspect, a method includes, at a node of a Hadoop cluster, the node storing a first portion of data in HDFS data storage, executing a first instance of a data processing engine capable of receiving data from a data source external to the Hadoop cluster, receiving a computer-executable program by the data processing engine, executing at least part of the program by the first instance of the data processing engine, receiving, by the data processing engine, a second portion of data from the external data source, storing the second portion of data other than in HDFS storage, and performing, by the data processing engine, a data processing operation identified by the program using at least the first portion of data and the second portion of data.
Type: Grant
Filed: February 14, 2017
Date of Patent: May 5, 2020
Assignee: Ab Initio Technology LLC
Inventors: Ian Schechter, Tim Wakeling, Ann M. Wollrath
-
Patent number: 10628240
Abstract: Processing multiple kinds of event messages in a computing system includes storing the event messages as records associated with event messages. Each event message includes a timestamp and the records include a field indicating a target delivery time for an event result for the event message, the target delivery time being determined according to a kind of the event message. The event messages are processed to deliver event results based on information in the event messages and the target delivery times. Event messages are prioritized to deliver event results according to information indicating priority. A target delivery time is computed for event messages having a same priority based on fixed delays relative to their timestamps. Event results are delivered based on a comparison of their target delivery times to a clock time.
Type: Grant
Filed: December 15, 2017
Date of Patent: April 21, 2020
Assignee: Ab Initio Technology LLC
Inventor: Craig W. Stanfill
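A small sketch can make the target-delivery-time idea concrete. It is one possible reading of the abstract, not the patented design: the per-kind delays and priorities, the class name `EventStore`, and the choice to order due events by priority first are all assumptions made for illustration.

```python
# Hypothetical fixed delay (seconds) and priority per kind of event message.
DELAY_BY_KIND = {"alarm": 0.0, "status": 5.0, "report": 60.0}
PRIORITY_BY_KIND = {"alarm": 0, "status": 1, "report": 1}  # lower = more urgent


class EventStore:
    def __init__(self):
        self._records = []
        self._seq = 0  # unique tie-breaker so payloads are never compared

    def store(self, kind, timestamp, payload):
        """Store an event as a record carrying its target delivery time."""
        target = timestamp + DELAY_BY_KIND[kind]
        self._records.append((PRIORITY_BY_KIND[kind], target, self._seq, payload))
        self._seq += 1

    def deliver_due(self, clock_time):
        """Return payloads whose target time has been reached, ordered by
        priority and then by target delivery time."""
        due = [r for r in self._records if r[1] <= clock_time]
        self._records = [r for r in self._records if r[1] > clock_time]
        return [payload for _, _, _, payload in sorted(due)]


store = EventStore()
store.store("report", 0, "r1")
store.store("status", 0, "s1")
store.store("alarm", 3, "a1")

first_batch = store.deliver_due(5)
second_batch = store.deliver_due(60)
```

At clock time 5 the alarm and the status event are due and the alarm is delivered first; the report is held until its target time of 60.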
-
Patent number: 10606827
Abstract: Distributed processing of a data collection includes receiving information for configuring a distributed processing system. A first configuration of components is formed including sources of data elements and workers configured to process data elements, distributed among computing resources. Each data element includes a partition value that identifies a subset of the workers according to a partition rule. Data elements are accepted from the sources for a first part of the data collection in a first processing epoch and the data elements are routed through the first configuration. After accepting a first part of the data collection, change of configuration is initiated to a second configuration. A succession of two or more transitions between configurations of components is performed to a succession of modified configurations, a last of which corresponds to the second configuration. Further data elements are accepted from sources of the second configuration in a second processing epoch.
Type: Grant
Filed: May 17, 2017
Date of Patent: March 31, 2020
Assignee: Ab Initio Technology LLC
Inventors: Jeffrey Newbern, Craig W. Stanfill
-
Patent number: 10601890
Abstract: A computing system includes nodes executing data processing programs that each process at least one stream of data units. A data storage system stores shared data accessible by at least two of the programs. Processing at least one stream using a first data processing program includes: processing a first stream of data units that includes multiple subsets of contiguous data units; initiating termination of processing within the first data processing program, between processing a first subset of contiguous data units and processing a second subset of contiguous data units adjacent to the first subset of contiguous data units within the first stream of data units; durably storing at least some changes to the shared data caused by processing the first subset of contiguous data units after determining that the termination of processing within the first data processing program has completed; and resuming processing within the first data processing program.
Type: Grant
Filed: January 13, 2017
Date of Patent: March 24, 2020
Assignee: Ab Initio Technology LLC
Inventors: Bryan Phil Douros, Craig W. Stanfill, Joseph Skeffington Wholey, III
-
Patent number: 10599475
Abstract: Information representative of a graph-based program specification has a plurality of components, each of which corresponds to a task, and directed links between ports of said components. A program corresponding to said graph-based program specification is executed. A first component includes a first data port, a first control port, and a second control port. Said first data port is configured to receive data to be processed by a first task corresponding to said first component, or configured to provide data that was processed by said first task corresponding to said first component. Executing a program corresponding to said graph-based program specification includes: receiving said first control information at said first control port, in response to receiving said first control information, determining whether or not to invoke said first task, and after receiving said first control information, providing said second control information from said second control port.
Type: Grant
Filed: August 30, 2018
Date of Patent: March 24, 2020
Assignee: Ab Initio Technology LLC
Inventors: Craig W. Stanfill, Richard Shapiro, Adam Weiss, Andrew F. Roberts, Joseph Skeffington Wholey, III, Joel Gould
-
Patent number: 10579753
Abstract: A method implemented by a data processing system for processing data items of a stream of data items, including: accessing a specification that represents the executable logic, wherein a state of the specification for a particular value of the key specifies one or more portions of the executable logic that are executable in that state; receiving, over an input device or port, data items of a stream of data; for a first one of the data items of the stream, identifying a first state of the specification for a value of the key associated with that first one of the data items; processing, by the data processing system, the first one of the data items according to one or more portions of executable logic that are represented in the specification as being associated with the first state.
Type: Grant
Filed: December 12, 2016
Date of Patent: March 3, 2020
Assignee: Ab Initio Technology LLC
Inventors: Joel Gould, Scott Studer, Craig W. Stanfill
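Keyed, state-dependent processing of this kind can be sketched as a lookup of the current state per key followed by a dispatch to the logic for that state. The example is illustrative only, not the patented method: the states, the handlers, the `SPEC` table, and the function name `process` are hypothetical.

```python
def on_new(item):
    return f"open order {item['key']}"


def on_open(item):
    return f"ship order {item['key']}"


# Hypothetical specification: state -> (logic executable in that state, next state).
SPEC = {
    "new": (on_new, "open"),
    "open": (on_open, "shipped"),
}


def process(stream):
    """Process each data item according to the state held for its key."""
    state_by_key = {}
    for item in stream:
        key = item["key"]
        state = state_by_key.get(key, "new")  # unseen keys start in "new"
        if state not in SPEC:
            continue  # no executable logic is associated with this state
        handler, next_state = SPEC[state]
        yield handler(item)
        state_by_key[key] = next_state


# Hypothetical stream: key "A" appears three times, key "B" once.
events = [{"key": "A"}, {"key": "B"}, {"key": "A"}, {"key": "A"}]
outputs = list(process(events))
```

The third item for key "A" produces no output because that key has already reached a state with no associated logic.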
-
Patent number: 10572511
Abstract: Received data records, each including one or more values in one or more fields, are processed to identify a matched data cluster. The processing includes: for selected data records, generating a query from one or more values; identifying one or more candidate data records from the received data records using the query; determining whether or not the selected data record satisfies a cluster membership criterion for at least one candidate data cluster of one or more existing data clusters containing the candidate records; and selecting the matched data cluster from among one or more candidate data clusters based at least in part on a growth criterion for the candidate data clusters, or initializing the matched data cluster with the selected data record if the selected data record does not satisfy a cluster membership criterion for any of the existing data clusters or based on a result of the growth criterion.
Type: Grant
Filed: June 2, 2016
Date of Patent: February 25, 2020
Assignee: Ab Initio Technology LLC
Inventors: Arlen Anderson, Kamil Trojan