Patents Assigned to StreamSets, Inc.
-
Patent number: 11966316
Abstract: Systems and methods herein describe receiving identification from a data pipeline, accessing first data offset information for a first data origin and second data offset information for a second data origin, bisecting the first data origin using the first data offset information, processing the data pipeline with the bisected first data offset information and the second data offset information, receiving a notification indicating a data pipeline status, and causing presentation of the notification on a graphical user interface of a computing device.
Type: Grant
Filed: November 22, 2022
Date of Patent: April 23, 2024
Assignee: StreamSets, Inc.
Inventor: Hari Shreedharan
-
Patent number: 11947779
Abstract: Systems and methods herein describe accessing a data processing pipeline, causing presentation of the data processing pipeline on a graphical user interface of a computing device, receiving a selection of a first user interface element within the graphical user interface, generating a datafile representing the data processing pipeline, submitting the datafile and an application to a software framework using an application programming interface, receiving, from the application, the generated datasets, applying the data operations the data processing pipeline, collecting performance data metrics from the data processing pipeline, and dynamically updating the graphical user interface with the collected performance data metrics.
Type: Grant
Filed: April 24, 2023
Date of Patent: April 2, 2024
Assignee: StreamSets, Inc.
Inventors: Hari Shreedharan, Arvind Prabhakar
-
Patent number: 11775371
Abstract: Systems and methods are directed to remote validation and preview. An example system receives an indication of a portion of the data pipeline to be processed, generates a data pipeline configuration file describing operations in the portion of the data pipeline, causes a software framework to perform operations corresponding to the portion of the data pipeline, receives results of the operations corresponding to the portion of the data pipeline, and causes presentation of the results on a graphical user interface of a computing device.
Type: Grant
Filed: May 22, 2020
Date of Patent: October 3, 2023
Assignee: StreamSets, Inc.
Inventor: Madhukar Devaraju
-
Patent number: 11734235
Abstract: In various example embodiments, a system, computer readable medium and method for schema update engine dynamically updating a target data storage system. Incoming data records are received. A front-end schema of the incoming data records is identified. The front-end schema and the current target schema are compared. Based on identifying a difference between the front-end schema and the current target schema, the current target schema is updated in order to be identical to the front-end schema. The current target data file is closed and the incoming data records are stored in a new target data file according to the updated target schema.
Type: Grant
Filed: May 27, 2021
Date of Patent: August 22, 2023
Assignee: StreamSets, Inc.
Inventors: Arvind Prabhakar, Alejandro Abdelnur, Madhukar Devaraju
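Illustration (not part of the patent text): the compare, update, and roll-over sequence in this abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation. The class name TargetWriter, the in-memory stand-in for target data files, and the way a schema is inferred from Python value types are all hypothetical.

```python
class TargetWriter:
    """Hypothetical in-memory stand-in for a target data storage system."""

    def __init__(self):
        self.schema = None  # current target schema: field name -> type name
        self.files = []     # each "file": {"schema", "records", "closed"}

    def write(self, record):
        # Identify the front-end schema of the incoming record
        # (assumption: inferred from the Python type of each value).
        front_end = {name: type(value).__name__ for name, value in record.items()}

        # Compare it with the current target schema.
        if front_end != self.schema:
            # Close the current target file, if one is open.
            if self.files:
                self.files[-1]["closed"] = True
            # Update the target schema to be identical to the front-end schema
            # and start a new target file that follows it.
            self.schema = front_end
            self.files.append({"schema": front_end, "records": [], "closed": False})

        self.files[-1]["records"].append(record)


writer = TargetWriter()
for rec in [{"id": 1}, {"id": 2}, {"id": 3, "name": "x"}]:
    writer.write(rec)
```

In this run the third record introduces a new field, so the first file is closed with two records and a second file is opened under the widened schema.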
-
Patent number: 11662882
Abstract: Systems and methods herein describe accessing a data processing pipeline, causing presentation of the data processing pipeline on a graphical user interface of a computing device; receiving a selection of a first user interface element within the graphical user interface, generating a datafile representing the data processing pipeline, submitting the datafile and an application to a software framework using an application programming interface, receiving, from the application, the generated datasets, applying the data operations the data processing pipeline, collecting performance data metrics from the data processing pipeline, and dynamically updating the graphical user interface with the collected performance data metrics.
Type: Grant
Filed: April 22, 2020
Date of Patent: May 30, 2023
Assignee: StreamSets, Inc.
Inventors: Hari Shreedharan, Arvind Prabhakar
-
Patent number: 11630840
Abstract: Systems and methods herein describe embodiments for handling a data drift. An example system accesses the data pipeline, which is comprised of a plurality of stages. For each stage of the plurality of stages in the data pipeline, the system identifies stage schema fields for processing data in the data pipeline and generates a set of stage schema fields comprising the identified stage schema fields in the stage. In response to detecting an origin stage, the system generates a set of pipeline schema fields, whereby the set of pipeline schema fields comprise a union of the generated sets of stage schema fields. The set of pipeline schema fields are then stored.
Type: Grant
Filed: May 22, 2020
Date of Patent: April 18, 2023
Assignee: StreamSets, Inc.
Inventor: Hari Shreedharan
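Illustration (not part of the patent text): the per-stage field sets and their union could be computed along these lines. This is a sketch under assumptions. The stage dictionaries, the "kind" and "fields" keys, and the choice to walk from the last stage back toward the origin are hypothetical and are not taken from the patent.

```python
def pipeline_schema_fields(stages):
    """Return the union of every stage's schema fields once the origin is reached."""
    stage_field_sets = []
    # Assumption: stages are listed origin-first, so walking in reverse
    # visits the destination first and the origin last.
    for stage in reversed(stages):
        # One set of schema fields per stage.
        stage_field_sets.append(frozenset(stage["fields"]))
        if stage["kind"] == "origin":
            # Origin detected: the pipeline schema is the union of all stage sets.
            return frozenset().union(*stage_field_sets)
    raise ValueError("pipeline has no origin stage")


stages = [
    {"kind": "origin", "fields": ["id", "ts"]},
    {"kind": "processor", "fields": ["id", "amount"]},
    {"kind": "destination", "fields": ["amount", "currency"]},
]
fields = pipeline_schema_fields(stages)
```

Here the stored result would contain id, ts, amount, and currency, each once, regardless of how many stages mention a field.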
-
Patent number: 11526415
Abstract: Systems and methods herein describe receiving identification from a data pipeline, accessing first data offset information for a first data origin and second data offset information for a second data origin, bisecting the first data origin using the first data offset information, processing the data pipeline with the bisected first data offset information and the second data offset information, receiving a notification indicating a data pipeline status, and causing presentation of the notification on a graphical user interface of a computing device.
Type: Grant
Filed: April 22, 2020
Date of Patent: December 13, 2022
Assignee: StreamSets, Inc.
Inventor: Hari Shreedharan
-
Patent number: 11200208
Abstract: Systems and methods herein describe accessing an original change data capture (CDC) dataset comprising information describing changes to a source database, the original CDC dataset comprising a plurality of entries; identifying a first entry of the plurality of entries comprising a primary-key, a first operation and entry data; identifying a set of entries in the plurality of entries that includes the primary-key; comparing the first operation of the first entry with a second operation of a second entry in the set of entries; updating the first operation and the entry data based on the comparison; generating a new entry based on the updating of the first operation and the entry data; storing the new entry in a consolidated CDC dataset; and applying the consolidated CDC dataset to a target database.
Type: Grant
Filed: January 9, 2020
Date of Patent: December 14, 2021
Assignee: StreamSets, Inc.
Inventor: Alejandro Humberto Abdelnur
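Illustration (not part of the patent text): one way to picture the consolidation step is to fold an ordered CDC log into at most one entry per primary key. The sketch below is not the patented method. The merge rules in MERGE_RULES, the (key, operation, data) tuple layout, and the assumption that each entry carries a full row image are all hypothetical choices made for the example.

```python
# Hypothetical merge rules: (earlier op, later op) -> consolidated op.
# None means the two entries cancel out and nothing is emitted for the key.
MERGE_RULES = {
    ("INSERT", "UPDATE"): "INSERT",
    ("INSERT", "DELETE"): None,
    ("UPDATE", "UPDATE"): "UPDATE",
    ("UPDATE", "DELETE"): "DELETE",
    ("DELETE", "INSERT"): "UPDATE",
}


def consolidate(entries):
    """Collapse an ordered list of (key, op, data) into one entry per key."""
    merged = {}  # primary key -> (operation, data)
    for key, op, data in entries:
        if key not in merged:
            merged[key] = (op, data)
            continue
        # Compare the stored operation with the later one for the same key.
        prev_op, _ = merged[key]
        pair = (prev_op, op)
        if pair not in MERGE_RULES:
            raise ValueError(f"unexpected operation sequence {pair} for key {key!r}")
        new_op = MERGE_RULES[pair]
        if new_op is None:
            del merged[key]  # e.g. a row inserted and then deleted
        else:
            # Assumption: the later entry holds the full, current row image.
            merged[key] = (new_op, data)
    return [(key, op, data) for key, (op, data) in merged.items()]


log = [
    (1, "INSERT", {"name": "a"}),
    (2, "UPDATE", {"name": "b"}),
    (1, "UPDATE", {"name": "c"}),
    (3, "INSERT", {"name": "d"}),
    (3, "DELETE", None),
    (2, "DELETE", None),
]
consolidated = consolidate(log)
```

With these rules the six log entries reduce to two: an INSERT for key 1 carrying the latest data, and a DELETE for key 2. Key 3 disappears because its insert and delete cancel. Applying the shorter list to a target database would then need fewer writes than replaying the full log.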
-
Patent number: 11048673
Abstract: In various example embodiments, a system, computer readable medium and method for schema update engine dynamically updating a target data storage system. Incoming data records are received. A front-end schema of the incoming data records is identified. The front-end schema and the current target schema are compared. Based on identifying a difference between the front-end schema and the current target schema, the current target schema is updated in order to be identical to the front-end schema. The current target data file is closed and the incoming data records are stored in a new target data file according to the updated target schema.
Type: Grant
Filed: June 15, 2018
Date of Patent: June 29, 2021
Assignee: StreamSets, Inc.
Inventors: Arvind Prabhakar, Alejandro Abdelnur, Madhukar Devaraju
-
Patent number: 10678660
Abstract: In various example embodiments, a system, computer-readable medium and method to detect and dynamically correct a transformation drift in a data pipeline, the method comprising detecting a change in a transformation performed by an upstream subsystem of the data pipeline on a data field of an output dataset of the upstream subsystem; classifying the data field as an impacted data field; identifying, based on the topology information, a downstream subsystem of the data pipeline downstream of the upstream subsystem; identifying an input dataset of the downstream subsystem including the impacted data field; and performing a corrective transformation on the impacted data field of the input dataset of the downstream subsystem.
Type: Grant
Filed: June 26, 2018
Date of Patent: June 9, 2020
Assignee: StreamSets, Inc.
Inventor: Rupal Jatinkumar Shah
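Illustration (not part of the patent text): the detect, classify, locate, and correct steps might look roughly like this in miniature. Everything here is a hypothetical sketch. The function names, the topology dictionary, the probe-record comparison used to detect a change, and the cents-to-dollars scenario are assumptions chosen for the example, not details from the patent.

```python
def impacted_fields(before, after):
    """Fields whose value differs between two upstream outputs for the same input."""
    return {name for name in before.keys() & after.keys() if before[name] != after[name]}


def downstream_of(topology, upstream):
    """Subsystems that consume the upstream subsystem's output (assumed adjacency map)."""
    return topology.get(upstream, [])


def apply_correction(dataset, field, corrective):
    """Return a copy of the dataset with the corrective transformation applied to one field."""
    return [
        {**row, field: corrective(row[field])} if field in row else row
        for row in dataset
    ]


# Assumed scenario: the upstream "ingest" subsystem starts emitting amounts
# in cents instead of dollars for the same probe input.
before = {"id": 1, "amount": 12.5}
after = {"id": 1, "amount": 1250}
topology = {"ingest": ["billing"]}
inputs = {"billing": [{"id": 1, "amount": 1250}]}

corrected = {}
for field in impacted_fields(before, after):
    for subsystem in downstream_of(topology, "ingest"):
        corrected[subsystem] = apply_correction(
            inputs[subsystem], field, lambda cents: cents / 100
        )
```

The comparison flags only the amount field, the topology lookup finds billing as the downstream consumer, and the correction converts its input back to dollars without touching the original input rows.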