Patents Assigned to StreamSets, Inc.
  • Patent number: 11200208
    Abstract: Systems and methods herein describe accessing an original change data capture (CDC) dataset comprising information describing changes to a source database, the original CDC dataset comprising a plurality of entries; identifying a first entry of the plurality of entries comprising a primary-key, a first operation and entry data; identifying a set of entries in the plurality of entries that includes the primary-key; comparing the first operation of the first entry with a second operation of a second entry in the set of entries; updating the first operation and the entry data based on the comparison; generating a new entry based on the updating of the first operation and the entry data; storing the new entry in a consolidated CDC dataset; and applying the consolidated CDC dataset to a target database.
    Type: Grant
    Filed: January 9, 2020
    Date of Patent: December 14, 2021
    Assignee: StreamSets, Inc.
    Inventor: Alejandro Humberto Abdelnur
  • Patent number: 11048673
    Abstract: In various example embodiments, a system, computer-readable medium and method are described for a schema update engine that dynamically updates a target data storage system. Incoming data records are received. A front-end schema of the incoming data records is identified. The front-end schema and the current target schema are compared. Based on identifying a difference between the front-end schema and the current target schema, the current target schema is updated to be identical to the front-end schema. The current target data file is closed and the incoming data records are stored in a new target data file according to the updated target schema.
    Type: Grant
    Filed: June 15, 2018
    Date of Patent: June 29, 2021
    Assignee: StreamSets, Inc.
    Inventors: Arvind Prabhakar, Alejandro Abdelnur, Madhukar Devaraju
  • Patent number: 10678660
    Abstract: In various example embodiments, a system, computer-readable medium and method to detect and dynamically correct a transformation drift in a data pipeline, the method comprising detecting a change in a transformation performed by an upstream subsystem of the data pipeline on a data field of an output dataset of the upstream subsystem; classifying the data field as an impacted data field; identifying, based on the topology information, a downstream subsystem of the data pipeline downstream of the upstream subsystem; identifying an input dataset of the downstream subsystem including the impacted data field; and performing a corrective transformation on the impacted data field of the input dataset of the downstream subsystem.
    Type: Grant
    Filed: June 26, 2018
    Date of Patent: June 9, 2020
    Assignee: StreamSets, Inc.
    Inventor: Rupal Jatinkumar Shah
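
Illustrative sketch (not taken from any of the patents above): the abstract of patent 11200208 describes consolidating a CDC dataset by grouping entries on their primary key, comparing operations pairwise, and emitting one new entry per key. The Python below shows one way such a consolidation could look. The `Entry` type, the `merge` and `consolidate` functions, the operation names INSERT/UPDATE/DELETE, and every merge rule are assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    """One CDC entry: primary key, operation, and changed column values."""
    key: int
    op: str                      # assumed to be "INSERT", "UPDATE" or "DELETE"
    data: dict = field(default_factory=dict)


def merge(first: Optional[Entry], second: Entry) -> Optional[Entry]:
    """Combine an earlier entry with a later one for the same primary key.

    Returns None when the two operations cancel out. All rules here are
    assumed for illustration only.
    """
    if first is None:
        # Nothing pending for this key: the later entry stands alone.
        return Entry(second.key, second.op, dict(second.data))
    pair = (first.op, second.op)
    if pair == ("INSERT", "UPDATE"):
        # Still a new row, carrying the latest column values.
        return Entry(first.key, "INSERT", {**first.data, **second.data})
    if pair == ("INSERT", "DELETE"):
        # Row was created and removed within the batch: nothing to apply.
        return None
    if pair == ("UPDATE", "UPDATE"):
        return Entry(first.key, "UPDATE", {**first.data, **second.data})
    if pair == ("UPDATE", "DELETE"):
        return Entry(first.key, "DELETE")
    if pair == ("DELETE", "INSERT"):
        # Assumption: the row existed before the batch, so the net effect
        # is treated as an update carrying the newly inserted values.
        return Entry(first.key, "UPDATE", dict(second.data))
    raise ValueError(f"unsupported operation sequence: {pair}")


def consolidate(entries: list[Entry]) -> list[Entry]:
    """Reduce the original CDC dataset to at most one entry per primary key."""
    merged: dict[int, Optional[Entry]] = {}
    for entry in entries:
        merged[entry.key] = merge(merged.get(entry.key), entry)
    # Keys whose operations cancelled out are dropped from the result.
    return [e for e in merged.values() if e is not None]


original = [
    Entry(1, "INSERT", {"name": "a"}),
    Entry(2, "UPDATE", {"qty": 5}),
    Entry(1, "UPDATE", {"name": "b"}),
    Entry(3, "INSERT", {"x": 1}),
    Entry(3, "DELETE"),
    Entry(2, "DELETE"),
]
consolidated = consolidate(original)
```

In this sketch the six original entries reduce to two: an INSERT for key 1 carrying the latest name, and a DELETE for key 2. Key 3 disappears because its INSERT and DELETE cancel. Applying the shorter list to a target database would then need fewer writes than replaying every original entry.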