Abstract: Data consistency across replicas in a cluster of nodes is maintained by continuously validating local data ranges and repairing any inconsistencies found. Local data ranges are split into segments and prioritized. After a segment is selected for validation, a hash value of a portion of the segment is compared to a hash value from other nodes storing replicas of that data. If the hash values match, the data is consistent. If they do not match, the data is inconsistent, and whichever copy is most current according to its timestamps is considered correct. If the local node's data is correct, it is communicated to the replica nodes so they can be updated. If the local node's data is not correct, the data from the replica nodes is correct and is used to update the local node. An alternative, incremental validation approach improves efficiency.
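The hash-compare-then-repair step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the in-memory `Segment` dictionaries, the SHA-256 digest, the integer write timestamps, and the function names `segment_hash` and `reconcile` are all assumptions made for illustration, and deletions (tombstones), segment prioritization, and the incremental variant are left out.

```python
import hashlib
from typing import Dict, Tuple

# Hypothetical model: a segment maps a row key to (value, write timestamp).
Row = Tuple[str, int]
Segment = Dict[str, Row]


def segment_hash(segment: Segment) -> str:
    """Hash the segment contents in key order so equal data gives equal digests."""
    h = hashlib.sha256()
    for key in sorted(segment):
        value, ts = segment[key]
        h.update(f"{key}\x00{value}\x00{ts}\x01".encode("utf-8"))
    return h.hexdigest()


def reconcile(local: Segment, replica: Segment) -> bool:
    """Return True if the segments already match; otherwise repair both in place.

    For each key, the copy with the newer timestamp is treated as correct
    and copied to the other side. Equal timestamps are left untouched.
    """
    if segment_hash(local) == segment_hash(replica):
        return True
    for key in set(local) | set(replica):
        l = local.get(key)
        r = replica.get(key)
        if l is None or (r is not None and r[1] > l[1]):
            local[key] = r      # replica copy is newer: update the local node
        elif r is None or l[1] > r[1]:
            replica[key] = l    # local copy is newer: update the replica
    return False


local = {"a": ("1", 10), "b": ("old", 5)}
replica = {"a": ("1", 10), "b": ("new", 7), "c": ("x", 1)}
was_consistent = reconcile(local, replica)
```

Comparing one digest per segment keeps the common, consistent case cheap; row-level comparison happens only when the digests differ.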
Abstract: Systems and methods are described herein for efficiently updating a secondary index associated with a log-structured merge-tree (LSM) database. A Global approximate member query (AMQ) Filter is queried to determine whether a primary key, retrieved from a list of LSM database updates, already exists in the LSM database. If the primary key does not already exist in the LSM database, then the read-before-write and delete operations typically performed by known approaches do not need to be performed on the secondary index in order to update it, thereby avoiding significant additional computer processing and input/output operations.
Type:
Grant
Filed:
April 25, 2018
Date of Patent:
September 15, 2020
Assignee:
DataStax
Inventors:
Jason John Rutherglen, Ariel David Weisberg
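As a rough illustration of the AMQ-filter idea in the abstract above, the following Python sketch uses a small Bloom filter as the approximate membership structure. The Bloom filter choice, the dictionary-based `primary` store and `index`, and the names `BloomFilter` and `apply_updates` are assumptions for illustration only; the patent's actual filter and storage layout are not described here.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p] = 1

    def might_contain(self, key):
        return all(self.bits[p] for p in self._positions(key))


def apply_updates(updates, primary, index, amq):
    """Apply (primary_key, indexed_value) updates; return the number of reads done.

    The read-before-write and stale-entry delete run only when the filter
    says the key may already exist.
    """
    reads = 0
    for pk, value in updates:
        if amq.might_contain(pk):
            reads += 1
            old = primary.get(pk)                    # read-before-write
            if old is not None and old != value:
                index.get(old, set()).discard(pk)    # delete the stale index entry
        primary[pk] = value
        index.setdefault(value, set()).add(pk)
        amq.add(pk)
    return reads


primary, index, amq = {}, {}, BloomFilter()
reads = apply_updates([("u1", "red"), ("u2", "red"), ("u1", "blue")], primary, index, amq)
```

A false positive only costs an unnecessary read; it never leaves a stale entry in the index, because the filter cannot miss a key that was actually added.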
Abstract: Data consistency across replicas in a cluster of nodes is maintained by continuously validating local data ranges and repairing any inconsistencies found. Local data ranges are split into segments and prioritized. After a segment is selected for validation, a hash value of a portion of the segment is compared to a hash value from other nodes storing replicas of that data. If the hash values match, the data is consistent. If they do not match, the data is inconsistent, and whichever copy is most current according to its timestamps is considered correct. If the local node's data is correct, it is communicated to the replica nodes so they can be updated. If the local node's data is not correct, the data from the replica nodes is correct and is used to update the local node.
Abstract: Various operations, functionalities and systems are described herein for backing up one or more nodes to an offsite location, restoring the one or more nodes from the offsite location, restoring the one or more nodes to a point-in-time (PIT) from the offsite location, cloning the one or more nodes from the offsite location, and cloning the one or more nodes to a PIT from the offsite location. Example operating contexts include one or more clusters of nodes running a NoSQL (Not only Structured Query Language) distributed database, with backup, restore and/or cloning performed on those clusters.
Type:
Grant
Filed:
January 28, 2015
Date of Patent:
September 3, 2019
Assignee:
DataStax
Inventors:
Nicholas M. Bailey, Michael Davis Bulman, Maxim Barnash, Peter James Halliday
Abstract: Various operations, functionalities and systems are described herein for backing up one or more nodes to an offsite location, restoring the one or more nodes from the offsite location, restoring the one or more nodes to a point-in-time (PIT) from the offsite location, cloning the one or more nodes from the offsite location, and cloning the one or more nodes to a PIT from the offsite location. Example operating contexts include one or more clusters of nodes running a NoSQL (Not only Structured Query Language) distributed database, with backup, restore and/or cloning performed on those clusters.
Type:
Grant
Filed:
January 28, 2015
Date of Patent:
September 3, 2019
Assignee:
DataStax
Inventors:
Nicholas M. Bailey, Michael Davis Bulman, Maxim Barnash, Peter James Halliday
Abstract: Fault-tolerant querying of data distributed across multiple nodes is accomplished by each node determining and reporting its own health status and indexing status to the other nodes in the cluster via a gossip protocol. A coordinator node then prioritizes replica nodes based on the received status of the other nodes and sends query requests to those nodes based on the prioritization. Should a node fail to provide a response to a query request, further query requests are sent to the next-highest-priority replica node containing the relevant data. This improves query performance by avoiding busy nodes and provides a fault-tolerant approach to data queries.
Type:
Grant
Filed:
September 16, 2015
Date of Patent:
February 19, 2019
Assignee:
DataStax
Inventors:
Sergio Bossa, Caleb William Rackliffe, Edward de Oliveira Ribeiro
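The coordinator behaviour in the abstract above (rank replicas by gossiped status, then fall back to the next replica on failure) might be sketched in Python as follows. Everything here is an assumption for illustration: the `ReplicaStatus` fields, the 0.0 to 1.0 health scale, the ranking rule that puts non-indexing nodes first, and the use of `TimeoutError` to model a missing response. The gossip exchange itself is not modelled.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ReplicaStatus:
    name: str
    health: float   # hypothetical scale: 0.0 (unhealthy) to 1.0 (healthy)
    indexing: bool  # True while the node reports that it is busy indexing


def prioritize(replicas: List[ReplicaStatus]) -> List[ReplicaStatus]:
    """Order replicas: nodes that are not indexing first, then by descending health."""
    return sorted(replicas, key=lambda r: (r.indexing, -r.health))


def query_with_failover(replicas: List[ReplicaStatus], send: Callable[[str], str]) -> str:
    """Try replicas in priority order, moving to the next one when a node does not respond."""
    for replica in prioritize(replicas):
        try:
            return send(replica.name)
        except TimeoutError:
            continue
    raise RuntimeError("no replica responded")


attempts = []


def fake_send(node: str) -> str:
    # Stand-in for a network call; node "n3" is simulated as unresponsive.
    attempts.append(node)
    if node == "n3":
        raise TimeoutError(node)
    return f"rows from {node}"


replicas = [
    ReplicaStatus("n1", 0.9, True),
    ReplicaStatus("n2", 0.5, False),
    ReplicaStatus("n3", 0.8, False),
]
result = query_with_failover(replicas, fake_send)
```

In this run the highest-ranked replica times out, so the request falls through to the second-ranked one, and the node that is busy indexing is never contacted.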