Patents by Inventor Dhrubajyoti Borthakur

Dhrubajyoti Borthakur has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Querying a specified data storage layer of a data storage system

Patent number: 10387416

Abstract: Technology is disclosed for retrieving data from a specific storage layer of a storage system (“the technology”). A query application programming interface (API) is provided that allows an application to specify a storage layer on which the query should be executed. The query API can be used in a multi-threaded environment which employs a combination of fast threads and slow threads to serve read/write requests from applications. The fast threads are configured to query on a first set of storage layers, e.g., storage layers in a primary storage, while the slow threads are configured to query on a second set of storage layers, e.g., storage layers in a secondary storage. If a fast thread does not find the requested data in the first set, the request is transferred to a slow thread and the fast thread is allocated to another request while the slow thread is serving the current request.

Type: Grant

Filed: November 14, 2013

Date of Patent: August 20, 2019

Assignee: Facebook, Inc.

Inventors: Mayank Agarwal, Dhrubajyoti Borthakur, Nagavamsi Ponnekanti, Haobo Xu
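
The fast-thread/slow-thread handoff described in the abstract of patent 10387416 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the dict-backed `LAYERS`, the `query` and `serve` functions, and the pool sizes.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Hypothetical layers: plain dicts stand in for primary and secondary storage.
LAYERS = {
    "primary": {"user:1": "alice"},
    "secondary": {"user:1": "alice", "user:2": "bob"},
}

fast_pool = ThreadPoolExecutor(max_workers=4)  # serves the first set of layers
slow_pool = ThreadPoolExecutor(max_workers=2)  # serves the second set of layers


def query(key, layer):
    """Query only the storage layer the caller names; None means a miss."""
    return LAYERS[layer].get(key)


def serve(key):
    """Start a lookup and return a Future that will hold the value (or None)."""
    result = Future()

    def slow_task():
        result.set_result(query(key, "secondary"))

    def fast_task():
        value = query(key, "primary")
        if value is not None:
            result.set_result(value)
        else:
            # Hand the request to a slow thread; this fast thread returns
            # right away and is free to take another request.
            slow_pool.submit(slow_task)

    fast_pool.submit(fast_task)
    return result
```

Calling `serve("user:2").result()` misses in the primary layer and is completed by a slow thread, while the fast thread that noticed the miss has already been released.
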
Atomic update operations in a data storage system

Patent number: 10346381

Abstract: Technology is disclosed for performing atomic update operations in a storage system (“the technology”). The technology can receive an update command to update a value associated with a key stored in the storage system as a function of an input value; store the input value in a log stored at the storage system without updating the value stored in the storage system; and update the value associated with the key with the received input value based on the function to generate an updated value, the updating occurring asynchronously with respect to receiving the update command.

Type: Grant

Filed: November 14, 2013

Date of Patent: July 9, 2019

Assignee: Facebook, Inc.

Inventors: Deon Chris Nicholas, Haobo Xu, Dhrubajyoti Borthakur
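
The log-then-apply flow in the abstract of patent 10346381 can be sketched as follows. The class `MergeStore`, its method names, the `default` starting value, and the choice to fold pending log entries into reads are all assumptions made for this illustration. The abstract says the stored value is updated asynchronously; here that step is an explicit `apply_log` call so the example stays deterministic.

```python
import operator


class MergeStore:
    """Toy key-value store that logs update operands instead of applying them."""

    def __init__(self, fn, default=0):
        self.fn = fn            # update function, e.g. operator.add
        self.default = default  # assumed starting value for unseen keys
        self.values = {}        # stored values
        self.log = []           # pending (key, operand) entries

    def update(self, key, operand):
        # Record the input value only; the stored value is left untouched.
        self.log.append((key, operand))

    def apply_log(self):
        # Deferred step: fold each logged operand into the stored value.
        for key, operand in self.log:
            current = self.values.get(key, self.default)
            self.values[key] = self.fn(current, operand)
        self.log.clear()

    def get(self, key):
        # Assumption: reads combine the stored value with pending operands.
        value = self.values.get(key, self.default)
        for logged_key, operand in self.log:
            if logged_key == key:
                value = self.fn(value, operand)
        return value
```

With `operator.add` as the function, two calls such as `update("hits", 2)` and `update("hits", 3)` leave `values` empty until `apply_log()` runs, after which `values["hits"]` is 5.
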
Processing a file system operation in a distributed file system

Patent number: 9904689

Abstract: Processing a file system operation is disclosed. An indication of a desired operation of a distributed file system is received. A metadata node for the desired operation is identified. It is indicated to the identified metadata node to process the desired operation. In the event the identified metadata node becomes not fully functional before the processing by the identified metadata node is confirmed, the distributed file system is analyzed to determine whether to indicate again to process the desired operation.

Type: Grant

Filed: July 13, 2012

Date of Patent: February 27, 2018

Assignee: Facebook, Inc.

Inventors: Dhrubajyoti Borthakur, Dmytro Molkov, Hairong Kuang
LSM cache

Patent number: 9846711

Abstract: A variety of methods for improving efficiency in a database system are provided. In one embodiment, a method may comprise: generating multiple levels of data according to how recently the data have been updated, whereby most recently updated data are assigned to the newest level; storing each level of data in a specific storage tier; splitting data stored in a particular storage tier into two or more groups according to access statistics of each specific data; during compaction, storing data from different groups in separate data blocks of the particular storage tier; and when a particular data in a specific data block is requested, reading the specific data block into a low-latency storage tier.

Type: Grant

Filed: December 28, 2012

Date of Patent: December 19, 2017

Assignee: Facebook, Inc.

Inventors: Dhrubajyoti Borthakur, Nagavamsi Ponnekanti, Jeffrey Rothschild
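
The grouping step in the abstract of patent 9846711 can be sketched as below: data is split by access statistics during compaction, and a whole block is read into a faster tier on request. The function names, the `hot_threshold` cutoff, the fixed `block_size`, and the dict used as the low-latency tier are assumptions for illustration only.

```python
def compact(entries, access_counts, hot_threshold, block_size):
    """Split entries into hot and cold groups and pack each group into its own blocks.

    entries: dict of key -> value; access_counts: dict of key -> read count.
    Returns (hot_blocks, cold_blocks); each block is a list of (key, value) pairs.
    """
    hot, cold = [], []
    for key in sorted(entries):
        group = hot if access_counts.get(key, 0) >= hot_threshold else cold
        group.append((key, entries[key]))

    def to_blocks(items):
        return [items[i:i + block_size] for i in range(0, len(items), block_size)]

    return to_blocks(hot), to_blocks(cold)


def read(key, blocks, cache):
    """Find the block holding key and load that whole block into the cache."""
    for block in blocks:
        if any(k == key for k, _ in block):
            cache.update(block)  # the entire block moves to the fast tier
            return cache[key]
    return None
```

Because hot keys share blocks, reading one hot key pulls its frequently read neighbours into the cache without dragging cold data along.
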
Automated failover of a metadata node in a distributed file system

Patent number: 9607001

Abstract: Switching an active metadata node is disclosed. An indication that a standby metadata node of a distributed file system should replace an active metadata node of the distributed file system as a new active metadata node of the distributed file system is received. The standby metadata node is included in a server. A request that indicates that the standby metadata node would like to become an exclusive metadata node writer of a transaction log is sent. A confirmation that the standby metadata node is the exclusive metadata node writer of the transaction log is received. Based at least in part on the confirmation, an update that the standby metadata node has become the new active metadata node of the distributed file system is provided.

Type: Grant

Filed: July 13, 2012

Date of Patent: March 28, 2017

Assignee: Facebook, Inc.

Inventors: Dhrubajyoti Borthakur, Dmytro Molkov, Hairong Kuang
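
The request/confirmation exchange in the abstract of patent 9607001 can be modelled with a small sketch. The abstract does not say how exclusivity is enforced; the epoch counter used here is a common fencing technique substituted for illustration. The names `TransactionLog`, `fail_over`, and `registry` are hypothetical.

```python
class TransactionLog:
    """Toy transaction log that accepts appends from one writer at a time."""

    def __init__(self):
        self.writer = None
        self.epoch = 0
        self.entries = []

    def request_exclusive_writer(self, node_id):
        # Granting the request bumps the epoch, which fences off the old writer.
        self.epoch += 1
        self.writer = node_id
        return self.epoch  # acts as the confirmation

    def append(self, node_id, epoch, entry):
        if (node_id, epoch) != (self.writer, self.epoch):
            raise PermissionError(f"{node_id} is not the exclusive writer")
        self.entries.append(entry)


def fail_over(log, registry, standby_id):
    """Promote the standby only after the log confirms it as exclusive writer."""
    epoch = log.request_exclusive_writer(standby_id)
    if log.writer == standby_id:
        registry["active"] = standby_id  # publish the new active node
    return epoch
```

After `fail_over` runs, an append from the former active node with its old epoch raises `PermissionError`, so a node that has not noticed the switch cannot write to the log.
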
Use of incremental checkpoints to restore user data stream processes

Patent number: 9471436

Abstract: A method and system for failure recovery in a storage system are disclosed. In the storage system, user data streams (e.g., log data) are collected by a scribeh system. The scribeh system may include a plurality of Calligraphus servers, HDFS and ZooKeeper. The Calligraphus servers may shard the user data streams based on keys (e.g., category and bucket pairs) and stream the user data streams to Puma nodes. Sharded user data streams may be aggregated according to the keys in memory of a specific Puma node. Periodically, aggregated user data streams cached in memory of the specific Puma node, together with an incremental checkpoint, are persisted to HBase. When a specific process on the specific Puma node fails, Ptail retrieves the incremental checkpoint from HBase and then restores the specific process by requesting user data streams processed by the specific process from the scribeh system according to the incremental checkpoint.

Type: Grant

Filed: April 23, 2013

Date of Patent: October 18, 2016

Inventors: Samuel Rash, Dhrubajyoti Borthakur, Prakash Khemani, Zheng Shao
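
The recovery path in the abstract of patent 9471436 can be illustrated with a minimal sketch: in-memory aggregates are persisted together with a stream position, and a restarted process resumes from that position. The `Aggregator` class, the dict standing in for the durable store, the list standing in for the data stream, and the checkpoint interval are assumptions for illustration. None of this uses the actual Puma, Ptail, or HBase interfaces.

```python
from collections import Counter


class Aggregator:
    """Counts stream records per key and checkpoints counts plus stream offset."""

    def __init__(self, store):
        self.store = store  # stand-in for the durable store
        saved = store.get("checkpoint", {"offset": 0, "counts": {}})
        self.offset = saved["offset"]            # next record to process
        self.counts = Counter(saved["counts"])   # aggregates restored from checkpoint

    def checkpoint(self):
        # Persist aggregates and position together so they stay consistent.
        self.store["checkpoint"] = {
            "offset": self.offset,
            "counts": dict(self.counts),
        }

    def consume(self, stream, limit=None, checkpoint_every=2):
        end = len(stream) if limit is None else min(limit, len(stream))
        while self.offset < end:
            self.counts[stream[self.offset]] += 1
            self.offset += 1
            if self.offset % checkpoint_every == 0:
                self.checkpoint()
```

If a process stops after three records with the last checkpoint at offset 2, a new `Aggregator` built on the same store starts at offset 2 and reprocesses only the records after the checkpoint.
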
Scalable data structures

Patent number: 9411840

Abstract: The technology is directed to providing sequential access to data using scalable data structures. In some embodiments, the scalable data structures include a first data structure, e.g., hash map, and a second data structure, e.g., tree data structure (“tree”). The technology receives multiple key-value pairs representing data associated with an application. A key includes a prefix and a suffix. While the suffixes of the keys are distinct, some of the keys can have the same prefix. The technology stores the keys having the same prefix in a tree, and stores the root node of the tree in the first data structure. To retrieve values of a set of input keys with a given prefix, the technology retrieves a root node of a tree corresponding to the given prefix from the first data structure using the given prefix, and traverses the tree to obtain the values in a sequence.

Type: Grant

Filed: April 10, 2014

Date of Patent: August 9, 2016

Assignee: Facebook, Inc.

Inventors: Wei Chen, Dhrubajyoti Borthakur
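
The two-level structure in the abstract of patent 9411840 can be sketched as a hash map from prefix to the root of a tree keyed by suffix. An unbalanced binary search tree is used here only to keep the example short. The abstract says "tree" without naming a kind, and the names `Node`, `PrefixIndex`, `put`, and `scan` are assumptions for illustration.

```python
class Node:
    def __init__(self, suffix, value):
        self.suffix, self.value = suffix, value
        self.left = self.right = None


def insert(root, suffix, value):
    """Insert into the tree rooted at root and return the (possibly new) root."""
    if root is None:
        return Node(suffix, value)
    if suffix < root.suffix:
        root.left = insert(root.left, suffix, value)
    elif suffix > root.suffix:
        root.right = insert(root.right, suffix, value)
    else:
        root.value = value  # same key: overwrite
    return root


def in_order(root):
    """Yield (suffix, value) pairs in ascending suffix order."""
    if root is not None:
        yield from in_order(root.left)
        yield root.suffix, root.value
        yield from in_order(root.right)


class PrefixIndex:
    def __init__(self):
        self.roots = {}  # first data structure: hash map of prefix -> tree root

    def put(self, prefix, suffix, value):
        self.roots[prefix] = insert(self.roots.get(prefix), suffix, value)

    def scan(self, prefix):
        # One hash lookup finds the tree; a traversal returns values in sequence.
        return list(in_order(self.roots.get(prefix)))
```

A call such as `scan("user1")` touches only the tree for that prefix, so keys under other prefixes never enter the traversal.
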
SCALABLE DATA STRUCTURES

Publication number: 20150293958

Abstract: The technology is directed to providing sequential access to data using scalable data structures. In some embodiments, the scalable data structures include a first data structure, e.g., hash map, and a second data structure, e.g., tree data structure (“tree”). The technology receives multiple key-value pairs representing data associated with an application. A key includes a prefix and a suffix. While the suffixes of the keys are distinct, some of the keys can have the same prefix. The technology stores the keys having the same prefix in a tree, and stores the root node of the tree in the first data structure. To retrieve values of a set of input keys with a given prefix, the technology retrieves a root node of a tree corresponding to the given prefix from the first data structure using the given prefix, and traverses the tree to obtain the values in a sequence.

Type: Application

Filed: April 10, 2014

Publication date: October 15, 2015

Inventors: Wei Chen, Dhrubajyoti Borthakur
QUERYING A SPECIFIED DATA STORAGE LAYER OF A DATA STORAGE SYSTEM

Publication number: 20150134692

Abstract: Technology is disclosed for retrieving data from a specific storage layer of a storage system (“the technology”). A query application programming interface (API) is provided that allows an application to specify a storage layer on which the query should be executed. The query API can be used in a multi-threaded environment which employs a combination of fast threads and slow threads to serve read/write requests from applications. The fast threads are configured to query on a first set of storage layers, e.g., storage layers in a primary storage, while the slow threads are configured to query on a second set of storage layers, e.g., storage layers in a secondary storage. If a fast thread does not find the requested data in the first set, the request is transferred to a slow thread and the fast thread is allocated to another request while the slow thread is serving the current request.

Type: Application

Filed: November 14, 2013

Publication date: May 14, 2015

Inventors: Mayank Agarwal, Dhrubajyoti Borthakur, Nagavamsi Ponnekanti, Haobo Xu
ATOMIC UPDATE OPERATIONS IN A DATA STORAGE SYSTEM

Publication number: 20150134602

Abstract: Technology is disclosed for performing atomic update operations in a storage system (“the technology”). The technology can receive an update command to update a value associated with a key stored in the storage system as a function of an input value; store the input value in a log stored at the storage system without updating the value stored in the storage system; and update the value associated with the key with the received input value based on the function to generate an updated value, the updating occurring asynchronously with respect to receiving the update command.

Type: Application

Filed: November 14, 2013

Publication date: May 14, 2015

Inventors: Deon Chris Nicholas, Haobo Xu, Dhrubajyoti Borthakur
INCREMENTAL CHECKPOINTS

Publication number: 20140317448

Abstract: A method and system for failure recovery in a storage system are disclosed. In the storage system, user data streams (e.g., log data) are collected by a scribeh system. The scribeh system may include a plurality of Calligraphus servers, HDFS and ZooKeeper. The Calligraphus servers may shard the user data streams based on keys (e.g., category and bucket pairs) and stream the user data streams to Puma nodes. Sharded user data streams may be aggregated according to the keys in memory of a specific Puma node. Periodically, aggregated user data streams cached in memory of the specific Puma node, together with an incremental checkpoint, are persisted to HBase. When a specific process on the specific Puma node fails, Ptail retrieves the incremental checkpoint from HBase and then restores the specific process by requesting user data streams processed by the specific process from the scribeh system according to the incremental checkpoint.

Type: Application

Filed: April 23, 2013

Publication date: October 23, 2014

Applicant: Facebook, Inc.

Inventors: Samuel Rash, Dhrubajyoti Borthakur, Prakash Khemani, Zheng Shao
DATA STREAM SPLITTING FOR LOW-LATENCY DATA ACCESS

Publication number: 20140214752

Abstract: Techniques for facilitating and accelerating log data processing by splitting data streams are disclosed herein. The front-end clusters generate a large amount of log data in real time and transfer the log data to an aggregating cluster. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further splits the log data into a plurality of data streams so that the data streams are sent to a receiving application in parallel. In one embodiment, the log data are randomly split to ensure the log data are evenly distributed in the split data streams. In another embodiment, the application that receives the split data streams determines how to split the log data.

Type: Application

Filed: January 31, 2013

Publication date: July 31, 2014

Inventors: Samuel Rash, Dhrubajyoti Borthakur, Zheng Shao, Eric Hwang
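
The two splitting embodiments in the abstract of publication 20140214752 can be sketched as follows: a random split for even distribution, and a split chosen by the receiving application. The function names, the list-of-lists representation of streams, and the `bucket_of` callback are assumptions for illustration.

```python
import random


def split_random(records, n, rng=None):
    """Assign each record to one of n streams uniformly at random."""
    rng = rng or random.Random()
    streams = [[] for _ in range(n)]
    for record in records:
        streams[rng.randrange(n)].append(record)
    return streams


def split_by(records, n, bucket_of):
    """Let the receiving application pick the stream via bucket_of(record)."""
    streams = [[] for _ in range(n)]
    for record in records:
        streams[bucket_of(record) % n].append(record)
    return streams
```

The random split balances load only in expectation, while `split_by` keeps all records with the same bucket in one stream, which is useful when a consumer needs every record for a given key.
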
LSM CACHE

Publication number: 20140188870

Abstract: A variety of methods for improving efficiency in a database system are provided. In one embodiment, a method may comprise: generating multiple levels of data according to how recently the data have been updated, whereby most recently updated data are assigned to the newest level; storing each level of data in a specific storage tier; splitting data stored in a particular storage tier into two or more groups according to access statistics of each specific data; during compaction, storing data from different groups in separate data blocks of the particular storage tier; and when a particular data in a specific data block is requested, reading the specific data block into a low-latency storage tier.

Type: Application

Filed: December 28, 2012

Publication date: July 3, 2014

Inventors: Dhrubajyoti Borthakur, Nagavamsi Ponnekanti, Jeffrey Rothschild
Distributed system for fault-tolerant data storage

Patent number: 8751897

Abstract: Fault-tolerant storage is provided using a distributed data storage system that receives input data from clients and divides that data into data blocks for storage. The data blocks are processed using a coding scheme that generates redundant level one error correction blocks (L1EC Blocks). The L1EC blocks enable the reconstruction of one or more damaged or inaccessible data blocks, and the L1EC blocks and the data blocks are divided into distribution sets and stored at a plurality of data storage locations. At each data storage location additional level two error correction blocks (L2EC blocks) are generated that provide local data redundancy. Upon detecting a data disruption event, an inaccessible data storage location is identified and the elements that were stored at the inaccessible data storage location are reconstructed.

Type: Grant

Filed: October 18, 2013

Date of Patent: June 10, 2014

Assignee: Facebook, Inc.

Inventors: Dhrubajyoti Borthakur, Per Brashers, Jason Matthew Taylor
Distributed System for Fault-Tolerant Data Storage

Publication number: 20140047266

Abstract: Fault-tolerant storage is provided using a distributed data storage system that receives input data from clients and divides that data into data blocks for storage. The data blocks are processed using a coding scheme that generates redundant level one error correction blocks (L1EC Blocks). The L1EC blocks enable the reconstruction of one or more damaged or inaccessible data blocks, and the L1EC blocks and the data blocks are divided into distribution sets and stored at a plurality of data storage locations. At each data storage location additional level two error correction blocks (L2EC blocks) are generated that provide local data redundancy. Upon detecting a data disruption event, an inaccessible data storage location is identified and the elements that were stored at the inaccessible data storage location are reconstructed.

Type: Application

Filed: October 18, 2013

Publication date: February 13, 2014

Applicant: Facebook, Inc.

Inventors: Dhrubajyoti Borthakur, Per Brashers, Jason Matthew Taylor
PROCESSING A FILE SYSTEM OPERATION IN A DISTRIBUTED FILE SYSTEM

Publication number: 20140019495

Abstract: Processing a file system operation is disclosed. An indication of a desired operation of a distributed file system is received. A metadata node for the desired operation is identified. It is indicated to the identified metadata node to process the desired operation. In the event the identified metadata node becomes not fully functional before the processing by the identified metadata node is confirmed, the distributed file system is analyzed to determine whether to indicate again to process the desired operation.

Type: Application

Filed: July 13, 2012

Publication date: January 16, 2014

Inventors: Dhrubajyoti Borthakur, Dmytro Molkov, Hairong Kuang
AUTOMATED FAILOVER OF A METADATA NODE IN A DISTRIBUTED FILE SYSTEM

Publication number: 20140019405

Abstract: Switching an active metadata node is disclosed. An indication that a standby metadata node of a distributed file system should replace an active metadata node of the distributed file system as a new active metadata node of the distributed file system is received. The standby metadata node is included in a server. A request that indicates that the standby metadata node would like to become an exclusive metadata node writer of a transaction log is sent. A confirmation that the standby metadata node is the exclusive metadata node writer of the transaction log is received. Based at least in part on the confirmation, an update that the standby metadata node has become the new active metadata node of the distributed file system is provided.

Type: Application

Filed: July 13, 2012

Publication date: January 16, 2014

Inventors: Dhrubajyoti Borthakur, Dmytro Molkov, Hairong Kuang
Distributed system for fault-tolerant data storage

Patent number: 8595586

Abstract: Fault-tolerant storage is provided using a distributed data storage system that receives input data from clients and divides that data into data blocks for storage. The data blocks are processed using a coding scheme that generates redundant level one error correction blocks (L1EC Blocks). The L1EC blocks enable the reconstruction of one or more damaged or inaccessible data blocks, so long as sufficient undamaged elements are still accessible. The L1EC blocks and the data blocks are divided into distribution sets and these sets are stored at a plurality of data storage locations. At each data storage location additional level two error correction blocks (L2EC blocks) are generated that provide local data redundancy. The L2EC blocks enable reconstruction of damaged elements at a data storage location without requiring communication with the other data storage locations.

Type: Grant

Filed: April 25, 2012

Date of Patent: November 26, 2013

Assignee: Facebook, Inc.

Inventors: Dhrubajyoti Borthakur, Per Brashers, Jason Matthew Taylor
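
The two levels of redundancy in the abstract of patent 8595586 can be illustrated with single-parity XOR. This is a deliberately simple stand-in: the abstract does not name the coding scheme, and XOR parity can rebuild only one missing block per group. The helper `xor_blocks` and the sample blocks are assumptions for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks; also rebuilds one missing block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Level one: parity computed across the data blocks before distribution.
data = [b"AAAA", b"BBBB", b"CCCC"]
l1ec = xor_blocks(data)

# Level two: one storage location holds two elements plus local parity.
location = [data[0], data[1]]
l2ec = xor_blocks(location)

# Local repair: data[1] is damaged and is rebuilt from what the location holds.
repaired_locally = xor_blocks([data[0], l2ec])

# Cross-location repair: data[2] is lost and is rebuilt from the survivors
# and the level-one parity block.
repaired_remotely = xor_blocks([data[0], data[1], l1ec])
```

The local repair reads only `data[0]` and `l2ec`, both stored at the same location, which mirrors the abstract's point about reconstruction without communicating with other locations.
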
Distributed System for Fault-Tolerant Data Storage

Publication number: 20130290805

Abstract: Fault-tolerant storage is provided using a distributed data storage system that receives input data from clients and divides that data into data blocks for storage. The data blocks are processed using a coding scheme that generates redundant level one error correction blocks (L1EC Blocks). The L1EC blocks enable the reconstruction of one or more damaged or inaccessible data blocks, so long as sufficient undamaged elements are still accessible. The L1EC blocks and the data blocks are divided into distribution sets and these sets are stored at a plurality of data storage locations. At each data storage location additional level two error correction blocks (L2EC blocks) are generated that provide local data redundancy. The L2EC blocks enable reconstruction of damaged elements at a data storage location without requiring communication with the other data storage locations.

Type: Application

Filed: April 25, 2012

Publication date: October 31, 2013

Inventors: Dhrubajyoti Borthakur, Per Brashers, Jason Matthew Taylor
System and method for generating extensible file system metadata

Patent number: 8484257

Abstract: A system and method for generating extensible file system metadata. In one embodiment, the system may include a storage device configured to store data and a file system configured to manage access to the storage device and to store file system content. The file system may be further configured to detect a file system content access event, and in response to detecting the file system content access event, to generate a metadata record, where the metadata record is stored in an extensible, self-describing data format.

Type: Grant

Filed: June 7, 2004

Date of Patent: July 9, 2013

Assignee: Symantec Operating Corporation

Inventors: Dhrubajyoti Borthakur, Nur Premo
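
The event-driven record generation in the abstract of patent 8484257 can be sketched as below. JSON is used as an example of an extensible, self-describing format; the abstract does not say which format the patent uses. The function `on_access` and its field names are assumptions for illustration.

```python
import json


def on_access(path, operation, user, **extra):
    """Build a metadata record for one file system content access event.

    Each field carries its own name, so readers that do not know a newly
    added field can skip it without breaking.
    """
    record = {"path": path, "operation": operation, "user": user}
    record.update(extra)  # extension fields supplied by the caller
    return json.dumps(record, sort_keys=True)
```

A call such as `on_access("/docs/a.txt", "read", "u1", app="editor")` yields a record with the three core fields plus an `app` field that older consumers can ignore.
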
