Patents by Inventor Dhrubajyoti Borthakur

Dhrubajyoti Borthakur has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10387416
    Abstract: Technology is disclosed for retrieving data from a specific storage layer of a storage system (“the technology”). A query application programming interface (API) is provided that allows an application to specify a storage layer on which the query should be executed. The query API can be used in a multi-threaded environment which employs a combination of fast threads and slow threads to serve read/write requests from applications. The fast threads are configured to query on a first set of storage layers, e.g., storage layers in a primary storage, while the slow threads are configured to query on a second set of storage layers, e.g., storage layers in a secondary storage. If a fast thread does not find the requested data in the first set, the request is transferred to a slow thread and the fast thread is allocated to another request while the slow thread is serving the current request.
    Type: Grant
    Filed: November 14, 2013
    Date of Patent: August 20, 2019
    Assignee: Facebook, Inc.
    Inventors: Mayank Agarwal, Dhrubajyoti Borthakur, Nagavamsi Ponnekanti, Haobo Xu
  • Patent number: 10346381
    Abstract: Technology is disclosed for performing atomic update operations in a storage system (“the technology”). The technology can receive an update command to update a value associated with a key stored in the storage system as a function of an input value; store the input value in a log stored at the storage system without updating the value stored in the storage system; and update the value associated with the key with the received input value based on the function to generate an updated value, the updating occurring asynchronously with respect to receiving the update command.
    Type: Grant
    Filed: November 14, 2013
    Date of Patent: July 9, 2019
    Assignee: Facebook, Inc.
    Inventors: Deon Chris Nicholas, Haobo Xu, Dhrubajyoti Borthakur
  • Patent number: 9904689
    Abstract: Processing a file system operation is disclosed. An indication of a desired operation of a distributed file system is received. A metadata node for the desired operation is identified. It is indicated to the identified metadata node to process the desired operation. In the event the identified metadata node becomes not fully functional before the processing by the identified metadata node is confirmed, the distributed file system is analyzed to determine whether to indicate again to process the desired operation.
    Type: Grant
    Filed: July 13, 2012
    Date of Patent: February 27, 2018
    Assignee: Facebook, Inc.
    Inventors: Dhrubajyoti Borthakur, Dmytro Molkov, Hairong Kuang
  • Patent number: 9846711
    Abstract: A variety of methods for improving efficiency in a database system are provided. In one embodiment, a method may comprise: generating multiple levels of data according to how recently the data have been updated, whereby most recently updated data are assigned to the newest level; storing each level of data in a specific storage tier; splitting data stored in a particular storage tier into two or more groups according to access statistics of each specific data; during compaction, storing data from different groups in separate data blocks of the particular storage tier; and when a particular data in a specific data block is requested, reading the specific data block into a low-latency storage tier.
    Type: Grant
    Filed: December 28, 2012
    Date of Patent: December 19, 2017
    Assignee: Facebook, Inc.
    Inventors: Dhrubajyoti Borthakur, Nagavamsi Ponnekanti, Jeffrey Rothschild
  • Patent number: 9607001
    Abstract: Switching an active metadata node is disclosed. An indication that a standby metadata node of a distributed file system should replace an active metadata node of the distributed file system as a new active metadata node of the distributed file system is received. The standby metadata node is included in a server. A request that indicates that the standby metadata node would like to become an exclusive metadata node writer of a transaction log is sent. A confirmation that the standby metadata node is the exclusive metadata node writer of the transaction log is received. Based at least in part on the confirmation, an update that the standby metadata node has become the new active metadata node of the distributed file system is provided.
    Type: Grant
    Filed: July 13, 2012
    Date of Patent: March 28, 2017
    Assignee: Facebook, Inc.
    Inventors: Dhrubajyoti Borthakur, Dmytro Molkov, Hairong Kuang
  • Patent number: 9471436
    Abstract: A method and system for failure recovery in a storage system are disclosed. In the storage system, user data streams (e.g., log data) are collected by a scribeh system. The scribeh system may include a plurality of Calligraphus servers, HDFS and Zookeeper. The Calligraphus servers may shard the user data streams based on keys (e.g., category and bucket pairs) and stream the user data streams to Puma nodes. Sharded user data streams may be aggregated according to the keys in memory of a specific Puma node. Periodically, aggregated user data streams cached in memory of the specific Puma node, together with an Incremental checkpoint, are persisted to HBase. When a specific process on the specific Puma node fails, Ptail retrieves the Incremental checkpoint from HBase and then restores the specific process by requesting user data streams processed by the specific process from the scribeh system according to the Incremental checkpoint.
    Type: Grant
    Filed: April 23, 2013
    Date of Patent: October 18, 2016
    Inventors: Samuel Rash, Dhrubajyoti Borthakur, Prakash Khemani, Zheng Shao
  • Patent number: 9411840
    Abstract: The technology is directed to providing sequential access to data using scalable data structures. In some embodiments, the scalable data structures include a first data structure, e.g., hash map, and a second data structure, e.g., tree data structure (“tree”). The technology receives multiple key-value pairs representing data associated with an application. A key includes a prefix and a suffix. While the suffixes of the keys are distinct, some of the keys can have the same prefix. The technology stores the keys having the same prefix in a tree, and stores the root node of the tree in the first data structure. To retrieve values of a set of input keys with a given prefix, the technology retrieves a root node of a tree corresponding to the given prefix from the first data structure using the given prefix, and traverses the tree to obtain the values in a sequence.
    Type: Grant
    Filed: April 10, 2014
    Date of Patent: August 9, 2016
    Assignee: Facebook, Inc.
    Inventors: Wei Chen, Dhrubajyoti Borthakur
  • Publication number: 20150293958
    Abstract: The technology is directed to providing sequential access to data using scalable data structures. In some embodiments, the scalable data structures include a first data structure, e.g., hash map, and a second data structure, e.g., tree data structure (“tree”). The technology receives multiple key-value pairs representing data associated with an application. A key includes a prefix and a suffix. While the suffixes of the keys are distinct, some of the keys can have the same prefix. The technology stores the keys having the same prefix in a tree, and stores the root node of the tree in the first data structure. To retrieve values of a set of input keys with a given prefix, the technology retrieves a root node of a tree corresponding to the given prefix from the first data structure using the given prefix, and traverses the tree to obtain the values in a sequence.
    Type: Application
    Filed: April 10, 2014
    Publication date: October 15, 2015
    Inventors: Wei Chen, Dhrubajyoti Borthakur
  • Publication number: 20150134692
    Abstract: Technology is disclosed for retrieving data from a specific storage layer of a storage system (“the technology”). A query application programming interface (API) is provided that allows an application to specify a storage layer on which the query should be executed. The query API can be used in a multi-threaded environment which employs a combination of fast threads and slow threads to serve read/write requests from applications. The fast threads are configured to query on a first set of storage layers, e.g., storage layers in a primary storage, while the slow threads are configured to query on a second set of storage layers, e.g., storage layers in a secondary storage. If a fast thread does not find the requested data in the first set, the request is transferred to a slow thread and the fast thread is allocated to another request while the slow thread is serving the current request.
    Type: Application
    Filed: November 14, 2013
    Publication date: May 14, 2015
    Inventors: Mayank Agarwal, Dhrubajyoti Borthakur, Nagavamsi Ponnekanti, Haobo Xu
  • Publication number: 20150134602
    Abstract: Technology is disclosed for performing atomic update operations in a storage system (“the technology”). The technology can receive an update command to update a value associated with a key stored in the storage system as a function of an input value; store the input value in a log stored at the storage system without updating the value stored in the storage system; and update the value associated with the key with the received input value based on the function to generate an updated value, the updating occurring asynchronously with respect to receiving the update command.
    Type: Application
    Filed: November 14, 2013
    Publication date: May 14, 2015
    Inventors: Deon Chris Nicholas, Haobo Xu, Dhrubajyoti Borthakur
  • Publication number: 20140317448
    Abstract: A method and system for failure recovery in a storage system are disclosed. In the storage system, user data streams (e.g., log data) are collected by a scribeh system. The scribeh system may include a plurality of Calligraphus servers, HDFS and Zookeeper. The Calligraphus servers may shard the user data streams based on keys (e.g., category and bucket pairs) and stream the user data streams to Puma nodes. Sharded user data streams may be aggregated according to the keys in memory of a specific Puma node. Periodically, aggregated user data streams cached in memory of the specific Puma node, together with an Incremental checkpoint, are persisted to HBase. When a specific process on the specific Puma node fails, Ptail retrieves the Incremental checkpoint from HBase and then restores the specific process by requesting user data streams processed by the specific process from the scribeh system according to the Incremental checkpoint.
    Type: Application
    Filed: April 23, 2013
    Publication date: October 23, 2014
    Applicant: Facebook, Inc.
    Inventors: Samuel Rash, Dhrubajyoti Borthakur, Prakash Khemani, Zheng Shao
  • Publication number: 20140214752
    Abstract: Techniques for facilitating and accelerating log data processing by splitting data streams are disclosed herein. Front-end clusters generate a large amount of log data in real time and transfer the log data to an aggregating cluster. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further splits the log data into a plurality of data streams so that the data streams are sent to a receiving application in parallel. In one embodiment, the log data are randomly split to ensure the log data are evenly distributed in the split data streams. In another embodiment, the application that receives the split data streams determines how to split the log data.
    Type: Application
    Filed: January 31, 2013
    Publication date: July 31, 2014
    Inventors: Samuel Rash, Dhrubajyoti Borthakur, Zheng Shao, Eric Hwang
  • Publication number: 20140188870
    Abstract: A variety of methods for improving efficiency in a database system are provided. In one embodiment, a method may comprise: generating multiple levels of data according to how recently the data have been updated, whereby most recently updated data are assigned to the newest level; storing each level of data in a specific storage tier; splitting data stored in a particular storage tier into two or more groups according to access statistics of each specific data; during compaction, storing data from different groups in separate data blocks of the particular storage tier; and when a particular data in a specific data block is requested, reading the specific data block into a low-latency storage tier.
    Type: Application
    Filed: December 28, 2012
    Publication date: July 3, 2014
    Inventors: Dhrubajyoti Borthakur, Nagavamsi Ponnekanti, Jeffrey Rothschild
  • Patent number: 8751897
    Abstract: Fault-tolerant storage is provided using a distributed data storage system that receives input data from clients and divides that data into data blocks for storage. The data blocks are processed using a coding scheme that generates redundant level one error correction blocks (L1EC Blocks). The L1EC blocks enable the reconstruction of one or more damaged or inaccessible data blocks, and the L1EC blocks and the data blocks are divided into distribution sets and stored at a plurality of data storage locations. At each data storage location additional level two error correction blocks (L2EC blocks) are generated that provide local data redundancy. Upon detecting a data disruption event, an inaccessible data storage location is identified and the elements that were stored at the inaccessible data storage location are reconstructed.
    Type: Grant
    Filed: October 18, 2013
    Date of Patent: June 10, 2014
    Assignee: Facebook, Inc.
    Inventors: Dhrubajyoti Borthakur, Per Brashers, Jason Matthew Taylor
  • Publication number: 20140047266
    Abstract: Fault-tolerant storage is provided using a distributed data storage system that receives input data from clients and divides that data into data blocks for storage. The data blocks are processed using a coding scheme that generates redundant level one error correction blocks (L1EC Blocks). The L1EC blocks enable the reconstruction of one or more damaged or inaccessible data blocks, and the L1EC blocks and the data blocks are divided into distribution sets and stored at a plurality of data storage locations. At each data storage location additional level two error correction blocks (L2EC blocks) are generated that provide local data redundancy. Upon detecting a data disruption event, an inaccessible data storage location is identified and the elements that were stored at the inaccessible data storage location are reconstructed.
    Type: Application
    Filed: October 18, 2013
    Publication date: February 13, 2014
    Applicant: Facebook, Inc.
    Inventors: Dhrubajyoti Borthakur, Per Brashers, Jason Matthew Taylor
  • Publication number: 20140019495
    Abstract: Processing a file system operation is disclosed. An indication of a desired operation of a distributed file system is received. A metadata node for the desired operation is identified. It is indicated to the identified metadata node to process the desired operation. In the event the identified metadata node becomes not fully functional before the processing by the identified metadata node is confirmed, the distributed file system is analyzed to determine whether to indicate again to process the desired operation.
    Type: Application
    Filed: July 13, 2012
    Publication date: January 16, 2014
    Inventors: Dhrubajyoti Borthakur, Dmytro Molkov, Hairong Kuang
  • Publication number: 20140019405
    Abstract: Switching an active metadata node is disclosed. An indication that a standby metadata node of a distributed file system should replace an active metadata node of the distributed file system as a new active metadata node of the distributed file system is received. The standby metadata node is included in a server. A request that indicates that the standby metadata node would like to become an exclusive metadata node writer of a transaction log is sent. A confirmation that the standby metadata node is the exclusive metadata node writer of the transaction log is received. Based at least in part on the confirmation, an update that the standby metadata node has become the new active metadata node of the distributed file system is provided.
    Type: Application
    Filed: July 13, 2012
    Publication date: January 16, 2014
    Inventors: Dhrubajyoti Borthakur, Dmytro Molkov, Hairong Kuang
  • Patent number: 8595586
    Abstract: Fault-tolerant storage is provided using a distributed data storage system that receives input data from clients and divides that data into data blocks for storage. The data blocks are processed using a coding scheme that generates redundant level one error correction blocks (L1EC Blocks). The L1EC blocks enable the reconstruction of one or more damaged or inaccessible data blocks, so long as sufficient undamaged elements are still accessible. The L1EC blocks and the data blocks are divided into distribution sets and these sets are stored at a plurality of data storage locations. At each data storage location additional level two error correction blocks (L2EC blocks) are generated that provide local data redundancy. The L2EC blocks enable reconstruction of damaged elements at a data storage location without requiring communication with the other data storage locations.
    Type: Grant
    Filed: April 25, 2012
    Date of Patent: November 26, 2013
    Assignee: Facebook, Inc.
    Inventors: Dhrubajyoti Borthakur, Per Brashers, Jason Matthew Taylor
  • Publication number: 20130290805
    Abstract: Fault-tolerant storage is provided using a distributed data storage system that receives input data from clients and divides that data into data blocks for storage. The data blocks are processed using a coding scheme that generates redundant level one error correction blocks (L1EC Blocks). The L1EC blocks enable the reconstruction of one or more damaged or inaccessible data blocks, so long as sufficient undamaged elements are still accessible. The L1EC blocks and the data blocks are divided into distribution sets and these sets are stored at a plurality of data storage locations. At each data storage location additional level two error correction blocks (L2EC blocks) are generated that provide local data redundancy. The L2EC blocks enable reconstruction of damaged elements at a data storage location without requiring communication with the other data storage locations.
    Type: Application
    Filed: April 25, 2012
    Publication date: October 31, 2013
    Inventors: Dhrubajyoti Borthakur, Per Brashers, Jason Matthew Taylor
  • Patent number: 8484257
    Abstract: A system and method for generating extensible file system metadata. In one embodiment, the system may include a storage device configured to store data and a file system configured to manage access to the storage device and to store file system content. The file system may be further configured to detect a file system content access event, and in response to detecting the file system content access event, to generate a metadata record, where the metadata record is stored in an extensible, self-describing data format.
    Type: Grant
    Filed: June 7, 2004
    Date of Patent: July 9, 2013
    Assignee: Symantec Operating Corporation
    Inventors: Dhrubajyoti Borthakur, Nur Premo
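
The idea in the abstract of patent 8484257 above (detect a file system content access event, then generate a metadata record in an extensible, self-describing format) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the `MetadataFileSystem` class, its method names, the in-memory storage, and the choice of JSON as the self-describing format.

```python
import json
import time
from dataclasses import dataclass, field


@dataclass
class MetadataFileSystem:
    """Hypothetical toy in-memory file system that emits one metadata
    record per content access event."""
    files: dict = field(default_factory=dict)
    records: list = field(default_factory=list)

    def _emit(self, event, path, **extra):
        # Each record carries its own field names, so readers need no
        # fixed schema; new fields can be added without breaking old ones.
        record = {"event": event, "path": path, "timestamp": time.time()}
        record.update(extra)
        self.records.append(json.dumps(record, sort_keys=True))

    def write(self, path, data):
        self.files[path] = data
        self._emit("write", path, size=len(data))

    def read(self, path):
        data = self.files[path]
        self._emit("read", path)
        return data


fs = MetadataFileSystem()
fs.write("/docs/a.txt", b"hello")
fs.read("/docs/a.txt")

# Decode the stored records back into dictionaries for inspection.
decoded = [json.loads(r) for r in fs.records]
```

In this sketch the write event adds a `size` field that the read event does not carry, which is the sense in which the record format is extensible: consumers look up the fields they understand and ignore the rest.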