Patents by Inventor Dhrubajyoti Borthakur
Dhrubajyoti Borthakur has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10387416
Abstract: Technology is disclosed for retrieving data from a specific storage layer of a storage system (“the technology”). A query application programming interface (API) is provided that allows an application to specify a storage layer on which the query should be executed. The query API can be used in a multi-threaded environment which employs a combination of fast threads and slow threads to serve read/write requests from applications. The fast threads are configured to query on a first set of storage layers, e.g., storage layers in a primary storage, while the slow threads are configured to query on a second set of storage layers, e.g., storage layers in a secondary storage. If a fast thread does not find the requested data in the first set, the request is transferred to a slow thread and the fast thread is allocated to another request while the slow thread is serving the current request.
Type: Grant
Filed: November 14, 2013
Date of Patent: August 20, 2019
Assignee: Facebook, Inc.
Inventors: Mayank Agarwal, Dhrubajyoti Borthakur, Nagavamsi Ponnekanti, Haobo Xu
-
Patent number: 10346381
Abstract: Technology is disclosed for performing atomic update operations in a storage system (“the technology”). The technology can receive an update command to update a value associated with a key stored in the storage system as a function of an input value; store the input value in a log stored at the storage system without updating the value stored in the storage system; and update the value associated with the key with the received input value based on the function to generate an updated value, the updating occurring asynchronously with respect to receiving the update command.
Type: Grant
Filed: November 14, 2013
Date of Patent: July 9, 2019
Assignee: Facebook, Inc.
Inventors: Deon Chris Nicholas, Haobo Xu, Dhrubajyoti Borthakur
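The log-then-apply flow described in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: the class `LoggedStore`, its method names, and the choice to fold pending log entries into reads are all assumptions made for illustration, and an explicit `apply_pending()` call stands in for whatever asynchronous background step a real system would use.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List


class LoggedStore:
    """Hypothetical key-value store that logs update operands first."""

    def __init__(self, merge_fn: Callable[[Any, Any], Any]) -> None:
        self._merge_fn = merge_fn          # function combining stored value and input value
        self._values: Dict[str, Any] = {}  # values as currently stored
        self._log: Dict[str, List[Any]] = defaultdict(list)  # pending input values

    def update(self, key: str, operand: Any) -> None:
        # Record the input value in the log only; the stored value is untouched.
        self._log[key].append(operand)

    def apply_pending(self) -> None:
        # Stand-in for the asynchronous step: fold logged inputs into stored values.
        for key, operands in self._log.items():
            value = self._values.get(key)
            for operand in operands:
                value = self._merge_fn(value, operand)
            self._values[key] = value
        self._log.clear()

    def stored_value(self, key: str) -> Any:
        # The value as stored, ignoring anything still in the log.
        return self._values.get(key)

    def get(self, key: str) -> Any:
        # Assumption: reads combine the stored value with pending log entries.
        value = self._values.get(key)
        for operand in self._log.get(key, []):
            value = self._merge_fn(value, operand)
        return value


store = LoggedStore(lambda current, operand: (current or 0) + operand)
store.update("hits", 1)
store.update("hits", 4)
```

Until `apply_pending()` runs, `stored_value("hits")` still reports nothing stored, while `get("hits")` already reflects both logged inputs.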
-
Patent number: 9904689
Abstract: Processing a file system operation is disclosed. An indication of a desired operation of a distributed file system is received. A metadata node for the desired operation is identified. It is indicated to the identified metadata node to process the desired operation. In the event the identified metadata node becomes not fully functional before the processing by the identified metadata node is confirmed, the distributed file system is analyzed to determine whether to indicate again to process the desired operation.
Type: Grant
Filed: July 13, 2012
Date of Patent: February 27, 2018
Assignee: Facebook, Inc.
Inventors: Dhrubajyoti Borthakur, Dmytro Molkov, Hairong Kuang
-
Patent number: 9846711
Abstract: A variety of methods for improving efficiency in a database system are provided. In one embodiment, a method may comprise: generating multiple levels of data according to how recently the data have been updated, whereby most recently updated data are assigned to the newest level; storing each level of data in a specific storage tier; splitting data stored in a particular storage tier into two or more groups according to access statistics of each specific data; during compaction, storing data from different groups in separate data blocks of the particular storage tier; and when a particular data in a specific data block is requested, reading the specific data block into a low-latency storage tier.
Type: Grant
Filed: December 28, 2012
Date of Patent: December 19, 2017
Assignee: Facebook, Inc.
Inventors: Dhrubajyoti Borthakur, Nagavamsi Ponnekanti, Jeffrey Rothschild
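As a rough illustration of the compaction step in the abstract above (separating data into blocks by access statistics), the sketch below groups records into "hot" and "cold" blocks. It covers only that one step. The function names, the single access-count threshold, and the fixed block size are hypothetical choices, not details taken from the patent.

```python
from typing import Dict, List, Tuple

Block = Dict[str, str]


def split_into_blocks(keys: List[str], records: Dict[str, str], block_size: int) -> List[Block]:
    # Pack consecutive keys into blocks of at most block_size records.
    return [
        {key: records[key] for key in keys[start:start + block_size]}
        for start in range(0, len(keys), block_size)
    ]


def compact_by_access(
    records: Dict[str, str],
    access_counts: Dict[str, int],
    hot_threshold: int,
    block_size: int = 2,
) -> Tuple[List[Block], List[Block]]:
    # Group records by access statistics, then write each group to its own blocks
    # so that frequently and rarely read records never share a block.
    hot_keys = sorted(k for k in records if access_counts.get(k, 0) >= hot_threshold)
    cold_keys = sorted(k for k in records if access_counts.get(k, 0) < hot_threshold)
    return (
        split_into_blocks(hot_keys, records, block_size),
        split_into_blocks(cold_keys, records, block_size),
    )


records = {"a": "1", "b": "2", "c": "3", "d": "4", "e": "5"}
access_counts = {"a": 10, "c": 7, "e": 9}  # "b" and "d" were never read
hot_blocks, cold_blocks = compact_by_access(records, access_counts, hot_threshold=5)
```

Keeping the groups in separate blocks means that loading a block into a faster tier on a read brings in records with similar access patterns.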
-
Patent number: 9607001
Abstract: Switching an active metadata node is disclosed. An indication that a standby metadata node of a distributed file system should replace an active metadata node of the distributed file system as a new active metadata node of the distributed file system is received. The standby metadata node is included in a server. A request that indicates that the standby metadata node would like to become an exclusive metadata node writer of a transaction log is sent. A confirmation that the standby metadata node is the exclusive metadata node writer of the transaction log is received. Based at least in part on the confirmation, an update that the standby metadata node has become the new active metadata node of the distributed file system is provided.
Type: Grant
Filed: July 13, 2012
Date of Patent: March 28, 2017
Assignee: Facebook, Inc.
Inventors: Dhrubajyoti Borthakur, Dmytro Molkov, Hairong Kuang
-
Patent number: 9471436
Abstract: A method and system for failure recovery in a storage system are disclosed. In the storage system, user data streams (e.g., log data) are collected by a scribeh system. The scribeh system may include a plurality of Calligraphus servers, HDFS and Zookeeper. The Calligraphus servers may shard the user data streams based on keys (e.g., category and bucket pairs) and stream the user data streams to Puma nodes. Sharded user data streams may be aggregated according to the keys in memory of a specific Puma node. Periodically, aggregated user data streams cached in memory of the specific Puma node, together with an incremental checkpoint, are persisted to HBase. When a specific process on the specific Puma node fails, Ptail retrieves the incremental checkpoint from HBase and then restores the specific process by requesting user data streams processed by the specific process from the scribeh system according to the incremental checkpoint.
Type: Grant
Filed: April 23, 2013
Date of Patent: October 18, 2016
Inventors: Samuel Rash, Dhrubajyoti Borthakur, Prakash Khemani, Zheng Shao
-
Patent number: 9411840
Abstract: The technology is directed to providing sequential access to data using scalable data structures. In some embodiments, the scalable data structures include a first data structure, e.g., hash map, and a second data structure, e.g., tree data structure (“tree”). The technology receives multiple key-value pairs representing data associated with an application. A key includes a prefix and a suffix. While the suffixes of the keys are distinct, some of the keys can have the same prefix. The technology stores the keys having the same prefix in a tree, and stores the root node of the tree in the first data structure. To retrieve values of a set of input keys with a given prefix, the technology retrieves a root node of a tree corresponding to the given prefix from the first data structure using the given prefix, and traverses the tree to obtain the values in a sequence.
Type: Grant
Filed: April 10, 2014
Date of Patent: August 9, 2016
Assignee: Facebook, Inc.
Inventors: Wei Chen, Dhrubajyoti Borthakur
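The two-structure layout described in the abstract above can be sketched in a few lines. This is an illustrative reading only: a Python `dict` plays the role of the hash map, an unbalanced binary search tree stands in for the unspecified tree structure, and the names `Node`, `PrefixIndex`, `put`, and `scan` are hypothetical.

```python
from typing import Dict, Iterator, List, Optional, Tuple


class Node:
    """One key of a per-prefix tree, ordered by suffix."""

    def __init__(self, suffix: str, value: str) -> None:
        self.suffix = suffix
        self.value = value
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


def insert(root: Optional[Node], suffix: str, value: str) -> Node:
    # Standard binary-search-tree insert; an existing suffix is overwritten.
    if root is None:
        return Node(suffix, value)
    if suffix < root.suffix:
        root.left = insert(root.left, suffix, value)
    elif suffix > root.suffix:
        root.right = insert(root.right, suffix, value)
    else:
        root.value = value
    return root


def in_order(root: Optional[Node]) -> Iterator[Tuple[str, str]]:
    # In-order traversal yields entries sorted by suffix.
    if root is not None:
        yield from in_order(root.left)
        yield root.suffix, root.value
        yield from in_order(root.right)


class PrefixIndex:
    def __init__(self) -> None:
        # Hash map from prefix to the root node of that prefix's tree.
        self._roots: Dict[str, Optional[Node]] = {}

    def put(self, prefix: str, suffix: str, value: str) -> None:
        self._roots[prefix] = insert(self._roots.get(prefix), suffix, value)

    def scan(self, prefix: str) -> List[Tuple[str, str]]:
        # One hash lookup finds the tree; traversal returns values in sequence.
        return list(in_order(self._roots.get(prefix)))


index = PrefixIndex()
index.put("user1", "003", "c")
index.put("user1", "001", "a")
index.put("user2", "001", "x")
index.put("user1", "002", "b")
```

Calling `index.scan("user1")` visits only the tree for that prefix, so keys under other prefixes are never touched during the sequential read.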
-
Publication number: 20150293958
Abstract: The technology is directed to providing sequential access to data using scalable data structures. In some embodiments, the scalable data structures include a first data structure, e.g., hash map, and a second data structure, e.g., tree data structure (“tree”). The technology receives multiple key-value pairs representing data associated with an application. A key includes a prefix and a suffix. While the suffixes of the keys are distinct, some of the keys can have the same prefix. The technology stores the keys having the same prefix in a tree, and stores the root node of the tree in the first data structure. To retrieve values of a set of input keys with a given prefix, the technology retrieves a root node of a tree corresponding to the given prefix from the first data structure using the given prefix, and traverses the tree to obtain the values in a sequence.
Type: Application
Filed: April 10, 2014
Publication date: October 15, 2015
Inventors: Wei Chen, Dhrubajyoti Borthakur
-
Publication number: 20150134692
Abstract: Technology is disclosed for retrieving data from a specific storage layer of a storage system (“the technology”). A query application programming interface (API) is provided that allows an application to specify a storage layer on which the query should be executed. The query API can be used in a multi-threaded environment which employs a combination of fast threads and slow threads to serve read/write requests from applications. The fast threads are configured to query on a first set of storage layers, e.g., storage layers in a primary storage, while the slow threads are configured to query on a second set of storage layers, e.g., storage layers in a secondary storage. If a fast thread does not find the requested data in the first set, the request is transferred to a slow thread and the fast thread is allocated to another request while the slow thread is serving the current request.
Type: Application
Filed: November 14, 2013
Publication date: May 14, 2015
Inventors: Mayank Agarwal, Dhrubajyoti Borthakur, Nagavamsi Ponnekanti, Haobo Xu
-
Publication number: 20150134602
Abstract: Technology is disclosed for performing atomic update operations in a storage system (“the technology”). The technology can receive an update command to update a value associated with a key stored in the storage system as a function of an input value; store the input value in a log stored at the storage system without updating the value stored in the storage system; and update the value associated with the key with the received input value based on the function to generate an updated value, the updating occurring asynchronously with respect to receiving the update command.
Type: Application
Filed: November 14, 2013
Publication date: May 14, 2015
Inventors: Deon Chris Nicholas, Haobo Xu, Dhrubajyoti Borthakur
-
Publication number: 20140317448
Abstract: A method and system for failure recovery in a storage system are disclosed. In the storage system, user data streams (e.g., log data) are collected by a scribeh system. The scribeh system may include a plurality of Calligraphus servers, HDFS and Zookeeper. The Calligraphus servers may shard the user data streams based on keys (e.g., category and bucket pairs) and stream the user data streams to Puma nodes. Sharded user data streams may be aggregated according to the keys in memory of a specific Puma node. Periodically, aggregated user data streams cached in memory of the specific Puma node, together with an incremental checkpoint, are persisted to HBase. When a specific process on the specific Puma node fails, Ptail retrieves the incremental checkpoint from HBase and then restores the specific process by requesting user data streams processed by the specific process from the scribeh system according to the incremental checkpoint.
Type: Application
Filed: April 23, 2013
Publication date: October 23, 2014
Applicant: Facebook, Inc.
Inventors: Samuel Rash, Dhrubajyoti Borthakur, Prakash Khemani, Zheng Shao
-
Publication number: 20140214752
Abstract: Techniques for facilitating and accelerating log data processing by splitting data streams are disclosed herein. The front-end clusters generate a large amount of log data in real time and transfer the log data to an aggregating cluster. The aggregating cluster is designed to aggregate incoming log data streams from different front-end servers and clusters. The aggregating cluster further splits the log data into a plurality of data streams so that the data streams are sent to a receiving application in parallel. In one embodiment, the log data are randomly split to ensure the log data are evenly distributed in the split data streams. In another embodiment, the application that receives the split data streams determines how to split the log data.
Type: Application
Filed: January 31, 2013
Publication date: July 31, 2014
Inventors: Samuel Rash, Dhrubajyoti Borthakur, Zheng Shao, Eric Hwang
-
Publication number: 20140188870
Abstract: A variety of methods for improving efficiency in a database system are provided. In one embodiment, a method may comprise: generating multiple levels of data according to how recently the data have been updated, whereby most recently updated data are assigned to the newest level; storing each level of data in a specific storage tier; splitting data stored in a particular storage tier into two or more groups according to access statistics of each specific data; during compaction, storing data from different groups in separate data blocks of the particular storage tier; and when a particular data in a specific data block is requested, reading the specific data block into a low-latency storage tier.
Type: Application
Filed: December 28, 2012
Publication date: July 3, 2014
Inventors: Dhrubajyoti Borthakur, Nagavamsi Ponnekanti, Jeffrey Rothschild
-
Patent number: 8751897
Abstract: Fault-tolerant storage is provided using a distributed data storage system that receives input data from clients and divides that data into data blocks for storage. The data blocks are processed using a coding scheme that generates redundant level one error correction blocks (L1EC Blocks). The L1EC blocks enable the reconstruction of one or more damaged or inaccessible data blocks, and the L1EC blocks and the data blocks are divided into distribution sets and stored at a plurality of data storage locations. At each data storage location additional level two error correction blocks (L2EC blocks) are generated that provide local data redundancy. Upon detecting a data disruption event, an inaccessible data storage location is identified and the elements that were stored at the inaccessible data storage location are reconstructed.
Type: Grant
Filed: October 18, 2013
Date of Patent: June 10, 2014
Assignee: Facebook Inc.
Inventors: Dhrubajyoti Borthakur, Per Brashers, Jason Matthew Taylor
-
Publication number: 20140047266
Abstract: Fault-tolerant storage is provided using a distributed data storage system that receives input data from clients and divides that data into data blocks for storage. The data blocks are processed using a coding scheme that generates redundant level one error correction blocks (L1EC Blocks). The L1EC blocks enable the reconstruction of one or more damaged or inaccessible data blocks, and the L1EC blocks and the data blocks are divided into distribution sets and stored at a plurality of data storage locations. At each data storage location additional level two error correction blocks (L2EC blocks) are generated that provide local data redundancy. Upon detecting a data disruption event, an inaccessible data storage location is identified and the elements that were stored at the inaccessible data storage location are reconstructed.
Type: Application
Filed: October 18, 2013
Publication date: February 13, 2014
Applicant: Facebook, Inc.
Inventors: Dhrubajyoti Borthakur, Per Brashers, Jason Matthew Taylor
-
Publication number: 20140019495
Abstract: Processing a file system operation is disclosed. An indication of a desired operation of a distributed file system is received. A metadata node for the desired operation is identified. It is indicated to the identified metadata node to process the desired operation. In the event the identified metadata node becomes not fully functional before the processing by the identified metadata node is confirmed, the distributed file system is analyzed to determine whether to indicate again to process the desired operation.
Type: Application
Filed: July 13, 2012
Publication date: January 16, 2014
Inventors: Dhrubajyoti Borthakur, Dmytro Molkov, Hairong Kuang
-
Publication number: 20140019405
Abstract: Switching an active metadata node is disclosed. An indication that a standby metadata node of a distributed file system should replace an active metadata node of the distributed file system as a new active metadata node of the distributed file system is received. The standby metadata node is included in a server. A request that indicates that the standby metadata node would like to become an exclusive metadata node writer of a transaction log is sent. A confirmation that the standby metadata node is the exclusive metadata node writer of the transaction log is received. Based at least in part on the confirmation, an update that the standby metadata node has become the new active metadata node of the distributed file system is provided.
Type: Application
Filed: July 13, 2012
Publication date: January 16, 2014
Inventors: Dhrubajyoti Borthakur, Dmytro Molkov, Hairong Kuang
-
Patent number: 8595586
Abstract: Fault-tolerant storage is provided using a distributed data storage system that receives input data from clients and divides that data into data blocks for storage. The data blocks are processed using a coding scheme that generates redundant level one error correction blocks (L1EC Blocks). The L1EC blocks enable the reconstruction of one or more damaged or inaccessible data blocks, so long as sufficient undamaged elements are still accessible. The L1EC blocks and the data blocks are divided into distribution sets and these sets are stored at a plurality of data storage locations. At each data storage location additional level two error correction blocks (L2EC blocks) are generated that provide local data redundancy. The L2EC blocks enable reconstruction of damaged elements at a data storage location without requiring communication with the other data storage locations.
Type: Grant
Filed: April 25, 2012
Date of Patent: November 26, 2013
Assignee: Facebook, Inc.
Inventors: Dhrubajyoti Borthakur, Per Brashers, Jason Matthew Taylor
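The two levels of redundancy in the abstract above can be illustrated with simple XOR parity. The abstract does not name a coding scheme, so XOR is a stand-in chosen for brevity, and the block contents, the three locations "A", "B", and "C", and the helper `xor_blocks` are all hypothetical.

```python
from functools import reduce
from typing import Dict, List


def xor_blocks(blocks: List[bytes]) -> bytes:
    # Byte-wise XOR of equal-length blocks; XOR-ing a parity block with the
    # surviving blocks of its group yields the one missing block.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data blocks, protected in two level-one groups.
d0, d1, d2, d3 = b"abcd", b"efgh", b"ijkl", b"mnop"
p1 = xor_blocks([d0, d1])  # level-one parity for group (d0, d1)
p2 = xor_blocks([d2, d3])  # level-one parity for group (d2, d3)

# Distribution sets: no location holds two elements of the same level-one group.
locations: Dict[str, List[bytes]] = {
    "A": [d0, d2],
    "B": [d1, d3],
    "C": [p1, p2],
}

# Level-two parity is computed separately at each location for local repair.
l2ec = {name: xor_blocks(elements) for name, elements in locations.items()}

# Local repair: d0 is damaged at location A and is rebuilt from A's own data.
repaired_d0 = xor_blocks([locations["A"][1], l2ec["A"]])

# Cross-location repair: all of location A is lost; B and C rebuild its elements.
rebuilt_d0 = xor_blocks([locations["B"][0], locations["C"][0]])
rebuilt_d2 = xor_blocks([locations["B"][1], locations["C"][1]])
```

The local repair reads only from location A, while the cross-location repair needs the level-one parity and the partner blocks held elsewhere.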
-
Publication number: 20130290805
Abstract: Fault-tolerant storage is provided using a distributed data storage system that receives input data from clients and divides that data into data blocks for storage. The data blocks are processed using a coding scheme that generates redundant level one error correction blocks (L1EC Blocks). The L1EC blocks enable the reconstruction of one or more damaged or inaccessible data blocks, so long as sufficient undamaged elements are still accessible. The L1EC blocks and the data blocks are divided into distribution sets and these sets are stored at a plurality of data storage locations. At each data storage location additional level two error correction blocks (L2EC blocks) are generated that provide local data redundancy. The L2EC blocks enable reconstruction of damaged elements at a data storage location without requiring communication with the other data storage locations.
Type: Application
Filed: April 25, 2012
Publication date: October 31, 2013
Inventors: Dhrubajyoti Borthakur, Per Brashers, Jason Matthew Taylor
-
Patent number: 8484257
Abstract: A system and method for generating extensible file system metadata. In one embodiment, the system may include a storage device configured to store data and a file system configured to manage access to the storage device and to store file system content. The file system may be further configured to detect a file system content access event, and in response to detecting the file system content access event, to generate a metadata record, where the metadata record is stored in an extensible, self-describing data format.
Type: Grant
Filed: June 7, 2004
Date of Patent: July 9, 2013
Assignee: Symantec Operating Corporation
Inventors: Dhrubajyoti Borthakur, Nur Premo