Patents by Inventor Abdullah Reza

Abdullah Reza has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SNAPSHOT RANGE FILTERS

Publication number: 20230080500

Abstract: In some examples, a method comprises: receiving a request to read data within a specified range from a backup file storing at least one base snapshot and at least one incremental snapshot; looking up the specified range in range filters from the backup file, the range filters corresponding to snapshots stored in the backup file and each range filter comprising bits indicating whether data exists at respective ranges within the snapshot corresponding to the respective range filter; and in response to the looking up, reading the requested data from the looked-up range in the backup file.

Type: Application

Filed: September 13, 2021

Publication date: March 16, 2023

Inventors: Vijay Karthik, Abdullah Reza
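
The range-filter lookup described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Snapshot` class, the `read_range` helper, the use of an integer as a bit filter, and the newest-snapshot-first search order are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    """Hypothetical snapshot: maps a range index to the bytes stored there."""
    data: dict
    filter_bits: int = 0  # bit i set means this snapshot holds data for range i

    def __post_init__(self):
        # Build the range filter from the ranges this snapshot actually stores.
        for idx in self.data:
            self.filter_bits |= 1 << idx

    def has_range(self, idx: int) -> bool:
        return (self.filter_bits >> idx) & 1 == 1


def read_range(snapshots: list, idx: int) -> Optional[bytes]:
    """Consult each filter, newest snapshot first (an assumed order),
    and read only from the first snapshot whose filter bit is set."""
    for snap in reversed(snapshots):  # list is ordered base first, newest last
        if snap.has_range(idx):
            return snap.data[idx]
    return None


base = Snapshot({0: b"aaaa", 1: b"bbbb", 2: b"cccc"})
incremental = Snapshot({1: b"BBBB"})
backup_file = [base, incremental]
```

Here `read_range(backup_file, 1)` is served by the incremental snapshot, while `read_range(backup_file, 2)` falls through to the base snapshot without touching the incremental's data.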
SMART COALESCING IN DATA MANAGEMENT SYSTEMS

Publication number: 20230076277

Abstract: In some examples, a data management and storage (DMS) platform comprises peer DMS nodes in a node cluster, a distributed data store comprising local and cloud storage, and at least one processor configured to perform operations in a method of creating a local consolidated patch file from a patch file chain stored in the cloud storage. The operations include, in a first dry-run phase, creating a logical patch file image of data blocks in one or more cloud patch files stored in the cloud storage; in a second data-transfer phase, downloading at least some of the data blocks from the cloud patch files identified by the logical patch file image, the second data-transfer phase comprising a coalescing operation to construct a set of coalesced reads of the data blocks; and creating and storing, in the local storage, the local consolidated patch file using the downloaded data blocks.

Type: Application

Filed: August 25, 2021

Publication date: March 9, 2023

Inventors: Bristy Sikder, Vijay Karthik, Abdullah Reza, Siddharth Bidasaria
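
As a rough illustration of what a "set of coalesced reads" might look like, the sketch below merges adjacent or nearly adjacent block reads into fewer, larger reads. The function name `coalesce_reads`, the `(offset, length)` representation, and the `max_gap` parameter are hypothetical; the abstract does not specify how the coalescing operation is performed.

```python
def coalesce_reads(blocks, max_gap=0):
    """Merge (offset, length) block reads whose gap is at most max_gap bytes.

    Returns a sorted list of (offset, length) reads covering every input block.
    """
    merged = []
    for offset, length in sorted(blocks):
        if merged and offset - (merged[-1][0] + merged[-1][1]) <= max_gap:
            # Extend the previous read to cover this block (and any small gap).
            start, prev_len = merged[-1]
            end = max(start + prev_len, offset + length)
            merged[-1] = (start, end - start)
        else:
            merged.append((offset, length))
    return merged


reads = coalesce_reads([(0, 4), (4, 4), (16, 4)])
```

With the default `max_gap=0`, only touching or overlapping blocks are merged. A larger `max_gap` trades some wasted bytes for fewer round trips to cloud storage.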
ONLINE DATA FORMAT CONVERSION

Publication number: 20230017205

Abstract: In some examples, a data management system generates snapshots in a distributed file system based on a protocol or a user-triggered event. The data management system identifies a snappable file in a distributed file system and a first data block in the snappable file, the first data block including data and attribute data. The system scans an index file to access the attribute data of the first data block and initiates construction of a patch file based on the accessed attribute data. The system repeats the scanning of the index file to access attribute data of at least a further second data block, the second data block including data and attribute data, and completes construction of the patch file based on the accessed attribute data of the first and second data blocks. The system generates conversion simulation information by collecting attribute data for all the data blocks of the constructed patch file, and writes the simulation information to a patch file image.

Type: Application

Filed: July 19, 2021

Publication date: January 19, 2023

Inventors: Abdullah Reza, Vijay Karthik, Nitin Rathor, Vaibhav Gosain, Anshul Gupta
TWO-PHASE SNAPSHOT RECOVERY

Publication number: 20220237087

Abstract: In some examples, a data management and storage (DMS) platform comprises peer DMS nodes in a node cluster, a distributed data store comprising local and cloud storage, and at least one processor configured to perform operations in a method of creating a local consolidated patch file from a patch file chain stored in the cloud storage. Example operations comprise, in a first dry-run phase, creating a patch file image of data blocks in one or more cloud patch files stored in the cloud storage; in a second data-transfer phase, downloading at least some of the data blocks from the cloud patch files identified by the patch file image; and creating and storing, in the local storage, the local consolidated patch file using the downloaded data blocks.

Type: Application

Filed: January 25, 2021

Publication date: July 28, 2022

Inventors: Abdullah Reza, Vijay Karthik, Siddharth Bidasaria, Bristy Sikder
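
The two phases in this abstract can be sketched as follows. This is a simplified model under stated assumptions: each cloud patch file is represented by a plain dictionary from block offset to bytes, the chain is ordered oldest first, and a newer patch file is assumed to override an older one at the same offset. The names `dry_run` and `transfer` are hypothetical.

```python
def dry_run(patch_chain):
    """Phase 1: build a patch file image without moving any block data.

    The image maps each block offset to the index of the patch file that
    should supply it (assumption: the newest file in the chain wins).
    """
    image = {}
    for file_id, patch in enumerate(patch_chain):  # oldest first
        for offset in patch:
            image[offset] = file_id
    return image


def transfer(patch_chain, image):
    """Phase 2: fetch only the blocks named in the image and consolidate them."""
    return {
        offset: patch_chain[file_id][offset]
        for offset, file_id in sorted(image.items())
    }


chain = [{0: b"a", 1: b"b"}, {1: b"B", 2: b"c"}]
image = dry_run(chain)
consolidated = transfer(chain, image)
```

The stale block at offset 1 in the older patch file is never fetched, which is the point of planning the transfer before downloading anything.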
Light-weight index deduplication and hierarchical snapshot replication

Patent number: 11321278

Abstract: A lightweight deduplication system can perform resource efficient data deduplication using an extent index and a content index. The extent index can store full fingerprints of data segments to be deduplicated and the content index can store shortened versions of the full fingerprints. The system can alternate between the extent and content indexes, and cache portions of the indices to perform lightweight data deduplication. Further, the system can be configured with an efficient heuristic approach for selecting content index data lookups for chains of volumes for deduplication, such as a long chain of snapshots.

Type: Grant

Filed: April 29, 2020

Date of Patent: May 3, 2022

Assignee: RUBRIK, INC.

Inventors: Anshul Gupta, Abdullah Reza, Guilherme Vale Ferreira Menezes
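
One way to picture the two indexes is the sketch below, which is an interpretation rather than the patented design. It assumes SHA-256 fingerprints, a 4-byte shortened fingerprint, and a policy of checking the small content index first so that the larger extent index is consulted only on a possible match. The class name `DedupStore` and its methods are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy deduplicating store with a full and a shortened fingerprint index."""

    def __init__(self, short_len=4):
        self.short_len = short_len
        self.extent_index = {}      # full fingerprint -> segment location
        self.content_index = set()  # shortened fingerprints (cheap to cache)
        self.segments = []

    def write(self, segment: bytes):
        """Store a segment; return (location, was_duplicate)."""
        full = hashlib.sha256(segment).digest()
        short = full[: self.short_len]
        # A miss in the content index proves the segment is new, so the
        # extent index lookup is skipped entirely in that case.
        if short in self.content_index and full in self.extent_index:
            return self.extent_index[full], True
        location = len(self.segments)
        self.segments.append(segment)
        self.extent_index[full] = location
        self.content_index.add(short)
        return location, False


store = DedupStore()
first = store.write(b"segment-one")
second = store.write(b"segment-one")
third = store.write(b"segment-two")
```

Because shortened fingerprints can collide, a hit in the content index is only a hint; the full fingerprint in the extent index remains the authority on whether a segment is a duplicate.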
System and method for efficiently measuring physical space for an ad-hoc subset of files in protection storage filesystem with stream segmentation and data deduplication

Patent number: 11269817

Abstract: In one example, a method includes measuring an amount of physical storage space used, or expected to be used, by a portion of a dataset S of segments, and measuring the amount of physical storage space includes receiving information that identifies an ad-hoc group of size ‘n’ of files F1 . . . Fn that makes up a subset of the dataset S, determining a number of unique segments in the dataset S, identifying a respective unique segment set UF1 . . . UFN for each of the ‘n’ files in the ad-hoc group of files, performing a set union operation on the unique segment sets UF1 . . . UFN, and determining a sum of sizes of the unique segment sets UF1 . . . UFN, where the sum is the amount of physical storage space used or expected to be used by the ad-hoc group of size ‘n’ of files F1 . . . Fn.

Type: Grant

Filed: April 10, 2019

Date of Patent: March 8, 2022

Assignee: EMC IP HOLDING COMPANY LLC

Inventors: Guilherme Menezes, Fabiano Botelho, Abdullah Reza
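
The set-union step in this abstract can be shown with a small example. The sketch assumes each file is already reduced to a set of segment identifiers and that a size is known for every segment; the function name `physical_space` and the sample data are hypothetical.

```python
def physical_space(unique_segment_sets, segment_sizes):
    """Sum the sizes of all segments in the union of the per-file sets.

    Segments shared by several files are counted once, which is what
    makes the result a physical rather than a logical size.
    """
    union = set().union(*unique_segment_sets.values())
    return sum(segment_sizes[seg] for seg in union)


files = {"F1": {"a", "b"}, "F2": {"b", "c"}}
sizes = {"a": 4, "b": 8, "c": 2}
total = physical_space(files, sizes)
```

Segment `b` appears in both files but contributes its 8 bytes only once, so the total is 14 rather than the 22 bytes a naive per-file sum would give.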
LIGHT-WEIGHT INDEX DEDUPLICATION AND HIERARCHICAL SNAPSHOT REPLICATION

Publication number: 20210342297

Abstract: A lightweight deduplication system can perform resource efficient data deduplication using an extent index and a content index. The extent index can store full fingerprints of data segments to be deduplicated and the content index can store shortened versions of the full fingerprints. The system can alternate between the extent and content indexes, and cache portions of the indices to perform lightweight data deduplication. Further, the system can be configured with an efficient heuristic approach for selecting content index data lookups for chains of volumes for deduplication, such as a long chain of snapshots.

Type: Application

Filed: April 29, 2020

Publication date: November 4, 2021

Inventors: Anshul Gupta, Abdullah Reza, Guilherme Vale Ferreira Menezes
System and method for asynchronous cleaning of data objects on cloud partition in a file system with deduplication

Patent number: 11093453

Abstract: A data management device includes a persistent storage and a processor. The persistent storage includes meta-data of data stored in a long term retention (LTR) storage. The processor obtains a file storage request for a file and deduplicates the file against segments stored in the LTR storage while performing garbage collection on the LTR storage. Performing garbage collection includes deleting segments of the data stored in the LTR storage using the meta-data. The meta-data is not stored in the LTR storage.

Type: Grant

Filed: August 31, 2017

Date of Patent: August 17, 2021

Assignee: EMC IP Holding Company LLC

Inventors: Abdullah Reza, Abhinav Duggal, Lan Bai
Poor deduplication identification

Patent number: 10838923

Abstract: Identifying files that do not deduplicate well in a storage system with deduplication facilitates optimizing storage capacity by moving the identified files to less expensive storage without deduplication. Any set of files can be examined to remove files that are identified as files that do not deduplicate well. The process of identification includes arranging the files in a predefined order and using bitmap representations of the unique segments in the files to determine a count of different segments in neighboring next files compared to the previous files, and removing from deduplication any next files that exceed a difference threshold. The bitmap representations of the files allows the identification processes to be performed efficiently for large datasets. Any over-identification of files is minimized by repeating the identification processes on the set of files after arranging them in the reverse order.

Type: Grant

Filed: December 18, 2015

Date of Patent: November 17, 2020

Assignee: EMC IP HOLDING COMPANY LLC

Inventors: Guilherme Menezes, Abdullah Reza
Change rate estimation

Patent number: 10459648

Abstract: File measurements are computed and stored in persistent memory of a deduplicated storage system as files are written or on demand, where the file measurements are used to estimate storage requirements for storing a subset of files. The file measurements are accumulated into an initial measurement at a first point in time and a final measurement at a second point in time to obtain an estimate of any change in a quantity of unique segments required to store the subset of files in the deduplicated storage system between the first and second points in time. Future storage requirements can be estimated based on a computed rate of change in the amount of storage required to store the subset of files between the first and second points in time.

Type: Grant

Filed: December 14, 2015

Date of Patent: October 29, 2019

Assignee: EMC IP Holding Company LLC

Inventors: Guilherme Menezes, Abdullah Reza
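
The rate-of-change estimate described here reduces to simple arithmetic once the two measurements exist. The sketch below assumes a linear extrapolation, which the abstract does not specify; the function names and the sample figures are hypothetical.

```python
def change_rate(initial, final, t1, t2):
    """Change in unique-segment storage per unit time between t1 and t2."""
    if t2 <= t1:
        raise ValueError("t2 must be later than t1")
    return (final - initial) / (t2 - t1)


def project(final, rate, horizon):
    """Linearly extrapolate storage needs `horizon` time units past t2."""
    return final + rate * horizon


# Hypothetical measurements: 1000 units at day 0, 1300 units at day 30.
rate = change_rate(1000, 1300, 0, 30)
forecast = project(1300, rate, 60)
```

In this example the subset grows by 10 units per day, so the projection 60 days after the second measurement is 1900 units.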
Efficiently estimating data compression ratio of ad-hoc set of files in protection storage filesystem with stream segmentation and data deduplication

Patent number: 10430383

Abstract: In one example, a method for processing data includes receiving information that identifies an ad hoc group of size ‘n’ of files F1 . . . Fn, each file F including a respective file sequence S that includes K data segments. Next, each file sequence S is sampled to obtain a sequence SS of data segments from the file sequence S, and a non-random sampling of data segments is sampled from each sequence SS to obtain a set SSU of the sequence SS. The data segments of each set SSU are then sampled to obtain a sample subset SSUS of the set SSU, and a compression ratio is determined for each data segment in each sample subset SSUS. Finally, an average data compression RF1 . . . Fn is estimated and output for the files F in the group of size ‘n’, based on the compression ratios.

Type: Grant

Filed: September 30, 2015

Date of Patent: October 1, 2019

Assignee: EMC IP HOLDING COMPANY LLC

Inventors: Guilherme Menezes, Teng Xu, Abdullah Reza
SYSTEM AND METHOD FOR EFFICIENTLY MEASURING PHYSICAL SPACE FOR AN AD-HOC SUBSET OF FILES IN PROTECTION STORAGE FILESYSTEM WITH STREAM SEGMENTATION AND DATA DEDUPLICATION

Publication number: 20190236054

Abstract: In one example, a method includes measuring an amount of physical storage space used, or expected to be used, by a portion of a dataset S of segments, and measuring the amount of physical storage space includes receiving information that identifies an ad-hoc group of size ‘n’ of files F1 . . . Fn that makes up a subset of the dataset S, determining a number of unique segments in the dataset S, identifying a respective unique segment set UF1 . . . UFN for each of the ‘n’ files in the ad-hoc group of files, performing a set union operation on the unique segment sets UF1 . . . UFN, and determining a sum of sizes of the unique segment sets UF1 . . . UFN, where the sum is the amount of physical storage space used or expected to be used by the ad-hoc group of size ‘n’ of files F1 . . . Fn.

Type: Application

Filed: April 10, 2019

Publication date: August 1, 2019

Inventors: Guilherme Menezes, Fabiano Botelho, Abdullah Reza
Clustering files in deduplication systems

Patent number: 10303797

Abstract: Clustering files in deduplication systems is based on an estimate of similarity between files in a file system. The estimates of similarity are based on how much content the files share, where the estimate of how much content is shared is based on an estimate of segments shared. The estimate of segments shared is based on segment offsets found in the files' bitmap vectors of segment offsets. The found segment offsets are used to generate a cluster definition approximating an optimal data structure for clustering files that share content. The approximated optimal data structure defines clusters hierarchically arranged based on the offset numbers of the found segment offsets.

Type: Grant

Filed: December 18, 2015

Date of Patent: May 28, 2019

Assignee: EMC IP HOLDING COMPANY LLC

Inventors: Guilherme Menezes, Abdullah Reza
System and method for efficiently measuring physical space for an ad-hoc subset of files in protection storage filesystem with stream segmentation and data deduplication

Patent number: 10303662

Abstract: In one example, a method for processing data includes receiving information that identifies an ad-hoc group of size ‘n’ of files F1 . . . Fn, each file F including a respective segment set S, and then sampling a representation of each unique segment in the segment set S to obtain a sampled unique segment count for each file F. A unique segment count is then obtained for each file F by applying a sampling ratio R to each sampled unique segment count, and an average segment size for each file F is determined. Next, a physical space measurement is generated for each file F based on the average segment size and the unique segment count, and then a total physical space measurement p is generated based on the individual physical space measurements for each file F.

Type: Grant

Filed: September 30, 2015

Date of Patent: May 28, 2019

Assignee: EMC IP HOLDING COMPANY LLC

Inventors: Guilherme Menezes, Fabiano Botelho, Abdullah Reza
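
The arithmetic in this abstract can be sketched as follows. The sketch assumes that "applying a sampling ratio R" means dividing the sampled count by R, and that the total p is the plain sum of the per-file measurements; both are interpretations, and the function names and sample values are hypothetical.

```python
def estimate_file_space(sampled_unique, sampling_ratio, avg_segment_size):
    """Scale a sampled unique-segment count up by R, then convert to bytes."""
    unique_count = sampled_unique / sampling_ratio
    return unique_count * avg_segment_size


def estimate_total(files):
    """Total physical space p: sum of the per-file estimates."""
    return sum(estimate_file_space(*f) for f in files)


# Each tuple: (sampled unique segments, sampling ratio R, average segment size)
group = [(50, 0.5, 8192), (10, 0.5, 4096)]
p = estimate_total(group)
```

With R = 0.5, the 50 sampled segments of the first file stand for about 100 unique segments, or 819200 bytes at 8192 bytes per segment. Adding the second file's 81920 bytes gives 901120 bytes for the group.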
