Patents by Inventor Abdullah Reza

Abdullah Reza has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230080500
    Abstract: In some examples, a method comprises: receiving a request to read data within a specified range from a backup file storing at least one base snapshot and at least one incremental snapshot; looking up the specified range in range filters from the backup file, the range filters corresponding to snapshots stored in the backup file and each range filter comprising bits indicating whether data exists at respective ranges within the snapshot corresponding to the respective range filter; and in response to the looking up, reading the requested data from the looked-up range in the backup file.
    Type: Application
    Filed: September 13, 2021
    Publication date: March 16, 2023
    Inventors: Vijay Karthik, Abdullah Reza
  • Publication number: 20230076277
    Abstract: In some examples, a data management and storage (DMS) platform comprises peer DMS nodes in a node cluster, a distributed data store comprising local and cloud storage, and at least one processor configured to perform operations in a method of creating a local consolidated patch file from a patch file chain stored in the cloud storage. The operations include, in a first dry-run phase, creating a logical patch file image of data blocks in one or more cloud patch files stored in the cloud storage; in a second data-transfer phase, downloading at least some of the data blocks from the cloud patch files identified by the logical patch file image, the second data-transfer phase comprising a coalescing operation to construct a set of coalesced reads of the data blocks; and creating and storing, in the local storage, the local consolidated patch file using the downloaded data blocks.
    Type: Application
    Filed: August 25, 2021
    Publication date: March 9, 2023
    Inventors: Bristy Sikder, Vijay Karthik, Abdullah Reza, Siddharth Bidasaria
  • Publication number: 20230017205
    Abstract: In some examples, a data management system generates snapshots in a distributed file system based on a protocol or a user triggered event. The data management system identifies a snappable file in a distributed file system and a first data block in the snappable file, the first data block including data and attribute data. The system scans an index file to access the attribute data of the first data block and initiates construction of a patch file based on the accessed attribute data. The system repeats the scanning of the index file to access attribute data of at least a further second data block, the second data block including data and attribute data, and completes construction of the patch file based on the accessed attribute data of the first and second data blocks. The system generates conversion simulation information by collecting attribute data for all the data blocks of the constructed patch file, and writes the simulation information to a patch file image.
    Type: Application
    Filed: July 19, 2021
    Publication date: January 19, 2023
    Inventors: Abdullah Reza, Vijay Karthik, Nitin Rathor, Vaibhav Gosain, Anshul Gupta
  • Publication number: 20220237087
    Abstract: In some examples, a data management and storage (DMS) platform comprises peer DMS nodes in a node cluster, a distributed data store comprising local and cloud storage, and at least one processor configured to perform operations in a method of creating a local consolidated patch file from a patch file chain stored in the cloud storage. Example operations comprise, in a first dry-run phase, creating a patch file image of data blocks in one or more cloud patch files stored in the cloud storage; in a second data-transfer phase, downloading at least some of the data blocks from the cloud patch files identified by the patch file image; and creating and storing, in the local storage, the local consolidated patch file using the downloaded data blocks.
    Type: Application
    Filed: January 25, 2021
    Publication date: July 28, 2022
    Inventors: Abdullah Reza, Vijay Karthik, Siddharth Bidasaria, Bristy Sikder
  • Patent number: 11321278
    Abstract: A lightweight deduplication system can perform resource efficient data deduplication using an extent index and a content index. The extent index can store full fingerprints of data segments to be deduplicated and the content index can store shortened versions of the full fingerprints. The system can alternate between the extent and content indexes, and cache portions of the indices to perform lightweight data deduplication. Further, the system can be configured with an efficient heuristic approach for selecting content index data lookups for chains of volumes for deduplication, such as a long chain of snapshots.
    Type: Grant
    Filed: April 29, 2020
    Date of Patent: May 3, 2022
    Assignee: RUBRIK, INC.
    Inventors: Anshul Gupta, Abdullah Reza, Guilherme Vale Ferreira Menezes
  • Patent number: 11269817
    Abstract: In one example, a method includes measuring an amount of physical storage space used, or expected to be used, by a portion of a dataset S of segments, and measuring the amount of physical storage space includes receiving information that identifies an ad-hoc group of size ‘n’ of files F1 . . . Fn that makes up a subset of the dataset S, determining a number of unique segments in the dataset S, identifying a respective unique segment set UF1 . . . UFN for each of the ‘n’ files in the ad-hoc group of files, performing a set union operation on the unique segment sets UF1 . . . UFN, and determining a sum of sizes of the unique segment sets UF1 . . . UFN, where the sum is the amount of physical storage space used or expected to be used by the ad-hoc group of size ‘n’ of files F1 . . . Fn.
    Type: Grant
    Filed: April 10, 2019
    Date of Patent: March 8, 2022
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Guilherme Menezes, Fabiano Botelho, Abdullah Reza
  • Publication number: 20210342297
    Abstract: A lightweight deduplication system can perform resource efficient data deduplication using an extent index and a content index. The extent index can store full fingerprints of data segments to be deduplicated and the content index can store shortened versions of the full fingerprints. The system can alternate between the extent and content indexes, and cache portions of the indices to perform lightweight data deduplication. Further, the system can be configured with an efficient heuristic approach for selecting content index data lookups for chains of volumes for deduplication, such as a long chain of snapshots.
    Type: Application
    Filed: April 29, 2020
    Publication date: November 4, 2021
    Inventors: Anshul Gupta, Abdullah Reza, Guilherme Vale Ferreira Menezes
  • Patent number: 11093453
    Abstract: A data management device includes a persistent storage and a processor. The persistent storage includes meta-data of data stored in a long term retention (LTR) storage. The processor obtains a file storage request for a file and deduplicates the file against segments stored in the LTR storage while performing garbage collection on the LTR storage. Performing garbage collection includes deleting segments of the data stored in the LTR storage using the meta-data. The meta-data is not stored in the LTR storage.
    Type: Grant
    Filed: August 31, 2017
    Date of Patent: August 17, 2021
    Assignee: EMC IP Holding Company LLC
    Inventors: Abdullah Reza, Abhinav Duggal, Lan Bai
  • Patent number: 10838923
    Abstract: Identifying files that do not deduplicate well in a storage system with deduplication facilitates optimizing storage capacity by moving the identified files to less expensive storage without deduplication. Any set of files can be examined to remove files that are identified as files that do not deduplicate well. The process of identification includes arranging the files in a predefined order and using bitmap representations of the unique segments in the files to determine a count of different segments in neighboring next files compared to the previous files, and removing from deduplication any next files that exceed a difference threshold. The bitmap representations of the files allows the identification processes to be performed efficiently for large datasets. Any over-identification of files is minimized by repeating the identification processes on the set of files after arranging them in the reverse order.
    Type: Grant
    Filed: December 18, 2015
    Date of Patent: November 17, 2020
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Guilherme Menezes, Abdullah Reza
  • Patent number: 10459648
    Abstract: File measurements are computed and stored in persistent memory of a deduplicated storage system as files are written or on demand, where the file measurements are used to estimate storage requirements for storing a subset of files. The file measurements are accumulated into an initial measurement at a first point in time and a final measurement at a second point in time to obtain an estimate of any change in a quantity of unique segments required to store the subset of files in the deduplicated storage system between the first and second points in time. Future storage requirements can be estimated based on a computed rate of change in the amount of storage required to store the subset of files between the first and second points in time.
    Type: Grant
    Filed: December 14, 2015
    Date of Patent: October 29, 2019
    Assignee: EMC IP Holding Company LLC
    Inventors: Guilherme Menezes, Abdullah Reza
  • Patent number: 10430383
    Abstract: In one example, a method for processing data includes receiving information that identifies an ad hoc group of size ‘n’ of files F1 . . . Fn, each file F including a respective file sequence S that includes K data segments. Next, each file sequence S is sampled to obtain a sequence SS of data segments from the file sequence S, and a non-random sampling of data segments is sampled from each sequence SS to obtain a set SSU of the sequence SS. The data segments of each set SSU are then sampled to obtain a sample subset SSUS of the set SSU, and a compression ratio is determined for each data segment in each sample subset SSUS. Finally, an average data compression RF1 . . . Fn is estimated and output for the files F in the group of size ‘n’, based on the compression ratios.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: October 1, 2019
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Guilherme Menezes, Teng Xu, Abdullah Reza
  • Publication number: 20190236054
    Abstract: In one example, a method includes measuring an amount of physical storage space used, or expected to be used, by a portion of a dataset S of segments, and measuring the amount of physical storage space includes receiving information that identifies an ad-hoc group of size ‘n’ of files F1 . . . Fn that makes up a subset of the dataset S, determining a number of unique segments in the dataset S, identifying a respective unique segment set UF1 . . . UFN for each of the ‘n’ files in the ad-hoc group of files, performing a set union operation on the unique segment sets UF1 . . . UFN, and determining a sum of sizes of the unique segment sets UF1 . . . UFN, where the sum is the amount of physical storage space used or expected to be used by the ad-hoc group of size ‘n’ of files F1 . . . Fn.
    Type: Application
    Filed: April 10, 2019
    Publication date: August 1, 2019
    Inventors: Guilherme Menezes, Fabiano Botelho, Abdullah Reza
  • Patent number: 10303797
    Abstract: Clustering files in deduplication systems is based on an estimate of similarity between files in a file system. The estimates of similarity are based on how much content the files share, where the estimate of how much content is shared is based on an estimate of segments shared. The estimate of segments shared is based on segment offsets found in the files' bitmap vectors of segment offsets. The found segment offsets are used to generate a cluster definition approximating an optimal data structure for clustering files that share content. The approximated optimal data structure defines clusters hierarchically arranged based on the offset numbers of the found segment offsets.
    Type: Grant
    Filed: December 18, 2015
    Date of Patent: May 28, 2019
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Guilherme Menezes, Abdullah Reza
  • Patent number: 10303662
    Abstract: In one example, a method for processing data includes receiving information that identifies an ad-hoc group of size ‘n’ of files F1 . . . Fn, each file F including a respective segment set S, and then sampling a representation of each unique segment in the segment set S to obtain a sampled unique segment count for each file F. A unique segment count is then obtained for each file F by applying a sampling ratio R to each sampled unique segment count, and an average segment size for each file F is determined. Next, a physical space measurement is generated for each file F based on the average segment size and the unique segment count, and then a total physical space measurement p is generated based on the individual physical space measurements for each file F.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: May 28, 2019
    Assignee: EMC IP HOLDING COMPANY LLC
    Inventors: Guilherme Menezes, Fabiano Botelho, Abdullah Reza
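
The abstract of patent 10303662 above describes estimating physical space from sampled segment counts. The following is a minimal Python sketch of that general idea, not the patented implementation. Everything it names is an assumption made for illustration: SHA-1 as the segment fingerprint, the modulo-based sampling rule, the function names `is_sampled`, `estimate_file_space`, and `estimate_group_space`, and combining the per-file measurements by a plain sum.

```python
import hashlib


def is_sampled(fingerprint: bytes, ratio: int) -> bool:
    """Deterministically keep roughly 1 in `ratio` fingerprints (assumed rule)."""
    return int.from_bytes(fingerprint[:8], "big") % ratio == 0


def estimate_file_space(segments: list[bytes], ratio: int) -> float:
    """Estimate the physical space of one file from sampled unique segments."""
    if not segments:
        return 0.0
    # Collect the fingerprints of unique segments that fall into the sample.
    sampled = set()
    for seg in segments:
        fp = hashlib.sha1(seg).digest()
        if is_sampled(fp, ratio):
            sampled.add(fp)
    # Scale the sampled unique count by the sampling ratio R.
    unique_estimate = len(sampled) * ratio
    # Average segment size over all segments of the file.
    avg_size = sum(len(seg) for seg in segments) / len(segments)
    return unique_estimate * avg_size


def estimate_group_space(files: dict[str, list[bytes]], ratio: int) -> float:
    """Combine per-file estimates; a plain sum is an assumption of this sketch."""
    return sum(estimate_file_space(segs, ratio) for segs in files.values())


if __name__ == "__main__":
    group = {
        "f1": [b"aaaa", b"bbbb", b"aaaa"],
        "f2": [b"cc", b"cc"],
    }
    # With ratio 1 every fingerprint is sampled, so the estimate is exact
    # for this toy input.
    print(estimate_group_space(group, ratio=1))
```

With a ratio of 1 every fingerprint is sampled. Larger ratios trade accuracy for less work, because only about one in `ratio` fingerprints is tracked per file.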